Skis / Toys / Fun

Appeal to greatness not guilt

Entries from March 2008

Love/Hate with python

March 28th, 2008

Hate: some of the modules. Gadzooks, people, get a clue:

html2text uses the SGML parser, so “normal” HTML will fail

After fixing it up to use HTMLParser, it still sucks! HTMLParser is built on regular expressions, and guess what… bad HTML still fails.
Love: had an HTML=>Text converter lying around in C++, took all [...]
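
For anyone hitting the same wall today, here is a minimal sketch of an HTML=>Text converter. It assumes Python 3's `html.parser` module, not the 2008-era module complained about above and not the C++ converter. The names `TextExtractor` and `html_to_text` are made up for illustration. It only collects text and skips script/style contents; it makes no attempt at layout.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collect visible text, skip <script>/<style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []      # text fragments in document order
        self.skip_depth = 0  # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Return the text content of an HTML string with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return " ".join("".join(parser.parts).split())


if __name__ == "__main__":
    # Unclosed <b> and a missing </p> are tolerated in this sketch.
    print(html_to_text("<p>Hello <b>world</p><script>x=1</script>"))
```

Unclosed tags do not raise here, because the handler never tries to match opening and closing tags. Whether that is tolerant enough for real-world mail or web pages is something to test on your own data.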

Iterative programming … and clustering

March 27th, 2008

The good part about writing everything in a simple language is that it makes iterative programming easy…  Time to fix and other such fun issues would be a pain in the butt if I were doing all of my Clustering 101 development work in C++, though it might cut the run time.
That said, finally [...]

Clustering …

March 26th, 2008

I’ve been playing with clustering my email, just a sample set of 300 or so messages.  It’s been a while since I’ve done any “NLP” work and it’s really quite fun.
Some learnings:
As the dimensionality of the space increases, everything starts to sit at the origin:

initially you might have 120 unique words in an email message, some [...]

Long days and quiet mornings

March 11th, 2008

Really just a blog post to test my RSS feed… but it’s a simple thought.

Good programmers are lazy programmers

March 3rd, 2008

Good programmers enjoy working; they just don’t like re-inventing the wheel.  There is a whole class of programmers who suffer from NIH (not invented here) and who then dutifully re-invent, re-write, re-build a system.  Some of them are even so “wise” as to re-use standard nomenclature in their re-inventions, so that when they present the XYZ project they [...]