Hate: some of the modules. Gadzooks, people, get a clue:
- html2text uses the SGML parser, so “normal” HTML will fail
After fixing it up to use HTMLParser, it still sucks! HTMLParser is built on regular expressions, and guess what: bad HTML still fails.
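For reference, here is a rough sketch of what an HTMLParser-based text extractor can look like. It assumes Python 3's `html.parser` module, not the older module complained about above, and the names `TextExtractor` and `html_to_text` are made up for illustration; this is not html2text's actual code. The Python 3 parser is meant to tolerate unclosed tags, but treat this as a starting point and not a robust converter.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects text, skipping <script>/<style> bodies."""

    SKIP = {"script", "style"}
    BLOCK = {"p", "br", "div", "li", "tr", "h1", "h2", "h3"}

    def __init__(self):
        # convert_charrefs=True turns entities like &amp; into plain characters
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            # Start a new output line at block-level tags
            self._chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Collapse runs of whitespace on each line and drop empty lines
        raw = "".join(self._chunks)
        lines = (" ".join(line.split()) for line in raw.splitlines())
        return "\n".join(line for line in lines if line)


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered at end of input
    return parser.text()
```

Calling `html_to_text("<p>Hello <b>world</p><p>unclosed <i>tags")` exercises the unclosed-tag case; how well this holds up on truly mangled markup is something to test against your own pages.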
Love: I had an HTML=>Text converter lying around in C++, and it took all of 30 minutes to wire it into Python. It works…