2004 03 25
Advisare-0.02 is out.
Please get it at SourceForge as usual. I’ve completely rewritten the source code and added better documentation and demonstration files. The architecture is now object-based and, hopefully, the next release will be fully object-oriented.

March 27th, 2004 at 13:49
Why not subclass SGMLParser for parsing the HTML code? Also, urllib allows you to grab the HTML directly from the website.
I’ve posted some code that extracts links from Google News. I would contribute in SourceForge, but I’m busy with another project. You can get some idea from my GoogleNewsParser code, though.
http://coding.mu/archives/2004/03/27/google_news_parser_in_python
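For readers who want a feel for the suggestion above, here is a minimal sketch of a link extractor built by subclassing a parser. It is not the GoogleNewsParser code from the link. It assumes Python 3, where SGMLParser no longer exists, so `html.parser.HTMLParser` is swapped in as the closest standard-library equivalent. The names `LinkExtractor`, `extract_links` and `SAMPLE` are made up for illustration, and the page is a hard-coded string rather than something fetched with urllib.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return all link targets found in an HTML string (hypothetical helper)."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample page; note the unclosed <p>, which the parser tolerates.
SAMPLE = (
    '<html><body><a href="http://example.org/a">A</a>'
    '<p><a href="/b">B</a></body></html>'
)

if __name__ == "__main__":
    print(extract_links(SAMPLE))
```

To work on a live page, the string passed to `extract_links` could come from `urllib.request.urlopen(url).read().decode(...)`, with the encoding chosen to match the site.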
March 29th, 2004 at 14:46
I am not too keen on using SGMLParser because the HTML I get from Canal Satellite is really badly formed, and it would be a pain to process if it is not tidied beforehand.
And, of course, as soon as it is tidied, I can use a proper XML parser (like expat) and it makes my code easier to write and to maintain.
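To illustrate the "tidy first, then parse as XML" idea, here is a rough sketch using the expat bindings from the Python standard library. The element and attribute names (`programs`, `program`, `title`, `channel`, `category`, `duration`) are assumptions made up for the example, since the real tidied markup is not shown here, and `parse_programs` is a hypothetical helper, not code from Advisare.

```python
import xml.parsers.expat


def parse_programs(xml_text):
    """Turn well-formed XML into a list of dicts, one per <program> element."""
    programs = []
    current = None      # dict for the <program> being read, if any
    in_title = False    # True while inside a <title> element

    def start(name, attrs):
        nonlocal current, in_title
        if name == "program":
            current = dict(attrs)   # copy the attributes (all values are strings)
            current["title"] = ""
        elif name == "title" and current is not None:
            in_title = True

    def end(name):
        nonlocal current, in_title
        if name == "title":
            in_title = False
        elif name == "program" and current is not None:
            programs.append(current)
            current = None

    def chars(data):
        # expat may deliver text in several chunks, so accumulate it.
        if in_title and current is not None:
            current["title"] += data

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(xml_text, True)    # True marks this as the final chunk
    return programs


# Hypothetical tidied input; the real structure may differ.
SAMPLE_XML = (
    "<programs>"
    '<program channel="C1" category="film" duration="90"><title>Film A</title></program>'
    '<program channel="C2" category="news" duration="30"><title>News B</title></program>'
    "</programs>"
)

if __name__ == "__main__":
    for program in parse_programs(SAMPLE_XML):
        print(program)
```

If the input is not well-formed, `parser.Parse` raises `xml.parsers.expat.ExpatError`, which is exactly why the tidying step has to come first.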
The difficult part is the second part of the program. I am thinking about reading the XML produced and building a web of objects with semantic links between them (like same category, same duration, same channel…).
Then, using a proper graph traversal algorithm, the application should be able to propose a TV program based on what the user has seen before.
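The idea above could be sketched roughly as follows. This is only one possible reading: the traversal algorithm is not fixed yet, so a plain breadth-first search is assumed here, and two programs are linked whenever they share a category, duration or channel. The names `LINK_KEYS`, `build_graph`, `propose` and the sample data are all hypothetical.

```python
from collections import deque

# Attributes that create a "semantic link" when two programs share a value.
LINK_KEYS = ("category", "duration", "channel")


def build_graph(programs):
    """Link every pair of programs sharing at least one attribute value."""
    graph = {p["title"]: set() for p in programs}
    for i, a in enumerate(programs):
        for b in programs[i + 1:]:
            if any(a[key] == b[key] for key in LINK_KEYS):
                graph[a["title"]].add(b["title"])
                graph[b["title"]].add(a["title"])
    return graph


def propose(graph, seen, limit=3):
    """Breadth-first walk from the seen programs; nearest unseen ones first."""
    visited = set(seen)
    queue = deque(sorted(seen))
    proposals = []
    while queue and len(proposals) < limit:
        node = queue.popleft()
        for neighbour in sorted(graph.get(node, ())):
            if neighbour in visited:
                continue
            visited.add(neighbour)
            proposals.append(neighbour)
            queue.append(neighbour)
            if len(proposals) == limit:
                break
    return proposals


# Made-up sample data for illustration only.
PROGRAMS = [
    {"title": "Film A", "category": "film", "duration": "90", "channel": "C1"},
    {"title": "Film B", "category": "film", "duration": "120", "channel": "C2"},
    {"title": "News C", "category": "news", "duration": "30", "channel": "C2"},
    {"title": "Sport D", "category": "sport", "duration": "60", "channel": "C3"},
]

if __name__ == "__main__":
    graph = build_graph(PROGRAMS)
    print(propose(graph, {"Film A"}))
```

With this sample, a user who has seen "Film A" is offered "Film B" first (same category) and then "News C" (same channel as "Film B"), while "Sport D" shares nothing with the others and is never reached. A real version would probably weight the links instead of treating them all alike.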
March 30th, 2004 at 17:11
Uhm, this is getting interesting. Maybe I should contribute some code after all.