Hi,
I can see that the previous developer did some nice work with the lxml module. Going forward, I'd like to add a dependency on BeautifulSoup, which is well suited to working with malformed HTML. This work will be somewhat tricky because the HTML files are not tagged in any meaningful way: there are no CSS classes or other identifiers to use as reference points. The script could therefore be fragile, breaking with even a minor change in the SEC format, such as the introduction of another table column for additional horizontal whitespace. That said, I can use the row names as a decent reference point in case the tables or rows are rearranged.
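To give a rough idea of the row-name approach, here is a minimal sketch. It is not the existing script: the sample table, the `RowCollector` class, and the `find_row` helper are all made up for illustration, and it uses Python's standard-library `html.parser` instead of BeautifulSoup so that it runs without extra dependencies. The idea is to look a row up by the text of its first non-empty cell and to skip empty spacer cells, so an added whitespace column or a reordered row should not break the lookup.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collect the text of every table cell, grouped by row (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being parsed
        self._cell = None  # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace (including non-breaking spaces).
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def find_row(html, row_name):
    """Return the non-empty cells after the cell matching row_name, or None."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    wanted = row_name.strip().lower()
    for row in parser.rows:
        cells = [c for c in row if c]  # drop empty spacer cells
        if cells and cells[0].lower() == wanted:
            return cells[1:]
    return None


# Made-up sample; real SEC tables are larger and messier.
SAMPLE = """
<table>
  <tr><td>Total revenue</td><td></td><td>1,200</td><td>1,050</td></tr>
  <tr><td>Net income</td><td>&nbsp;</td><td>300</td><td>250</td></tr>
</table>
"""

print(find_row(SAMPLE, "Net income"))
```

This sketch assumes well-closed cells and no nested tables, so the real files would likely need more defensive handling.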
Thanks!
Dan