Hi, I'm looking for a Python programmer to help optimise a script that parses the text content of visited websites and substitutes specific data with new data retrieved from a MySQL database. It is a relatively small project: I already have a working script in Python, but as I'm not a professional programmer it needs some adjustment in order to work well.
The script takes all the text content of a visited website, compares the words with a dictionary, and replaces every word that appears both on the site and in the dictionary. The dictionary is very large and is stored in a MySQL database. I tried different solutions for web parsing (e.g. BeautifulSoup), but as the script has to be extremely robust and work on any kind of website, I ran into trouble. Eventually I based the script on a rather old parser [url removed, login to view]. However, I'm aware that the whole script is very linear (and thus time-consuming) and that one might be better off using, for instance, a prefix tree (trie) instead. In the end the script must be integrated with Squid.
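To illustrate the prefix tree idea mentioned above, here is a minimal sketch, not the existing script. It assumes the dictionary has already been loaded from MySQL into a plain Python dict; the names `build_trie` and `replace_words` and the sample entries are made up for the example. The text is scanned once, and at each word start the longest dictionary entry is matched, so the cost depends on the length of the text rather than on the size of the dictionary.

```python
END = ""  # marker key for "an entry ends here"; never collides with a 1-char key


def build_trie(mapping):
    """Build a nested-dict trie from {original: replacement} pairs."""
    root = {}
    for word, replacement in mapping.items():
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = replacement
    return root


def _is_word_char(ch):
    return ch.isalnum() or ch == "_"


def replace_words(text, trie):
    """Replace the longest dictionary entry at each word start (case-sensitive)."""
    out = []
    i, n = 0, len(text)
    while i < n:
        # Only try to match where a word begins.
        if i == 0 or not _is_word_char(text[i - 1]):
            node, j = trie, i
            best_end, best_value = -1, None
            while j < n and text[j] in node:
                node = node[text[j]]
                j += 1
                # Accept a match only if it also ends on a word boundary.
                if END in node and (j == n or not _is_word_char(text[j])):
                    best_end, best_value = j, node[END]
            if best_end != -1:
                out.append(best_value)
                i = best_end
                continue
        out.append(text[i])
        i += 1
    return "".join(out)


# Hypothetical entries standing in for rows fetched from the MySQL table.
trie = build_trie({"cat": "dog", "cat food": "kibble", "car": "bike"})
result = replace_words("The cat ate cat food in the cart.", trie)
print(result)
```

If the dictionary only ever contains single words, a plain dict lookup per word would likely be just as fast; the trie mainly pays off when entries can span several words.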
Timeframe: as soon as possible
Skills: Python (if you also have experience with Squid it would be great! :)
7 freelancers are bidding an average of $156 for this job
Hi, I can help you! I am a system administrator with extensive experience in Linux, Squid and Python. Using Python's html.parser library I can do the job you want.
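As a rough idea of what an html.parser-based approach could look like (a sketch only, not necessarily how the job would be done), the following subclass of `HTMLParser` re-emits the page and passes only visible text through a replacement function. The names `TextRewriter` and `rewrite_html` are made up for the example, and the `rewrite` callable stands in for whatever dictionary lookup is used.

```python
from html.parser import HTMLParser


class TextRewriter(HTMLParser):
    """Re-emit HTML, applying `rewrite` to text outside <script>/<style>."""

    SKIP = {"script", "style"}

    def __init__(self, rewrite):
        # Keep entities such as &amp; untouched instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.rewrite = rewrite
        self.out = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        self.out.append(self.get_starttag_text())  # original tag text

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        self.out.append(f"</{tag}>")  # note: end tags come back lower-cased

    def handle_data(self, data):
        self.out.append(data if self.skip_depth else self.rewrite(data))

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

    def handle_pi(self, data):
        self.out.append(f"<?{data}>")


def rewrite_html(html, rewrite):
    parser = TextRewriter(rewrite)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


page = '<p class="x">The cat sat.</p><script>var cat = 1;</script>'
result = rewrite_html(page, lambda text: text.replace("cat", "dog"))
print(result)
```

Only the text inside the paragraph is changed; tags, attributes and script contents pass through as they were.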