This is a PHP 4 job. The bidder should have experience scraping sites with cURL and regular expressions and be able to write clean, tight PHP.
I occasionally need to harvest information from websites (usually because they do not provide search functionality).
This time I need to pull out information from a DMOZ-like directory. I need to collect details about each of the thousand-or-so websites listed in the directory.
To avoid naming collisions, I want a base parsing class that uses cURL and a derived class that scrapes this specific site.
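As a rough illustration of the base/derived split described above, here is a minimal sketch. The class names `BaseScraper` and `DirectoryScraper`, the user-agent string, and the listing markup matched by the regular expression (`<li><a href="...">Title</a> - description</li>`) are all assumptions for the example; the real directory's HTML would dictate the actual pattern. The code sticks to PHP 4 constructs (`var` properties, no visibility keywords).

```php
<?php
// Hypothetical base class: fetches pages with cURL and leaves parsing to subclasses.
class BaseScraper
{
    var $userAgent = 'Mozilla/4.0 (compatible; ExampleScraper)'; // assumed value
    var $timeout   = 30;  // seconds
    var $lastError = '';

    // Fetch a URL and return the response body as a string, or false on failure.
    function fetch($url)
    {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the body instead of printing it
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects
        curl_setopt($ch, CURLOPT_TIMEOUT, $this->timeout);
        curl_setopt($ch, CURLOPT_USERAGENT, $this->userAgent);
        $body = curl_exec($ch);
        if ($body === false) {
            $this->lastError = curl_error($ch);
        }
        curl_close($ch);
        return $body;
    }

    // Site-specific subclasses override this to turn HTML into an array of records.
    function parse($html)
    {
        return array();
    }

    // Fetch a page and parse it; returns an empty array if the fetch fails.
    function scrape($url)
    {
        $html = $this->fetch($url);
        if ($html === false) {
            return array();
        }
        return $this->parse($html);
    }
}

// Hypothetical derived class for one directory site.
class DirectoryScraper extends BaseScraper
{
    function parse($html)
    {
        $sites = array();
        // Assumed listing format: <li><a href="URL">Title</a> - Description</li>
        $pattern = '/<li>\s*<a href="([^"]+)">([^<]+)<\/a>\s*-\s*([^<]*)<\/li>/i';
        if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
            foreach ($matches as $m) {
                $sites[] = array(
                    'url'         => $m[1],
                    'title'       => trim($m[2]),
                    'description' => trim($m[3])
                );
            }
        }
        return $sites;
    }
}
```

Keeping `fetch()` separate from `parse()` means the site-specific pattern can be checked against a saved copy of a page without touching the network.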
This is not a database job. Information will simply be harvested into an array and saved to an XML file.
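The array-to-XML step might look something like the sketch below. The function names `sitesToXml` and `saveXml`, the `<sites>`/`<site>` element names, and the record fields are hypothetical; it also assumes every array key is a valid XML element name. It uses `fopen`/`fwrite` rather than `file_put_contents`, since the latter is not available in PHP 4.

```php
<?php
// Build an XML string from an array of records (each record is field => value).
// Element names are assumptions for illustration.
function sitesToXml($sites)
{
    $xml = "<?xml version=\"1.0\"?>\n<sites>\n";
    foreach ($sites as $site) {
        $xml .= "  <site>\n";
        foreach ($site as $field => $value) {
            // Escape &, < and > so the values cannot break the markup.
            $xml .= '    <' . $field . '>' . htmlspecialchars($value) . '</' . $field . ">\n";
        }
        $xml .= "  </site>\n";
    }
    $xml .= "</sites>\n";
    return $xml;
}

// Write the XML string to a file; returns true on success, false otherwise.
function saveXml($path, $xml)
{
    $fp = fopen($path, 'w');
    if (!$fp) {
        return false;
    }
    fwrite($fp, $xml);
    fclose($fp);
    return true;
}
```

Usage would be along the lines of `saveXml('sites.xml', sitesToXml($sites));`, where `$sites` is the harvested array and the file name is a placeholder.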
I have experience with cURL and with extended classes in PHP. I am familiar enough with regular expressions that, as long as the section of the site containing the information you need collected follows a reasonably consistent format, it will not be a problem.
Dear Sir, we are an efficient and dedicated team of professionals. We offer extensive experience and professionalism to deliver quality work. We provide post-development support until all of your requirements are met. 100% satisfaction guaranteed. Thanks.