I need a scraper written in PHP with MySQL support that can run as a cron job, so that it harvests emails from Craigslist 24/7. Here are the requirements:
- Harvest from all sub-categories in the community, gigs, housing, jobs, personals, sales, and services sections for ALL cities (including international)
- Organize data in the database with fields for email address, harvested category, and date harvested
- Remove any duplicate emails
- Some sort of 'multithreaded' PHP solution for fastest possible harvesting
- SOCKS/HTTP proxy support
Put simply, I want a fast PHP application that will allow me to harvest every available email on Craigslist and categorize them in a database accordingly.