I need some data mined off a large website.
The project must be done through proxies and over a given time.
The goal is to collect several million records to fill a leads form. This can be exported to a CSV file.
It will work as follows.
Step 1.) Search for leads with a minimum number of reviews and a minimum overall score.
Step 2.) If the lead qualifies, collect its website, phone number, city, state, description, and about us text.
Step 3.) If the lead has a website, scrape it to find an email address.
Step 4.) Write a new entry only if no duplicate record is found.
All data from step 2 is available on the same page.
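The qualification check in Step 1 can be sketched as below. This is only an illustration: the `Listing` fields and the `min_reviews` / `min_score` thresholds are hypothetical names, since the posting does not state the actual values.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    """Hypothetical shape of one search result; field names are assumptions."""
    name: str
    review_count: int
    score: float


def qualifies(listing: Listing, min_reviews: int, min_score: float) -> bool:
    """Return True when the lead meets both minimums from Step 1."""
    return listing.review_count >= min_reviews and listing.score >= min_score
```

For example, `qualifies(Listing("Acme Plumbing", 25, 4.5), min_reviews=10, min_score=4.0)` would let the lead through to Step 2.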
For Step 3, the scraper will look for email addresses in this order: Footer, Contact Us, About Us.
Of course there are going to be variations on the names, for example "contact us" vs. "contact"; just make it work 100% of the time.
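One possible way to handle the naming variations is to normalize each link label before matching it. The sketch below is an assumption about how that could be done, not a required design: `classify_link` and the keyword patterns are hypothetical, and the patterns would need to grow as new variations turn up on real sites.

```python
import re

# Hypothetical keyword patterns, checked in the posting's priority order.
_PATTERNS = [
    ("contact", re.compile(r"\bcontact")),
    ("about", re.compile(r"\babout")),
]


def classify_link(label: str):
    """Map a link label such as 'Contact Us' or 'ABOUT-US' to a page kind."""
    # Lowercase and collapse punctuation/whitespace so variants compare equal.
    text = re.sub(r"[^a-z0-9]+", " ", label.lower()).strip()
    for kind, pattern in _PATTERNS:
        if pattern.search(text):
            return kind
    return None
```

With this, "Contact", "Contact Us", and "contactus" all map to `"contact"`, while an unrelated label such as "Products" returns `None`.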
The scraper will take up to 5 email addresses per lead, saved as email-1, email-2, etc. This helps the lead process get the email address correct.
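A minimal sketch of the email collection, assuming the text of each page section has already been downloaded and is passed in priority order (footer first, then contact, then about). The regular expression is a common approximation and not a full email validator; `collect_emails` and `email_columns` are hypothetical helper names.

```python
import re

# Approximate email pattern; it will miss some unusual but valid addresses.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
MAX_EMAILS = 5


def collect_emails(sections):
    """Return up to MAX_EMAILS unique emails, keeping section priority order."""
    found = []
    seen = set()
    for text in sections:
        for match in EMAIL_RE.findall(text):
            email = match.lower()
            if email not in seen:
                seen.add(email)
                found.append(email)
                if len(found) == MAX_EMAILS:
                    return found
    return found


def email_columns(emails):
    """Spread the emails over email-1 .. email-5, padding with empty strings."""
    return {
        f"email-{i + 1}": (emails[i] if i < len(emails) else "")
        for i in range(MAX_EMAILS)
    }
```

Emails found in the footer land in email-1 before anything found on the contact or about pages, which matches the search order described above.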
Accessing this data will require searching across multiple cities and trades to find relevant people.
Before a new record is added, it has to be checked against existing entries so that no record is written more than once.
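The duplicate check could look something like the sketch below. The posting does not say what makes two records duplicates, so treating a normalized phone number plus website as the key is an assumption, and `record_key` and `DedupeIndex` are hypothetical names.

```python
import re


def record_key(record):
    """Build a comparison key; phone + website is an assumed definition."""
    phone = re.sub(r"\D", "", record.get("phone", ""))  # digits only
    website = record.get("website", "").lower().rstrip("/")
    website = re.sub(r"^https?://(www\.)?", "", website)  # drop scheme and www
    return (phone, website)


class DedupeIndex:
    """In-memory set of keys already written."""

    def __init__(self):
        self._seen = set()

    def add_if_new(self, record):
        """Return True and remember the record if it has not been seen yet."""
        key = record_key(record)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

At 10 to 20 million records an in-memory set may still be workable, but a database table with a unique index on the same key would be an alternative if the run has to survive restarts.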
I'm looking to obtain 10 - 20 million records.
There must be a CSV file for the full data dump,
as well as individual files for each state, in case of size issues.
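The combined dump plus per-state files can be sketched with Python's standard `csv` module. The column names, the `all_states.csv` filename, and the `append_record` helper are assumptions made for illustration; only the list of collected fields comes from the posting.

```python
import csv
import os
import re

# Assumed column layout based on the fields listed in Step 2 and the emails.
FIELDS = ["website", "phone", "city", "state", "description", "about_us",
          "email-1", "email-2", "email-3", "email-4", "email-5"]


def append_record(record, out_dir):
    """Append one record to the combined CSV and to its state's CSV."""
    os.makedirs(out_dir, exist_ok=True)
    raw_state = record.get("state") or ""
    # Keep only letters and digits so the state is safe to use as a filename.
    state = re.sub(r"[^A-Za-z0-9]+", "_", raw_state.strip()).upper() or "UNKNOWN"
    for filename in ("all_states.csv", f"{state}.csv"):
        path = os.path.join(out_dir, filename)
        is_new = not os.path.exists(path)
        with open(path, "a", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=FIELDS,
                                    extrasaction="ignore")
            if is_new:
                writer.writeheader()  # header only once per file
            writer.writerow(record)  # missing fields are written as empty
```

Reopening the files for every record keeps the sketch short; for millions of rows it would likely be faster to keep the file handles open or write in batches.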
Please let me know what you have to offer and we can start immediately.
27 freelancers are bidding an average of $700 for this job
I will write a custom script to do the scraping. Are you looking for me to run the script, or will you be able to run it yourself? Can you let me know the website address?
Hi sir, I can scrape the full information you need from a complex website. I charge $100 per 1 million contacts. We can discuss more with a sample. Thanks.
Hello sir, I have seen your project. I am a data entry and data mining expert from Bangladesh. We are an individual team working on [login to view URL] your project we can help you best about that.