I am looking for a developer highly experienced in Google scraping to build a desktop Google scraper, which takes and Googles a set of keywords and returns the scraping result in an MS Word file.
An important part of the scraper is a Google request timer. A keyword set will have up to 4 keywords, and when Googling a keyword set, the interval between keywords (i.e., between Google requests) will be a random number between 1 and 2 seconds, to emulate human clicks. The timer sets a limit on requests, in terms of the number of requests per unit time and/or the minimum interval between keyword sets, and issues a warning when the limit is breached. This limit should be the maximum rate of Google requests that can be made without triggering a captcha. I would expect your advice on setting it. We may experiment a bit as well.
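To make the timer requirement concrete, here is a minimal Python sketch of how I picture it. It is only an illustration under my own assumptions, not a specification: the class name `RequestTimer`, its parameters (`max_requests`, `window_seconds`, `min_set_gap`) and the `fetch` callback are all hypothetical, and the actual limit values are exactly what I would like your advice on.

```python
import random
import time
from collections import deque


class RequestTimer:
    """Paces the requests of a keyword set and records a warning on a breach.

    All names and limit values here are hypothetical placeholders.
    """

    def __init__(self, max_requests, window_seconds, min_set_gap,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests      # allowed requests per window (to be tuned)
        self.window_seconds = window_seconds  # length of the sliding window, in seconds
        self.min_set_gap = min_set_gap        # minimum seconds between keyword sets
        self.clock = clock                    # injectable, so the timer can be tested
        self.sleep = sleep
        self.sent = deque()                   # timestamps of recent requests
        self.last_set_end = None
        self.warnings = []

    def _record_request(self):
        now = self.clock()
        self.sent.append(now)
        # Forget requests that have fallen out of the sliding window.
        while now - self.sent[0] >= self.window_seconds:
            self.sent.popleft()
        if len(self.sent) > self.max_requests:
            self.warnings.append(
                f"rate limit breached: {len(self.sent)} requests "
                f"in the last {self.window_seconds} s")

    def run_set(self, keywords, fetch):
        """Send one keyword set (up to 4 keywords) through `fetch`."""
        if len(keywords) > 4:
            raise ValueError("a keyword set has at most 4 keywords")
        now = self.clock()
        if (self.last_set_end is not None
                and now - self.last_set_end < self.min_set_gap):
            self.warnings.append(
                f"keyword sets less than {self.min_set_gap} s apart")
        results = []
        for i, keyword in enumerate(keywords):
            if i > 0:
                # Random 1 to 2 second pause between requests, like human clicks.
                self.sleep(random.uniform(1.0, 2.0))
            self._record_request()
            results.append(fetch(keyword))
        self.last_set_end = self.clock()
        return results
```

In use, `fetch` would be whatever function actually sends one Google request; for instance, `RequestTimer(max_requests=20, window_seconds=60, min_set_gap=10).run_set(["a", "b"], fetch)`, where the numbers are placeholders rather than recommendations. The sketch only warns on a breach and does not block, which matches the behaviour described above.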
• The scraper should submit requests exactly as human clicks do. I will therefore approve it only if it can make requests without triggering a captcha at a rate very similar to that of human clicks.
• Preferably, the Googling should run in the background and cause no interruption to the user’s desktop work.
• Each result in the Word file should have a clickable link and be formatted (bold, colors, etc.) for easy viewing.
• Google changes the markup format once or twice a year, upon which I would need you to fix the scraper very quickly. We will have a separate contract for this.
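For the Word output requirement in the list above, the sketch below shows one possible way to produce a .docx file in which each result title is a bold, blue, underlined clickable link. It is an illustration under my own assumptions: the function name `build_docx` and the `(title, url, snippet)` result layout are hypothetical, and it writes the Office Open XML parts by hand with the Python standard library only to show the structure. A real implementation might well prefer a library such as python-docx.

```python
import io
import zipfile
from xml.sax.saxutils import escape, quoteattr

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
XML_DECL = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'

CONTENT_TYPES = (
    XML_DECL +
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    '</Types>'
)

ROOT_RELS = (
    XML_DECL +
    f'<Relationships xmlns="{PKG_RELS_NS}">'
    f'<Relationship Id="rId1" Type="{R_NS}/officeDocument" '
    'Target="word/document.xml"/>'
    '</Relationships>'
)


def build_docx(results):
    """Return the bytes of a .docx file for (title, url, snippet) tuples."""
    paragraphs, link_rels = [], []
    for i, (title, url, snippet) in enumerate(results, start=1):
        rid = f"rId{i}"
        # Each external link is declared once in the document's relationships part.
        link_rels.append(
            f'<Relationship Id="{rid}" Type="{R_NS}/hyperlink" '
            f'Target={quoteattr(url)} TargetMode="External"/>')
        # Title: bold, blue, underlined run wrapped in a hyperlink element.
        paragraphs.append(
            f'<w:p><w:hyperlink r:id="{rid}"><w:r><w:rPr><w:b/>'
            '<w:color w:val="0000FF"/><w:u w:val="single"/></w:rPr>'
            f'<w:t>{escape(title)}</w:t></w:r></w:hyperlink></w:p>')
        # Snippet: plain paragraph under the title.
        paragraphs.append(
            '<w:p><w:r><w:t xml:space="preserve">'
            f'{escape(snippet)}</w:t></w:r></w:p>')
    document = (
        XML_DECL +
        f'<w:document xmlns:w="{W_NS}" xmlns:r="{R_NS}"><w:body>'
        + "".join(paragraphs) +
        '</w:body></w:document>')
    doc_rels = (
        XML_DECL + f'<Relationships xmlns="{PKG_RELS_NS}">'
        + "".join(link_rels) + '</Relationships>')
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("[Content_Types].xml", CONTENT_TYPES)
        archive.writestr("_rels/.rels", ROOT_RELS)
        archive.writestr("word/document.xml", document)
        archive.writestr("word/_rels/document.xml.rels", doc_rels)
    return buffer.getvalue()
```

The returned bytes can be written straight to a file ending in .docx. Colors, fonts and layout beyond the title link are left out here and would be settled during the project.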
In fact, a robust Google scraper is already available at Compunect: [url removed, login to view] The difference is that Compunect uses proxies and a powerful server, while my scraper uses only one IP and a PC. Also, the scraped content is not exactly the same. Compunect also publishes a fix whenever Google changes the format. Using Compunect's code will make your job much easier.
About the startup project
This job is the first step of my startup project Engle, and I am very much hoping to find through it a long-term technical partner. In an effort to convince you that this is a project you can be proud of, I present a brief intro.
I am an Oxford-trained linguist and have invested many years into building Engle. Engle is a Google-based language tool that helps non-native English speakers write correct English. When non-native speakers are stuck with their English, the first thing they do is Google it. Google is currently the most powerful English reference tool. However, Google is not convenient for this purpose because this is not what Google is built for. Engle optimizes Google for English reference and learning, towards creating an English reference tool that is much more powerful than Google itself.
One-minute demo of Engle ([url removed, login to view])
1. What experience do you have with Google scraping?
2. How would you set the timer limit? How many four-keyword sets do you think can be Googled within a minute and within an hour without a captcha?
3. Can you implement all the features I have listed above, or are there any issues? Details appreciated.
4. How quickly will you be able to fix the scraper when Google markup changes, and how much will each fix cost?
5. What is your main coding language? Can you code in Python?
23 freelancers are bidding on average $483 for this job
Hi, I can build a desktop tool that fetches the data. I can work on a demo if you like. Thanks. Relevant Skills and Experience . Proposed Milestones $833 USD - .
Hi. I'm a professional software developer. I have a lot of experience in web scraping using Python. Relevant Skills and Experience: Web Scraping, Python, Google Scraping. Proposed Milestones: $350 USD - final