Write a script that runs a web crawler from a local PC. The input file will contain the domain name where the crawl starts and a list of keywords to search for (max 10). It will also contain strings that must appear in the crawled pages and strings that are not allowed in the output URLs. The output file should list every URL found whose page contains all or some of the given keywords. It should also contain some simple calculations: the percentage of keywords found in the text and in the URL, and a corresponding ranking (e.g. keywords_in_text: apple,bananas,tree, URL: [login to view URL] Output: [login to view URL]; Output: Keyword "tree" is not found in the text, "apple" and "bananas" are found). It can be based on Scrapy or something similar. It has to open multiple connections at the same time so that it can handle a large number of crawls.
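The keyword calculation described above could look roughly like the following. This is only a sketch under assumptions the posting does not settle: matching is case-insensitive substring matching, the percentage is the share of keywords found, and pages are ranked by text percentage first, then URL percentage. The function names `keyword_report` and `rank_pages` and the example URL are made up for illustration.

```python
def keyword_report(keywords, text, url):
    """Report which keywords occur in the page text and in the URL.

    Assumption: case-insensitive substring matching.
    """
    text_l, url_l = text.lower(), url.lower()
    in_text = [k for k in keywords if k.lower() in text_l]
    in_url = [k for k in keywords if k.lower() in url_l]
    total = len(keywords)
    return {
        "url": url,
        "found_in_text": in_text,
        "missing_in_text": [k for k in keywords if k not in in_text],
        "found_in_url": in_url,
        # Share of the keywords that were found, as a percentage.
        "pct_in_text": 100.0 * len(in_text) / total if total else 0.0,
        "pct_in_url": 100.0 * len(in_url) / total if total else 0.0,
    }


def rank_pages(reports):
    """Order reports best first: text percentage, then URL percentage (assumed rule)."""
    return sorted(
        reports,
        key=lambda r: (r["pct_in_text"], r["pct_in_url"]),
        reverse=True,
    )


if __name__ == "__main__":
    report = keyword_report(
        ["apple", "bananas", "tree"],
        "An apple and two bananas.",
        "https://example.com/apple",  # placeholder URL
    )
    print(report["found_in_text"], report["missing_in_text"])
    print(round(report["pct_in_text"], 1), round(report["pct_in_url"], 1))
```

Substring matching is the simplest choice, but it also counts "tree" inside "street"; whole-word matching would need a tokenizer or a regular expression with word boundaries.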
32 freelancers are bidding an average of €125 for this job
Hello, how are you? I am available full time and can start work immediately. Please contact me so we can discuss your project. Thanks for posting.
Hello, I am a scraping expert and have completed similar projects in the past. I can help you with this project easily and can provide more details by PM. Any questions are welcome. Thanks!
Hi, I am a Python developer and can write a web crawler script using Python, BeautifulSoup, and requests. It will use multiprocessing to handle a large number of operations at a time.
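A minimal sketch of fetching many pages at once, for illustration only. Note that it uses a standard-library thread pool rather than the multiprocessing approach named in the bid, on the assumption that downloading is mostly waiting on the network. The helper names `fetch` and `fetch_all`, the worker count, and the timeout are all hypothetical; `urllib` stands in for requests so the sketch has no third-party dependencies.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download one page and return its body as text (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def fetch_all(urls, fetcher=fetch, max_workers=8):
    """Fetch URLs concurrently; map each URL to its text, or None on failure."""

    def safe(url):
        # One failing page should not abort the whole crawl.
        try:
            return fetcher(url)
        except Exception:
            return None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input URLs.
        return dict(zip(urls, pool.map(safe, urls)))
```

Passing the fetcher as a parameter keeps the concurrency logic separate from the HTTP client, so requests or another client could be swapped in without touching `fetch_all`.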