In progress

Develop an automated server-based web scraping application

Developer needed to build an "intelligent", automated, server-based web scraping application that can distinguish business websites from non-business websites in a large list of website URLs (over 200k).

(A business website is a website that belongs to a business providing services.)

The proposed way to do this is to

1) develop a server-based application which will carry out the following steps:

a) verify whether the URL corresponds to an active website

b) browse the website and identify "intra site" links (internal links)

c) determine whether the text of the link includes a particular keyword (from a pre-determined set of keywords, such as "about us", "services", "company", "clients"...)

For example: www. website .com/[url removed, login to view] - this link will give a "positive" result since the word "services" appears in the link. (The word "services" would have been pre-determined by the user.)
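Steps b) and c) could be sketched roughly as follows in Python, using only the standard library. This is an illustration under assumptions, not part of the brief: the names KEYWORDS, LinkCollector, internal_links and is_positive are hypothetical, a leading "www." is ignored when comparing hosts, and a keyword is looked for in both the link text and the URL path, since the brief mentions both.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Hypothetical keyword set; in the brief these are supplied by the user.
KEYWORDS = {"about us", "services", "company", "clients"}


class LinkCollector(HTMLParser):
    """Collects (href, anchor text) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())  # collapse whitespace
            self.links.append((self._href, text))
            self._href = None


def _host(url):
    """Lower-cased host name with any leading 'www.' dropped (assumption)."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def internal_links(page_url, html):
    """Return (absolute_url, anchor_text) for links on the same host as page_url."""
    parser = LinkCollector()
    parser.feed(html)
    result = []
    for href, text in parser.links:
        absolute = urljoin(page_url, href)  # resolve relative links
        same_host = _host(absolute) == _host(page_url)
        if urlparse(absolute).scheme in ("http", "https") and same_host:
            result.append((absolute, text))
    return result


def is_positive(page_url, html, keywords=KEYWORDS):
    """True if any internal link's text or URL path contains a keyword."""
    for url, text in internal_links(page_url, html):
        path = urlparse(url).path.lower().replace("-", " ").replace("_", " ")
        if any(kw in text.lower() or kw in path for kw in keywords):
            return True
    return False
```

With this sketch, a page whose only internal link points to "/our-services.html" would count as positive, while a link to another site's "/services" page would be ignored because it is not an internal link.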

2) develop a web interface from which the user must be able to:

- upload a list of URLs to scrape (up to 200k or more if possible)

- add and remove keywords

- start the "mining" process, pause it, stop it, resume it

- see a real-time count of URLs processed, along with counts of active websites, positive results and negative results

- download the URL lists of active websites, positively identified websites and negative ones
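The start/pause/stop/resume controls and the real-time counts could be backed by something like the following Python sketch. It is only one possible design: the class name MiningJob and its methods are hypothetical, and it assumes that an inactive website is counted as processed but as neither positive nor negative, which the brief does not specify.

```python
import threading


class MiningJob:
    """Thread-safe counters plus start/pause/resume/stop flags for one run."""

    def __init__(self):
        self._lock = threading.Lock()
        self._running = threading.Event()   # set = workers may proceed
        self._stopped = threading.Event()   # set = workers should exit
        self.counts = {"processed": 0, "active": 0, "positive": 0, "negative": 0}

    def start(self):
        self._stopped.clear()
        self._running.set()

    def pause(self):
        self._running.clear()

    def resume(self):
        self._running.set()

    def stop(self):
        self._stopped.set()
        self._running.set()   # wake any paused workers so they can exit

    def wait_if_paused(self):
        """Block while paused; return False once the job has been stopped."""
        self._running.wait()
        return not self._stopped.is_set()

    def record(self, active, positive):
        """Update the counters for one processed URL."""
        with self._lock:
            self.counts["processed"] += 1
            if active:  # assumption: inactive sites are neither positive nor negative
                self.counts["active"] += 1
                self.counts["positive" if positive else "negative"] += 1

    def snapshot(self):
        """Return a copy of the counters, e.g. for the web interface to poll."""
        with self._lock:
            return dict(self.counts)
```

Each worker would call wait_if_paused() before taking the next URL and record() after finishing it, while the web interface polls snapshot() to refresh the displayed counts.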

IMPORTANT NOTES:

The application needs to be multi-threaded and efficient for maximum processing speed.

PLEASE ONLY BID IF YOU ARE THE DEVELOPER. (NO AGENCIES PLEASE)

PLEASE INDICATE IN PMB WHAT DEVELOPMENT LANGUAGE YOU INTEND TO USE
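As an illustration of the multi-threading note above, and of step a) (checking that a URL is active), a minimal Python sketch could use a thread pool, since the work is mostly waiting on the network. The function names fetch and classify_all, the worker count of 32 and the 10-second timeout are assumptions for illustration, not requirements from the brief.

```python
import http.client
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch(url, timeout=10):
    """Return the page HTML if the URL can be fetched, else None (treated as inactive)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError, http.client.HTTPException):
        # Covers DNS failures, timeouts, HTTP 4xx/5xx errors and malformed URLs.
        return None


def classify_all(urls, classify, fetcher=fetch, workers=32):
    """Fetch urls concurrently; return {url: 'inactive' | 'positive' | 'negative'}.

    classify(url, html) is a caller-supplied function returning True for a
    positive result, for example a keyword check on internal links.
    """
    def work(url):
        html = fetcher(url)
        if html is None:
            return url, "inactive"
        return url, "positive" if classify(url, html) else "negative"

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(work, urls))
```

Passing the fetcher in as a parameter keeps the sketch testable without network access. For a list of 200k URLs the results would more likely be written to the database as they arrive rather than collected in one dictionary.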

Thanks for your bid

Skills: ASP, Data Mining, MySQL, PHP, Web Scraping


About the employer:
(105 reviews) London, United Kingdom

Project ID: #2505024

Awarded to:

luisurraca

I would like to work on this project. Planning on using Ruby on Rails and MySQL for the web server and Nokogiri (very popular Ruby gem for web scraping). I would use background jobs so the application is usable during …

$720 USD in 7 days
(2 reviews)
1.6

5 freelancers are bidding on average $624 for this job

mantislin

Hi sir, please check PM, thx Kimi.

$750 USD in 6 days
(184 reviews)
6.9
zk230182

Hi I hope you'll be fine. I've studied your project specifications and I'm ready to provide you solution that fits our requirement. I am best in coding. I will do my best to make your project more effective. I will giv…

$750 USD in 7 days
(45 reviews)
6.5
phpXpertbd

I worked on many similar projects, I have big experience in data mining projects. I can finish this task in short time, with the best quality.

$750 USD in 15 days
(30 reviews)
6.2
aoefmpes

pl check your inbox

$350 USD in 10 days
(44 reviews)
5.1
johnrio

Let us get this done for you

$550 USD in 10 days
(21 reviews)
3.7