Closed

News website scraper/bot, news aggregator

News Aggregator/Scraper/Bot

We need to build an infrastructure capable of scraping/indexing approximately 50-100 news websites.

The bots/scrapers should quickly request, extract and parse news article URLs as they are published to the various news websites, in near real time. The major focus of this project is to ensure that 1) news content/URLs are parsed and entered into a database as fast as possible (within minutes of publication), and 2) ALL news content/article URLs are captured (no missed content). We are looking to build a robust, easily scaled news aggregator system of bots/scrapers that can handle a load of 10,000 - 20,000 news articles per 24 hours, and where adding *new* sources of news (publishers, websites) is reasonably easy. We are open to ideas on how this is achieved.
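One possible way to meet the "capture everything, but only once" requirement is to poll each source's listing page and keep only the article URLs that have not been seen before. The TypeScript sketch below illustrates that idea only; the `NewsSource` shape, the `SeenUrls` class and the in-memory `Set` are assumptions made for illustration and are not part of the brief. A real deployment would more likely rely on a unique index in the SQL database instead of process memory.

```typescript
// Hypothetical description of one news source. Under this assumption,
// adding a new publisher means adding one more entry of this shape.
interface NewsSource {
  publisherUrl: string;        // the publisher's site
  listingUrl: string;          // page or feed polled for new article links (assumed)
  pollIntervalSeconds: number; // how often the listing is re-fetched
}

// Normalise a URL so that trivially different forms of the same
// article URL map to a single key.
function normaliseUrl(raw: string): string {
  const url = new URL(raw); // the WHATWG parser lower-cases scheme and host for http(s)
  url.hash = "";            // drop any #fragment
  // Drop a trailing slash, except on the root path.
  if (url.pathname.length > 1 && url.pathname.endsWith("/")) {
    url.pathname = url.pathname.slice(0, -1);
  }
  return url.toString();
}

// In-memory record of article URLs that have already been captured.
class SeenUrls {
  private readonly seen = new Set<string>();

  // Returns the normalised URLs that were not seen before, and remembers them.
  filterNew(urls: string[]): string[] {
    const fresh: string[] = [];
    for (const raw of urls) {
      const key = normaliseUrl(raw);
      if (!this.seen.has(key)) {
        this.seen.add(key);
        fresh.push(key);
      }
    }
    return fresh;
  }
}
```

For scale, 10,000 - 20,000 articles per 24 hours averages roughly 7 to 14 new articles per minute across all sources, so the delay between publication and capture is governed mostly by how often each listing is polled.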

The scraper/bot should capture the following information:

- Unique article URL

- Headline

- Body of text

- Author name

- Publisher URL (e.g. [url removed, login to view])

The indexed information should be entered into an SQL database, with a unique article_ID and with tables and fields that correspond to the items listed above.
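As a rough illustration of how the captured fields could map onto a database row, the TypeScript sketch below builds a parameterised INSERT statement. The table name `articles`, the column names, the `?` placeholder style and the idea that the database itself generates `article_ID` are all assumptions for the sake of the example; the brief does not specify a schema or a database driver.

```typescript
// The five captured fields from the brief, as one record.
interface Article {
  articleUrl: string;
  headline: string;
  bodyText: string;
  authorName: string | null; // assumption: some articles carry no byline
  publisherUrl: string;
}

// A statement plus its bound values, ready to hand to a database driver.
interface SqlStatement {
  sql: string;
  params: (string | null)[];
}

// Builds a parameterised INSERT so that scraped text is never
// concatenated into the SQL string. Table and column names are hypothetical;
// article_ID is assumed to be generated by the database on insert.
function buildInsert(article: Article): SqlStatement {
  return {
    sql:
      "INSERT INTO articles " +
      "(article_url, headline, body_text, author_name, publisher_url) " +
      "VALUES (?, ?, ?, ?, ?)",
    params: [
      article.articleUrl,
      article.headline,
      article.bodyText,
      article.authorName,
      article.publisherUrl,
    ],
  };
}
```

A unique constraint on the article URL column (again an assumption) would let the database reject duplicates even if two scrapers pick up the same article at the same moment.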

A list of the relevant news websites to be indexed can be obtained on request. They are primarily Australian, New Zealand, US and European news websites. All are English-language and properly formatted.

We utilise Linode servers (Ubuntu OS) and operate an SQL database. We would prefer the scraper/bots to be completed in Node.js.

We have a Proxy Rotator that can manage requests from the scraper/bots, so as to mask repeated visits to news websites.

*** Please note: we are seeking developers with strong experience in scraping/content extraction automation. Our primary focus is to ensure that ALL new news articles added to a news website from the start point are captured, AND that all news articles are captured as quickly as possible. This content needs to be collected 24/7, and within minutes of publication on the news website. Please only post a proposal or express interest if you believe you have the necessary skills to complete this task. ***

Skills: PHP, Software Architecture

See more: software developers websites, proxy sources, publisher website, Node js website, node js project, node express, news website, news aggregator, name new zealand, load express, linode, automation bot, php open url extract, indexing url, build php scraper, node express proxy, fast proxy scraper, looking articles publication, express proxy, website database extract, url mask, build bot php, software architecture proposal, open news website, php extract information website

About the employer:
(0 reviews) Australia

Project ID: #5988777

40 freelancers are bidding an average of $2700 for this job

esolzsales

There are some questions before we place a firm bid. Let us know if you have some time so we can both converse regarding the questions / doubts we have.

$5154 USD in 60 days
(248 reviews)
9.5
omsoftware

Hello, I have read and understood your project details for developing a news website with scraping and news aggregation, and we are sure to deliver the best quality work. We have developed several similar websites, please check...

$3500 USD in 45 days
(95 reviews)
8.6
codeguru786

Hi, we have read all the requirements and are very confident we can do this project from start to finish. We do have some questions to raise when we discuss the project with you. Due to the change of bid system at Freelancer...

$2782 USD in 25 days
(260 reviews)
8.4
krishdts

Greetings, this is Kris from DreamzTech Australia. Please open the chat room so we can discuss this in detail. To get an idea of the quality of our work you can check the link below: [url removed, login to view]...

$2680 USD in 35 days
(77 reviews)
8.2
intelligencei

Hi! Thanks for posting this requirement. We are TOP RATED professionals at [url removed, login to view], working since 2004. We are experts in design and development work. You may refer to our reviews and ranking here: http...

$2886 USD in 35 days
(84 reviews)
8.2
SigmaVisual

Dear Client, I can help with your project. We already have experience working on similar projects. Please see below to get an idea of our experience: Amazon/Ebay Bots: [url removed, login to view]...

$1578 USD in 30 days
(248 reviews)
7.9
buraqtech

1. Vollks Australian Online Store: We are currently completely redesigning and reimplementing an online store for our Australian client, using the latest stable release of X-Cart to implement all his require...

$2551 USD in 35 days
(88 reviews)
7.9
PSSPL2000

Hi, greetings! Thank you very much for giving me the opportunity to bid on your project. I have gone through your requirements and can surely help you with this wonderful project. Can you please share detail requir...

$2368 USD in 30 days
(47 reviews)
7.6
gopalvora

Hi, I have gone through the details of your project and we find it well within our capabilities. I offer a wide range of services, including web design, PHP/MySQL web application development, open source platforms like Joo...

$1546 USD in 30 days
(254 reviews)
7.3
RaddyxTechnology

Hi, I have gone through your project description and can assure you of the best solution. I see this as an opportunity and am ready to deliver my best. We will be using open source LAMP technology to dev...

$2319 USD in 45 days
(59 reviews)
7.2
onelinewebdeUK

Thanks for your consideration, if you would like to discuss the project further do get in touch and I’ll give you a call.

$2736 USD in 35 days
(14 reviews)
7.0
softwaredep

Dear Sir/Madam, Your requirements + Our effort = Valuable Working solution. That's what we deliver. We have a team of dedicated programmers to complete the project to perfection. We are looking forward to hear bac...

$2500 USD in 30 days
(99 reviews)
7.0
fattahaabdul

Hello Sir, we have gone through the details you have provided and would be pleased to work on this with you to deliver the results you expect. We are sure you will not be disappointed if you give us this...

$3092 USD in 30 days
(18 reviews)
6.9
ganeshbaid

Hello, working example websites: [url removed, login to view] [url removed, login to view] [url removed, login to view]. We are experts in web applications and web design (mobile responsive) and our reviews speak for our work, kindly have a l...

$3000 USD in 30 days
(139 reviews)
6.9
otssols

Hello, thanks for your post! As per my understanding: [url removed, login to view] want us to develop a bot which will crawl 50-100 news websites to scrape info and store it in an SQL database. [url removed, login to view] achieve this objectiv...

$2268 USD in 30 days
(15 reviews)
6.8
zahmaci

I clearly understand your needs and I am ready to take on your project. If you check my recent reviews you can see my skills. Kind regards

$6666 USD in 25 days
(203 reviews)
6.6
helmot

Edit: IF YOU TELL ME EXACTLY WHAT YOU WANT TO DO, I may be able to do what you want for just $1500, with an unlimited number of news websites! I have 10+ years of experience in Python programming. If you take...

$1500 USD in 10 days
(55 reviews)
6.6
C0RETECHIES

Hello, to quickly introduce myself, I am with Core Techies, who have been operating for the last 6+ years, with expert developers in Website Development & Customization/Mobile Application [url removed, login to view] are ready to discu...

$2577 USD in 15 days
(45 reviews)
6.0
saginfotech

Hi, I went through your requirement to create a website in PHP for showing news content. The content will be scraped from different news websites. The job can be done with all the specifications mentioned in the descri...

$2525 USD in 45 days
(38 reviews)
5.8
subhomayb

Hi, greetings of the day! We have prior experience in news-related website development, please have a look: [url removed, login to view] [url removed, login to view] [url removed, login to view]...

$1500 USD in 39 days
(23 reviews)
5.6