News website scraper/bot, news aggregator

News Aggregator/Scraper/Bot

We need to build an infrastructure capable of scraping/indexing approximately 50-100 news websites.

The bots/scrapers should quickly request, extract and parse news article URLs as they are published to the various news websites, in near real time. The project has two major goals: 1) news content/URLs are parsed and entered into a database as fast as possible (within minutes of publication), and 2) ALL news content/article URLs are captured, with no content missed. We are looking to build a robust, easily scaled system of bots/scrapers that can handle a load of 10,000 - 20,000 news articles per 24 hours, and where adding *new* sources of news (publishers, websites) is reasonably easy. We are open to ideas on how this is achieved.
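For scale, 20,000 articles per 24 hours works out to roughly 14 new articles per minute across all sources. One possible way to approach the "capture everything, once" goal is to poll each source's listing pages or feeds frequently and keep only URLs not seen before. The sketch below shows that deduplication step only. The names `normalizeUrl` and `SeenUrls` are hypothetical, the in-memory set stands in for whatever persistent store is actually used, and the normalisation rules are assumptions that would need checking against each publisher.

```typescript
// Minimal sketch. normalizeUrl and SeenUrls are hypothetical names,
// not part of the original brief.

/** Reduce trivially different spellings of one URL to a single key. */
function normalizeUrl(raw: string): string {
  const trimmed = raw.trim();
  // Assumption: a fragment (#...) does not identify a different article.
  const hashAt = trimmed.indexOf("#");
  const noFragment = hashAt === -1 ? trimmed : trimmed.slice(0, hashAt);

  // Split into scheme, host and the rest (path plus optional query).
  const match = /^(https?:\/\/)([^\/?]+)(.*)$/i.exec(noFragment);
  if (match === null) {
    return noFragment; // not an absolute http(s) URL; leave it untouched
  }
  const scheme = (match[1] ?? "").toLowerCase();
  const host = (match[2] ?? "").toLowerCase();
  const rest = match[3] ?? "";

  // Assumption: a trailing slash on the path is not significant.
  // Query strings are kept as-is because some sites put the article ID there.
  const path = rest.indexOf("?") !== -1 ? rest : rest.replace(/\/+$/, "");
  return scheme + host + path;
}

/** Remembers every URL key it has been shown (in memory only). */
class SeenUrls {
  private readonly seen = new Set<string>();

  /** Returns the normalised URLs not seen before, and records them. */
  filterNew(urls: string[]): string[] {
    const fresh: string[] = [];
    for (const url of urls) {
      const key = normalizeUrl(url);
      if (!this.seen.has(key)) {
        this.seen.add(key);
        fresh.push(key);
      }
    }
    return fresh;
  }
}
```

Each polling pass would hand the URLs it found to `filterNew` and fetch only what comes back. In a real deployment the set would likely be replaced by a unique index in the database so that restarts and multiple workers stay consistent.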

The scraper/bot should capture the following information:

- Unique article URL

- Headline

- Body of text

- Author name

- Publisher URL (e.g. [login to view URL])

The indexed information should be stored in an SQL database, with tables and fields matching the items listed above and a unique article_ID for each article.
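As an illustration of how the fields listed above could map to a table, here is a minimal sketch. Everything except the field list and `article_ID` is an assumption: the table and column names, the column sizes, the extra `fetched_at` column, and the MySQL-style syntax (the brief only says "SQL database"). A second table for publishers may also make sense but is not shown.

```typescript
// Illustrative only: names, sizes and SQL dialect are assumptions.

/** One scraped article, mirroring the five fields the brief lists. */
interface ArticleRecord {
  articleUrl: string;     // unique article URL
  headline: string;
  body: string;           // body of text
  author: string | null;  // assumption: some articles carry no byline
  publisherUrl: string;
}

// MySQL-style DDL. The unique key on article_url lets the database
// reject duplicates even if two workers find the same article.
const CREATE_ARTICLES_TABLE = `
CREATE TABLE IF NOT EXISTS articles (
  article_id    BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  article_url   VARCHAR(512)    NOT NULL,
  headline      TEXT            NOT NULL,
  body          MEDIUMTEXT      NOT NULL,
  author        VARCHAR(255)    NULL,
  publisher_url VARCHAR(255)    NOT NULL,
  fetched_at    TIMESTAMP       NOT NULL DEFAULT CURRENT_TIMESTAMP,
  UNIQUE KEY uq_article_url (article_url)
)`;

// Parameterised insert; INSERT IGNORE (MySQL) skips rows whose
// article_url already exists instead of raising an error.
const INSERT_ARTICLE = `
INSERT IGNORE INTO articles
  (article_url, headline, body, author, publisher_url)
VALUES (?, ?, ?, ?, ?)`;

/** Values in the same order as the placeholders in INSERT_ARTICLE. */
function toInsertParams(record: ArticleRecord): (string | null)[] {
  return [
    record.articleUrl,
    record.headline,
    record.body,
    record.author,
    record.publisherUrl,
  ];
}
```

The statements are plain strings so they can be passed to whichever Node.js database client is chosen; no particular client library is assumed here.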

A list of the relevant news websites to be indexed can be obtained on request. They are primarily Australian, New Zealand, US and European news websites. All English language, all properly formatted.

We utilise Linode servers (Ubuntu OS) and operate an SQL database. We would prefer the scraper/bots to be completed in Node.js.

We have a Proxy Rotator that can manage requests from the scraper/bots, so as to mask repeated visits to news websites.

*** Please note: we are seeking developers with strong experience in scraping/content extraction automation. Our primary focus is to ensure that ALL new news articles added to a news website from the start point are captured, AND that all news articles are captured as quickly as possible. This content needs to be collected 24/7, and within minutes of publication at the news website. Please only post a proposal or express interest if you believe you have the necessary skills to complete this task. ***

Skills: PHP, Software Architecture

About the employer:
(0 reviews) Australia

Project ID: #5988777

39 freelancers are bidding an average of $2692 on this job


There are some questions before we place a firm bid. Let us know if you have some time so we can both converse regarding the questions / doubts we have.

$5154 USD in 60 days
(313 reviews)

Hi, we have read all the requirements and we are very confident to do this project from start till end. We definitely have some questions when we discuss project with you. Due to the change of bid system at freelancer. ...

$2782 USD in 25 days
(399 reviews)

Hello, I read and understand your project details for developing NEWS web with Scraping and NEWS aggregation and are sure to deliver you BEST quality work. We have developed several similar webs, please check ...

$3500 USD in 45 days
(139 reviews)

Greetings, This is Kris from DreamzTech Australia. Please open the chat room so we can discuss this in detail. To get an idea of the quality of our work you can check the below link: [login to view URL] ...

$2680 USD in 35 days
(91 reviews)

Hi I have gone through the details of your project and we find it well within our capabilities. I offer a wide range of services, including Web design, PHP/MySQL web application development, Open sources like Joo...

$1546 USD in 30 days
(489 reviews)

Edit: IF YOU TELL ME ABOUT WHAT EXACTLY YOU WANT TO DO, I may be able to do what you want with just $1500 with unlimited number of news websites! I have +10 years of experience in Python programming. If you take ...

$1500 USD in 10 days
(177 reviews)

Dear Client, I can help in your project. We already have experience of working on similar projects. Please see below to get idea of our experience: Amazon/Ebay Bots: [login to view URL] ...

$1578 USD in 30 days
(266 reviews)

Hello Sir, We have gone through the details you have provided and would be pleased to work on this with you to deliver the results that you have expected and we are sure you will not be disappointed if you give us this ...

$3092 USD in 30 days
(92 reviews)

1. Vollks Australian Online Store In these days we are complete re-designing and reimplementing an online store for our Australian client in which we are using X-Cart latest stable release to implement his all require...

$2551 USD in 35 days
(97 reviews)

Hi Greetings!! Thank you very much for giving me opportunity to bid on your project. I have gone through your requirement and can surely help you with this wonderful project. Can you please share detail requir...

$2368 USD in 30 days
(50 reviews)

Hello, Thanks for your post! As per my UNDERSTANDING: [login to view URL] want us to develop a bot which will crawl into 50-100 news websites to scrape info & store into a SQL database. [login to view URL] achieve this objectiv...

$2268 USD in 30 days
(36 reviews)

Hello, To quickly introduce myself, I am with Core Techies who are functioning from last 6+ years, having expert developers in Website Development & Customization/Mobile Application [login to view URL] are ready to discu...

$2577 USD in 15 days
(98 reviews)

Hi, I have gone through your project description and could assure you the best of the solution. I am taking it as an opportunity and ready to deliver my best. We will be using open source LAMP technology to dev...

$2319 USD in 45 days
(101 reviews)

Hello, working example websites: [login to view URL] [login to view URL] [login to view URL] We are expert in web application & web design (with mobile responsive) and our review speaks about our work, kindly have a l...

$3000 USD in 30 days
(141 reviews)

Hi, Gone through your project description and looking forward for more discussion. We can build customized/tailor-made infrastructure capable of scraping/indexing approximately 50-100 news websites. We offer ...

$2105 USD in 15 days
(49 reviews)

I clearly understand your needs and I am ready to take your project. If you check my recent reviews you can understand my skills. Kind regards

$6666 USD in 25 days
(203 reviews)

My team is really interested in this project as it is exactly within our scope of expertise: We are WEB application design & development experts. Please kindly visit our website [login to view URL] to learn more about us and ...

$4888 USD in 60 days
(18 reviews)

Dear Sir/Madam, Your requirements + Our effort = Valuable Working solution. That's what we deliver. We have team of dedicated programmers to complete project with perfection. We are looking forward to hear bac...

$2500 USD in 30 days
(100 reviews)

Hello, I am web developer and designer, I am skilled in PHP, MySQL, SQL, WordPress, OpenCart, Ajax, jQuery, HTML, CSS. I can also convert PSD to HTML and [login to view URL] is my recent work examples. I can a...

$2368 USD in 30 days
(54 reviews)

Hi. Went through your requirement to create a website in PHP for showing news content. The content will be scraped from different news websites. The job can be done with all the specifications mentioned in the descri...

$2525 USD in 45 days
(47 reviews)