Overview: The article scraper will be a production library run as a cron script. It will pull articles from a set of pre-specified sources, download them to our data store, and populate a database table with a number of extracted and calculated values.
Inputs: list of sources, list of stocks, date/time range.
Article scraping: The scraper will find all articles from a given source (e.g., Moody’s, [login to view URL]; GuruFocus; Fidelity, [login to view URL]) for a given stock ticker (e.g., AAPL, MMM) within a specific date range. Where available, it will use the source’s search API or search form and take the published date from the search results; otherwise the date may need to be extracted from the article itself.
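The date handling described above could be sketched roughly as follows. This is only an illustration under assumptions: `SearchResult`, `articles_in_range`, and the `date_from_article` callback are hypothetical names, not part of any existing source API, and no real network calls are made.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class SearchResult:
    """One hit from a source's search API or form (hypothetical shape)."""
    url: str
    published: Optional[date]  # None when the results page omits the date


def articles_in_range(
    results: Iterable[SearchResult],
    start: date,
    end: date,
    date_from_article: Callable[[str], Optional[date]],
) -> List[Tuple[str, date]]:
    """Keep the results whose published date falls within [start, end].

    The date is taken from the search result when present; otherwise the
    fallback callback is asked to extract it from the article itself.
    """
    selected = []
    for result in results:
        published = result.published
        if published is None:
            # Fallback: the date has to come from the article page.
            published = date_from_article(result.url)
        if published is not None and start <= published <= end:
            selected.append((result.url, published))
    return selected
```

In this sketch, results whose date cannot be determined at all are skipped; whether they should instead be downloaded and flagged is a decision the brief leaves open.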
Run Schedule: In phase 1, the scraper will run every 24 hours.
The library will be run against the 500 stocks in the database, for as many sources as can be implemented.
On the first run, the scraper will extract all articles published since the beginning of the year. On subsequent runs, it will download only articles published since the last run.
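One way to compute the date range for each run is sketched below. The function name `scrape_window` and the idea of passing in a stored `last_run` timestamp are assumptions made for illustration; the brief does not say how the last run time is persisted.

```python
from datetime import datetime, timezone
from typing import Optional, Tuple


def scrape_window(
    now: datetime, last_run: Optional[datetime]
) -> Tuple[datetime, datetime]:
    """Return the (start, end) range of publication times to scrape.

    First run (last_run is None): from January 1 of the current year.
    Later runs: from the previous run's timestamp up to now.
    """
    if last_run is None:
        start = datetime(now.year, 1, 1, tzinfo=now.tzinfo)
    else:
        start = last_run
    return start, now


# Example: a first run, then the next daily run.
first = scrape_window(datetime(2020, 6, 15, 2, 0, tzinfo=timezone.utc), None)
second = scrape_window(datetime(2020, 6, 16, 2, 0, tzinfo=timezone.utc), first[1])
```

Using the previous run's end time as the next start, rather than a fixed 24-hour offset, means a delayed or skipped cron run does not leave a gap in coverage.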