I need to harvest data from a web site on a daily basis and need a program to do the work. I currently do it with a program I wrote that runs under Windows, but I need something more robust so that I can run it more often. It must run on Windows XP Pro SP3 (32-bit).
The project is simple: download some HTML files from a web site, extract the data in them, and write it to a standard delimited ASCII file.
The program will go to a web site and fetch all the HTML pages that are available. It will be run every day and will download about 100,000 HTML pages. Each HTML page has data formatted for display purposes. There are three different formats throughout the 100,000 pages.
The data contained in the HTML files must be converted to a standard delimited ASCII format. So if the program downloads 100,000 HTML pages with 60 rows of data per page, the ASCII file should have 6 million lines, one per record of information. Header information must be repeated on each line for each record.
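To illustrate the conversion described above, here is a minimal sketch in Python using only the standard library. The page layout is an assumption made for the example: the header is taken from an `<h1>` element and each record from the `<td>` cells of a `<tr>` row. The real site has three formats that are not shown here, so each would need its own parsing rules. The names `TableRows`, `page_to_lines` and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect <h1> text as the header and <td> cell text grouped by <tr> row.

    Assumed layout for illustration only; the real pages may differ.
    """

    def __init__(self):
        super().__init__()
        self.header = ""
        self.rows = []
        self._row = None       # cells of the row currently being read
        self._in_cell = False
        self._in_h1 = False

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._in_h1 = True
        elif tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False
        elif tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self._in_cell = False
            if self._row:      # skip rows that had no <td> cells
                self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_h1:
            self.header += data.strip()
        elif self._in_cell and self._row:
            self._row[-1] += data.strip()


def page_to_lines(html, delimiter=";"):
    """Return one delimited line per record, with the header repeated on each."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return [delimiter.join([parser.header] + row) for row in parser.rows]


# Made-up sample page, not taken from the real site.
SAMPLE = (
    "<html><body><h1>Station A</h1><table>"
    "<tr><td>2009/01/01</td><td>12.5</td></tr>"
    "<tr><td>2009/01/02</td><td>13.0</td></tr>"
    "</table></body></html>"
)
lines = page_to_lines(SAMPLE)
```

With the sample page, `lines` holds two records, each starting with the repeated header. The sketch does not escape a delimiter that appears inside a cell, so a real implementation would have to decide how to handle that case.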
I do not need a user interface.
The program must run from the command line with the following parameters:
[url removed, login to view] delimiter=; outputfile=[url removed, login to view] startdate=YY/MM/DD enddate=YY/MM/DD url=[url removed, login to view]
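As a rough sketch of how such `name=value` parameters could be read, the Python snippet below splits each argument at the first `=` and converts the two dates from the YY/MM/DD form. The function name `parse_args` and the example values (file name, dates, URL) are placeholders invented for illustration, since the real ones are not visible in this posting.

```python
from datetime import datetime


def parse_args(argv):
    """Turn a list of name=value arguments into a dictionary of options."""
    options = {}
    for arg in argv:
        # Split at the first "=" only, so a URL containing "=" stays intact.
        key, sep, value = arg.partition("=")
        if not sep or not key:
            raise ValueError("expected name=value, got: " + arg)
        options[key.lower()] = value

    # Dates are given as YY/MM/DD on the command line.
    for name in ("startdate", "enddate"):
        if name in options:
            options[name] = datetime.strptime(options[name], "%y/%m/%d").date()
    return options


# Placeholder values, for illustration only.
opts = parse_args([
    "delimiter=;",
    "outputfile=out.txt",
    "startdate=09/01/01",
    "enddate=09/01/31",
    "url=http://example.com/list?page=1",
])
```

In a real program the list would come from `sys.argv[1:]` rather than being written out by hand.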
This is a very simple program to write. I do not care what language you use and I do not want the source code. If the program works well I will ask you to modify it on a regular basis.
My budget is $30. Please do not bid if you are unable to write this program for $30.
11 freelancers are bidding an average of $53 for this job
Hi, I have been working in .NET for the last 4 years. If you provide me with the exact sample format and other details in time, I will deliver it in less than 3 days.