This project is for a PHP script which downloads PDF pages from a newspaper website after logging in. Although the script is not complex at all, the project is specified in detail to avoid confusion. It should not take more than 1-2 hours for an experienced PHP developer. Please bid appropriately.
The script MUST:
1. Log in using credentials that are valid on the website.
2. Navigate to the correct date (details below).
3. Access the main page, determine the total number of pages present, and start from page number 1 or from the page specified.
4. Save the PDF file as [url removed, login to view] locally.
5. Track the last date for which all pages were downloaded successfully. If the process breaks midway, it should resume from that point.
6. If the user passes the date (e.g. 20140305) and page number (e.g. 62) as GET parameters, it should begin at that page number for that particular date, before moving on to the next date, where it should start from the beginning. If no page number is specified, it should start at page number 1.
7. The script should download all PDF files for a date and then proceed to the next date. It should save to disk the last date for which all files were downloaded successfully.
8. The script should continue until it reaches the present date and not proceed beyond that.
9. The script should output information showing the status (what date is being processed, how many files total, current page being downloaded, etc.). It does not need to show current download progress.
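Requirements 5 to 8 above can be sketched roughly as follows. This is only an illustration of the date and page ordering, not a required design: the function names, the state file, and the $pageCountFor callback (which stands in for reading the page count from the website's main page) are all assumptions made for the example.

```php
<?php
// Hypothetical helper: list the (date, page) pairs to fetch, in order.
// $pageCountFor is an assumed callback returning the number of pages
// published on a given date (on the real site this comes from the main page).
function planDownloads(string $startDate, int $startPage, string $today, callable $pageCountFor): array
{
    $plan = [];
    // '!' resets the time fields so that only the dates are compared.
    $date = DateTimeImmutable::createFromFormat('!Ymd', $startDate);
    $last = DateTimeImmutable::createFromFormat('!Ymd', $today);
    $page = $startPage;

    while ($date <= $last) {            // never go beyond the present date
        $key = $date->format('Ymd');
        $total = $pageCountFor($key);
        for (; $page <= $total; $page++) {
            $plan[] = [$key, $page];
        }
        $page = 1;                      // every later date starts at page 1
        $date = $date->modify('+1 day');
    }
    return $plan;
}

// Hypothetical state file: stores the last date whose pages were all downloaded.
function saveLastCompleteDate(string $stateFile, string $date): void
{
    file_put_contents($stateFile, $date, LOCK_EX);
}

// Returns the stored date (YYYYMMDD), or null if there is no usable state yet.
function loadLastCompleteDate(string $stateFile): ?string
{
    if (!is_file($stateFile)) {
        return null;
    }
    $date = trim((string) file_get_contents($stateFile));
    return preg_match('/^\d{8}$/', $date) === 1 ? $date : null;
}
```

In such a sketch, the script would call saveLastCompleteDate() only after the last page of a date has been written to disk, so that an interrupted run resumes from the day after the stored date.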
Additional notes:
1. You need to examine the website before bidding, so contact me privately and I will provide the website details for you to look around. Bidding without knowing the website does not make sense.
2. The end product must be a PHP command-line script that works in a DOS/Windows environment. It can use wget for the actual downloading; that’s available on Windows as a binary.
3. If there are geo-IP restrictions present on the server, you may use a proxy service to test your code, but ultimately the code should work without a proxy on a machine / server present within India.
4. You will NOT be provided a machine / server, so please don’t ask for it. We can test your script on our server if you want to ensure that it will work at our end.
5. If the session times out or the internet connection breaks, the script should log in again and continue from where it stopped.
6. The script should not hammer the website. It should provide options to limit both the rate of access and the rate of download.
7. The script should keep the session active while downloading, and take the necessary steps to avoid being restricted by the server.
8. File permissions for executing the script and saving temporary files to disk will need some changes depending on whether the script is run on CentOS or DOS, so those instructions can be part of the setup steps.
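Notes 5 and 6 above could be handled along these lines. This is a minimal sketch under stated assumptions: the function names are hypothetical, the login step is assumed to have written a Netscape-format cookie file that wget can read, and the $download and $login callbacks stand in for the site-specific code. The wget options used (--load-cookies, --limit-rate, --tries, -O) are standard GNU wget options.

```php
<?php
// Hypothetical: build a wget command that caps the download rate.
// $rateLimit uses wget's own syntax, e.g. "200k" or "1m".
function buildWgetCommand(string $url, string $target, string $cookieFile, string $rateLimit): string
{
    if (preg_match('/^\d+[kKmM]?$/', $rateLimit) !== 1) {
        throw new InvalidArgumentException('Unexpected rate limit: ' . $rateLimit);
    }
    return implode(' ', [
        'wget',
        '--load-cookies=' . escapeshellarg($cookieFile), // session cookies from login
        '--limit-rate=' . $rateLimit,                    // rate of download
        '--tries=3',
        '-O', escapeshellarg($target),
        escapeshellarg($url),
    ]);
}

// Hypothetical: run $download until it succeeds; after each failure, pause
// and call $login again in case the session expired or the connection dropped.
// The same pause can also be used between pages to limit the rate of access.
function withRelogin(callable $download, callable $login, int $maxAttempts, int $delaySeconds): bool
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        if ($download()) {
            return true;
        }
        sleep($delaySeconds);
        $login();
    }
    return false;
}
```

A caller might pass a $download closure that runs the built command with exec() and checks for exit code 0, with $delaySeconds and $rateLimit taken from command-line options.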