I need an MP3 crawler/spider that crawls websites across the web for audio files and adds them to a database along with all the information associated with each file, such as title, artist, size, bitrate, length, and source. The script should crawl efficiently and be able to index 10,000 to 50,000 MP3s daily, inserting the results into the database.
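As a rough sketch of the crawling step (not a required design), the snippet below pulls MP3 links and follow-up page links out of one fetched HTML page using only the Python standard library. The names `Mp3LinkParser` and `extract_links` are hypothetical, and it assumes MP3s can be recognised by a `.mp3` path suffix. For scale, 50,000 files per day works out to roughly 0.6 files per second on average.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class Mp3LinkParser(HTMLParser):
    """Collects MP3 URLs and ordinary page URLs from one HTML document."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.mp3_links = set()   # candidate audio files to index
        self.page_links = set()  # other pages to crawl later

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "audio", "source"):
            return
        for name, value in attrs:
            if name not in ("href", "src") or not value:
                continue
            # Resolve relative links against the page they were found on.
            url = urljoin(self.base_url, value)
            parts = urlparse(url)
            if parts.scheme not in ("http", "https"):
                continue  # skip mailto:, javascript:, etc.
            # Assumption: an MP3 is identified by its path suffix only.
            if parts.path.lower().endswith(".mp3"):
                self.mp3_links.add(url)
            elif tag == "a":
                self.page_links.add(url)


def extract_links(html, base_url):
    """Return (sorted MP3 URLs, sorted page URLs) found in the HTML text."""
    parser = Mp3LinkParser(base_url)
    parser.feed(html)
    return sorted(parser.mp3_links), sorted(parser.page_links)
```

A real crawler would also need a URL queue, duplicate detection, robots.txt handling, and per-host rate limiting, none of which are shown here.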
The database must also contain all the ID3 tag information. This has to be read remotely; we will NOT be saving audio files to the server.
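One possible way to read tags without downloading whole files is to use HTTP Range requests and parse only the bytes that hold the tag. The sketch below is an illustration under assumptions, not a full ID3 reader: `fetch_range`, `id3v2_tag_size`, and `parse_id3v1` are hypothetical helper names, and the remote server is assumed to honour the `Range` header. It works out how many leading bytes an ID3v2 tag occupies and decodes the fixed 128-byte ID3v1 trailer.

```python
import urllib.request


def fetch_range(url, range_spec, max_bytes, timeout=10):
    """Fetch part of a remote file, e.g. range_spec '0-9' or '-128'.

    max_bytes caps the read in case the server ignores the Range header
    and starts sending the whole file.
    """
    req = urllib.request.Request(url, headers={"Range": f"bytes={range_spec}"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read(max_bytes)


def id3v2_tag_size(header):
    """Return the total ID3v2 tag length in bytes, or None if absent.

    `header` is the first 10 bytes of the file. The size field is a
    28-bit "synchsafe" integer: four bytes with the top bit always clear.
    """
    if len(header) < 10 or header[:3] != b"ID3":
        return None
    size = 0
    for byte in header[6:10]:
        if byte & 0x80:
            return None  # not a valid synchsafe byte
        size = (size << 7) | byte
    # ID3v2.4 may append a 10-byte footer, signalled by flag bit 0x10.
    footer = 10 if header[3] == 4 and header[5] & 0x10 else 0
    return size + 10 + footer  # payload + 10-byte header (+ footer)


def parse_id3v1(trailer):
    """Decode the last 128 bytes of a file as an ID3v1 tag, or None."""
    if len(trailer) != 128 or trailer[:3] != b"TAG":
        return None

    def text(chunk):
        # Fields are fixed width, NUL or space padded, usually Latin-1.
        return chunk.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(trailer[3:33]),
        "artist": text(trailer[33:63]),
        "album": text(trailer[63:93]),
        "year": text(trailer[93:97]),
    }
```

In use, a crawler might call `fetch_range(url, "0-9", 10)` first, pass the result to `id3v2_tag_size`, and then request only that many bytes for frame parsing; `fetch_range(url, "-128", 128)` would supply the input for `parse_id3v1`. Decoding the individual ID3v2 frames, plus bitrate and length, is left out of this sketch.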
The script must also be able to check the links in the database and confirm the files still exist, running as a cron job during off-peak traffic hours so it won’t choke the server.
The script should be like [url removed, login to view]: very fast and efficient. Users must be able to register, change their information, and maintain a playlist to which they can add and delete songs. It would also be nice if you created a unique web design.
The script will be hosted on a high-end dedicated server, so server resources shouldn't be a concern, but I still expect the code to be well optimized, commented, indented, and easy to extend (multiple databases, more features, etc.).
Applicants with previous experience will be preferred. A portfolio or examples of previous jobs will also be required.