The script should:
1- Crawl the given webpage.
2- Parse all the URLs on the page using different regular expressions (they don't even have to start with "a href" or "http").
For example: parse all URLs with rar, zip, mp3, etc. extensions; parse all mediafire, rapidshare, etc. URLs.
3- Be able to log in, or load cookies to log in, to specific webpages such as forums to get the links.
4- Be as fast as possible and stable :).
It can be a shell script, Perl, C, etc. The important part is that it should be fast and not use many resources. Advice about platform or techniques is welcome.
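For point 3, here is a rough sketch of what cookie-based login might look like with wget. The function names, the `COOKIE_JAR` variable, and the form-body format are assumptions made up for illustration; a real forum will have its own login URL and field names.

```shell
#!/bin/sh
# Hypothetical helpers: names and the default cookie file are illustrative only.
COOKIE_JAR="${COOKIE_JAR:-cookies.txt}"

# Log in once and store the session cookies in $COOKIE_JAR.
# $1 = login URL
# $2 = URL-encoded form body, e.g. 'username=me&password=secret'
#      (the field names depend entirely on the target site)
login_and_save() {
    wget -q -O /dev/null \
        --save-cookies "$COOKIE_JAR" --keep-session-cookies \
        --post-data "$2" "$1"
}

# Fetch a page using the stored cookies and print it to stdout.
# $1 = page URL
fetch_with_cookies() {
    wget -q -O - --load-cookies "$COOKIE_JAR" "$1"
}
```

The idea would be to call `login_and_save` once per site and then reuse the cookie file for every later `fetch_with_cookies` call, so the login cost is not paid per page.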
Below is an example of how far I have gotten on my own; it still needs many improvements:
wget -q -U "Mozilla/5.0 (X11; U; Linux i686; pl-PL; rv:220.127.116.11) Gecko/20121223 Ubuntu/9.25 (jaunty) Firefox/3.8" [url removed, login to view] -e robots=off -O - | tr "\t\r\n'" ' "' | grep -i -o '"\(ht\|f\)tps\?:[^"]\+\(\.gif\|\.apk\|\.rar\|\.mkv\)"' | sed -e 's/^.*"\([^"]\+\)".*$/\1/g' | sort -u
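For point 2, a possible direction is sketched below, assuming GNU grep and HTML arriving on stdin. The function name `extract_links` and the `EXTENSIONS` and `HOSTS` lists are illustrative assumptions, not a complete solution. It does not require quotes, "a href", or "http" around a link.

```shell
#!/bin/sh
# Illustrative lists only; extend them as needed.
EXTENSIONS='rar|zip|mp3'
HOSTS='mediafire\.com|rapidshare\.com'

# Read HTML (or any text) on stdin and print unique candidate links.
extract_links() {
    # Pattern 1: a run of URL characters ending in one of the wanted extensions.
    # Pattern 2: a run of URL characters that contains one of the wanted hosts.
    grep -E -i -o \
        -e "[A-Za-z0-9_./:?=&%~+-]+\.(${EXTENSIONS})" \
        -e "[A-Za-z0-9_./:?=&%~+-]*(${HOSTS})[A-Za-z0-9_./:?=&%~+-]*" \
    | LC_ALL=C sort -u
}
```

It could be fed from any fetcher, for example `wget -q -O - "$url" | extract_links`. Because `=` counts as a URL character (it is needed for query strings), an unquoted attribute such as `href=...` would keep its `href=` prefix, so some post-filtering may still be needed.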
Thanks in advance.
17 freelancers are bidding on average $154 for this job
Obviously it's cool to write a one-liner, but having read your requirements, I don't think it's wise. Read my PM and see if you can stand all my insults. ;-)
I have good exposure to Linux (wget, curl, scripting) and C, and I can solve your problem in a maximum of 2 days and deliver it with all the features. I can also provide future support/changes free of cost.