I'm looking for a spider/crawler that automatically scans web pages and follows links. It should extract all domain names (.com/.net/.org/.info/.de/.ch/.at etc., no subdomains) and email addresses from the pages and write them to a daily log file (one log file for domain names and one for email addresses). After the crawl, an indexer should clean up the text files and keep only unique entries.
The crawler should run every night and avoid visiting the same pages each day. Mainly I'm interested in domain names from Europe, so it would be a good idea to limit the crawler to European IP addresses.
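Avoiding repeat visits across nightly runs needs a persistent "already seen" store. A minimal Python sketch using SQLite (in the real PHP/MySQL build the same idea would be a table with the URL as primary key; the class and file names here are assumptions). Restricting the crawl to Europe would additionally need a GeoIP lookup per host, which is not shown:

```python
import sqlite3

class SeenUrls:
    """Tiny persistent set of visited URLs, shared across nightly runs."""

    def __init__(self, path: str = "seen-urls.db"):
        self.conn = sqlite3.connect(path)
        self.conn.execute("CREATE TABLE IF NOT EXISTS seen (url TEXT PRIMARY KEY)")

    def should_visit(self, url: str) -> bool:
        """True the first time a URL is offered, False on later nights."""
        try:
            with self.conn:
                self.conn.execute("INSERT INTO seen (url) VALUES (?)", (url,))
            return True
        except sqlite3.IntegrityError:
            # Primary-key violation: URL was already recorded.
            return False
```

The insert-and-catch pattern makes the check and the recording one atomic step, so the crawler cannot queue the same page twice even if it encounters it via two different links.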
I found a piece of software called PHPCrawl ([url removed, login to view]) and would like a modification of it.
The software should run on a LAMP system.
Thanks for your offers.