Do you know what "ghosting" means? If so, please keep reading. If not, then this project is not for you.
We are a legitimate company [url removed, login to view] that is hiring people all across the country for real jobs. We have been posting real job ads to find employees in other cities. Sometimes we put up a job posting and it then gets ghosted. The alternatives ([url removed, login to view], LinkedIn, etc.) are either expensive or do not get us the results we need. We find that classified postings give us the best candidates, so we need to keep using them.
So we want to create a program that helps us identify, in near real time, whether an ad has been ghosted. The program takes a link and tries to find out whether the ad appears live on the major classifieds sites. We also need an interface that we can watch to see a status report. The GUI will have a way to enter a link, and each link entered is added to a list of every ad we post. Each link's status is either Red or Green: Red means "does not appear or has been flagged" and Green means "live". The application would refresh the results periodically.
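The Red/Green rule described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the `REMOVED_MARKERS` phrases, and the idea of comparing the ad against a listing or search-results page are all hypothetical, and real classifieds sites word their removal notices and build their listing links differently. The listing check is included because a ghosted ad may still load from its direct link while being absent from the public listings.

```python
import urllib.request
from urllib.parse import urlparse

# Hypothetical phrases; each real site words its removal notice differently.
REMOVED_MARKERS = (
    "this posting has been flagged",
    "this posting has been deleted",
)


def ad_status(ad_url, ad_status_code, ad_body, listing_html):
    """Return "Green" if the ad looks live, otherwise "Red"."""
    # The ad page itself must load successfully.
    if ad_status_code != 200:
        return "Red"
    # The ad page must not carry a removal or flag notice.
    body = ad_body.lower()
    if any(marker in body for marker in REMOVED_MARKERS):
        return "Red"
    # The ad must also be linked from the listing page. Matching on the
    # URL path lets the check work for relative links too.
    path = urlparse(ad_url).path
    if not path or path not in listing_html:
        return "Red"
    return "Green"


def fetch(url, timeout=10):
    """Fetch a page and return (status_code, body); (0, "") on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        # Covers HTTP errors, network failures, timeouts, and bad URLs.
        return 0, ""


def check_ad(ad_url, listing_url):
    """Fetch the ad and a listing page, then apply the Red/Green rule."""
    code, body = fetch(ad_url)
    _listing_code, listing_html = fetch(listing_url)
    return ad_status(ad_url, code, body, listing_html)
```

Keeping `ad_status` free of network calls means the rule can be checked against saved pages, and a scheduler or GUI timer can call `check_ad` for each stored link to refresh the list periodically.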
For this project, I would like to pay once you can demonstrate that the program works. I don't want to pay for something that doesn't provide the status as I described.
You must have the following skills:
- Be able to fetch web pages over HTTP
- Be able to parse the HTML for links
- Be able to determine the locations of new pages based on relative links
- Be able to ignore types of links you are not interested in, or that are malformed in some way.
- Have some kind of data structure to remember which pages have been fetched, when, and their contents.
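As a rough illustration of how the requirements above fit together, here is a minimal sketch using only the Python standard library. The names `LinkParser`, `extract_links`, `PageRecord`, and `PageStore` are made up for this example, and the in-memory dictionary is an assumption; a real tool would more likely keep its records in a database.

```python
import urllib.request
from dataclasses import dataclass
from datetime import datetime, timezone
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect the raw href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(base_url, html):
    """Return absolute http(s) links found in html, without duplicates."""
    parser = LinkParser()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        try:
            # Resolve relative links against the page URL, drop "#fragment".
            absolute, _fragment = urldefrag(urljoin(base_url, href.strip()))
            parts = urlparse(absolute)
        except ValueError:
            continue  # malformed link, skip it
        # Ignore mailto:, javascript:, and other non-web link types.
        if parts.scheme in ("http", "https") and parts.netloc:
            if absolute not in links:
                links.append(absolute)
    return links


@dataclass
class PageRecord:
    url: str
    fetched_at: datetime
    content: str


class PageStore:
    """Remember which pages have been fetched, when, and their contents."""

    def __init__(self):
        self._pages = {}

    def save(self, url, content):
        record = PageRecord(url, datetime.now(timezone.utc), content)
        self._pages[url] = record
        return record

    def get(self, url):
        return self._pages.get(url)

    def __contains__(self, url):
        return url in self._pages


def fetch(url, timeout=10):
    """Fetch a web page over HTTP and return its text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

A crawl step would then call `fetch`, pass the result to `store.save(url, html)`, and feed `extract_links(url, html)` back into the queue of pages still to visit, skipping any URL already in the store.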
I've made tons of web scrapers and have even completed similar projects at [url removed, login to view] (you can find them on my profile), so I am ready to deliver a scraper to you. But I have some questions. Please see PMB for those.