Closed

Web Page Downloader/Parser

First of all: this must be written in ANSI C that compiles with GCC and is cross-platform.

We need a function that takes a web URL and downloads the page's HTML contents. (It should not download any pictures or other external files.) It should then derive a title, description, and keywords from the meta tags. If there are no meta tags, the title, keywords, and description should be worked out the way Google or Yahoo would, ignoring common words such as 'a', 'the', and many others. It should also drop words that are repeated too many times (more than 7, I think). It should also attempt to determine when the page was last modified; if it can't, it should compare it with an internal date in the database and store in the database only if newer. The URL, title, description, and keywords should be saved in a database called "sites.dat" using a database function we have had developed for us.
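The keyword rule above (skip common words, drop anything repeated more than 7 times) could be sketched roughly as follows. This is only an illustration under stated assumptions: the stop list is a tiny placeholder, the table sizes are arbitrary, and the names `count_words`, `is_keyword`, and `struct word_count` are hypothetical rather than part of any existing code.

```c
#include <ctype.h>
#include <string.h>

#define MAX_WORDS    256  /* assumed cap on distinct words per page */
#define MAX_WORD_LEN  32  /* assumed cap; longer words are truncated */
#define MAX_REPEATS    7  /* threshold from the description above */

struct word_count {
    char word[MAX_WORD_LEN];
    int  count;
};

/* Tiny illustrative stop list; a real one would be much longer. */
static const char *stop_words[] = {
    "a", "an", "the", "and", "or", "of", "to", "in", "is", "it"
};

static int is_stop_word(const char *w)
{
    size_t i;
    for (i = 0; i < sizeof stop_words / sizeof stop_words[0]; i++)
        if (strcmp(w, stop_words[i]) == 0)
            return 1;
    return 0;
}

/* Adds one lower-cased word to the table, or bumps its count. */
static void add_word(struct word_count *tab, int *n, const char *w)
{
    int i;
    for (i = 0; i < *n; i++) {
        if (strcmp(tab[i].word, w) == 0) {
            tab[i].count++;
            return;
        }
    }
    if (*n < MAX_WORDS) {
        strcpy(tab[*n].word, w);  /* w never exceeds MAX_WORD_LEN - 1 chars */
        tab[*n].count = 1;
        (*n)++;
    }
}

/* Splits text into lower-cased words, skips stop words, fills tab
   (which must hold MAX_WORDS entries) and returns the number of
   distinct words stored. */
int count_words(const char *text, struct word_count *tab)
{
    char buf[MAX_WORD_LEN];
    size_t len = 0;
    int n = 0;
    const char *p;

    for (p = text; ; p++) {
        if (*p != '\0' && isalnum((unsigned char)*p)) {
            if (len < MAX_WORD_LEN - 1)
                buf[len++] = (char)tolower((unsigned char)*p);
        } else {
            if (len > 0) {
                buf[len] = '\0';
                if (!is_stop_word(buf))
                    add_word(tab, &n, buf);
                len = 0;
            }
            if (*p == '\0')
                break;
        }
    }
    return n;
}

/* A word qualifies as a keyword only if it is not over-repeated. */
int is_keyword(const struct word_count *wc)
{
    return wc->count <= MAX_REPEATS;
}
```

A caller would run `count_words` over the page text with the tags already stripped, then keep only the entries for which `is_keyword` returns non-zero. The linear table lookup is chosen for brevity; a hash table would suit large pages better.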

Whenever it receives a 301 response (or any other kind of redirect), it should follow the link and then update the URL that was passed in.

If there is a 404 or any other error that prevents the page from being downloaded, it should return all blank values.
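The redirect and error rules above might be handled along these lines. This is a sketch, assuming the raw response headers are already in a NUL-terminated buffer; `parse_status_code`, `classify_status`, `get_location`, and the `fetch_action` enum are hypothetical names, and the set of status codes treated as redirects is an assumption about what "any other redirect method" should cover.

```c
#include <ctype.h>
#include <string.h>

enum fetch_action { FETCH_OK, FETCH_REDIRECT, FETCH_FAIL };

/* Parses "HTTP/1.x NNN ..." and returns NNN, or -1 if malformed. */
int parse_status_code(const char *response)
{
    const char *p;

    if (strncmp(response, "HTTP/", 5) != 0)
        return -1;
    p = strchr(response, ' ');
    if (p == NULL)
        return -1;
    while (*p == ' ')
        p++;
    if (!isdigit((unsigned char)p[0]) || !isdigit((unsigned char)p[1]) ||
        !isdigit((unsigned char)p[2]))
        return -1;
    return (p[0] - '0') * 100 + (p[1] - '0') * 10 + (p[2] - '0');
}

/* Maps a status code to what the crawler should do next. */
enum fetch_action classify_status(int code)
{
    if (code >= 200 && code < 300)
        return FETCH_OK;
    if (code == 301 || code == 302 || code == 303 ||
        code == 307 || code == 308)
        return FETCH_REDIRECT;
    return FETCH_FAIL;  /* 404 and everything else: return blank values */
}

/* Case-insensitive prefix test; ANSI C has no strncasecmp. */
static int starts_with_nocase(const char *s, const char *prefix)
{
    while (*prefix != '\0') {
        if (tolower((unsigned char)*s) != tolower((unsigned char)*prefix))
            return 0;
        s++;
        prefix++;
    }
    return 1;
}

/* Copies the Location header value into out.
   Returns 1 if found and it fits, 0 otherwise. */
int get_location(const char *headers, char *out, size_t out_size)
{
    const char *line = headers;
    size_t n;

    while (line != NULL && *line != '\0') {
        if (*line == '\r' || *line == '\n')
            break;  /* blank line: end of headers, do not scan the body */
        if (starts_with_nocase(line, "location:")) {
            line += 9;
            while (*line == ' ' || *line == '\t')
                line++;
            n = strcspn(line, "\r\n");
            if (n >= out_size)
                return 0;
            memcpy(out, line, n);
            out[n] = '\0';
            return 1;
        }
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return 0;
}
```

On `FETCH_REDIRECT` the caller would fetch the URL returned by `get_location` and overwrite the URL it was given; a cap on the number of hops is worth adding so that a redirect loop cannot hang the crawler.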

Any links it finds should be stored in the file "links.dat" using a database function that we are having developed.
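Link discovery could start from something like the sketch below, which walks the HTML and pulls out `href` values one at a time. The function name `next_link` is hypothetical, and the database call that would store each link in "links.dat" is left out because its interface is not described here. The matcher is deliberately loose (it accepts quoted and unquoted values and any letter case), so it may also pick up the text "href=" outside of a tag.

```c
#include <ctype.h>
#include <string.h>

/* Finds the next href attribute at or after p. On success, copies its
   value into out and returns a pointer just past the value; returns
   NULL when no further link is found. Values that do not fit in out
   are skipped. */
const char *next_link(const char *p, char *out, size_t out_size)
{
    const char *v;
    char quote;
    size_t n;

    for (; *p != '\0'; p++) {
        /* Short-circuit evaluation keeps us from reading past the NUL. */
        if (tolower((unsigned char)p[0]) != 'h' ||
            tolower((unsigned char)p[1]) != 'r' ||
            tolower((unsigned char)p[2]) != 'e' ||
            tolower((unsigned char)p[3]) != 'f')
            continue;

        v = p + 4;
        while (isspace((unsigned char)*v))
            v++;
        if (*v != '=')
            continue;
        v++;
        while (isspace((unsigned char)*v))
            v++;

        quote = 0;
        if (*v == '"' || *v == '\'')
            quote = *v++;

        /* Quoted value: read to the matching quote.
           Unquoted value: read to whitespace or '>'. */
        n = 0;
        while (v[n] != '\0' &&
               (quote ? v[n] != quote
                      : (!isspace((unsigned char)v[n]) && v[n] != '>')))
            n++;

        if (n > 0 && n < out_size) {
            memcpy(out, v, n);
            out[n] = '\0';
            return v + n;
        }
    }
    return NULL;
}
```

A caller would loop with `p = next_link(p, url, sizeof url)` until it returns NULL, resolving relative links against the page URL before storing them.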

This function should obey all robots meta tags, as well as [url removed, login to view] files.
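For the robots meta tag part of this requirement, one possible approach is to parse the tag's `content` value into a small policy struct. The sketch below assumes the `content` string has already been extracted from the tag; `struct robots_policy` and `parse_robots_content` are hypothetical names, and only the widely used `noindex`, `nofollow`, and `none` directives are recognised. It does not cover the file-based rules mentioned in the same sentence.

```c
#include <ctype.h>
#include <string.h>

struct robots_policy {
    int may_index;   /* 0 if the page asked not to be indexed */
    int may_follow;  /* 0 if the page asked that links not be followed */
};

/* Compares a token of given length with a lower-case directive name,
   ignoring the token's letter case. */
static int token_equals(const char *tok, size_t len, const char *name)
{
    size_t i;

    if (strlen(name) != len)
        return 0;
    for (i = 0; i < len; i++)
        if (tolower((unsigned char)tok[i]) != name[i])
            return 0;
    return 1;
}

/* Parses a value such as "noindex, nofollow". Anything not explicitly
   forbidden stays allowed. */
struct robots_policy parse_robots_content(const char *content)
{
    struct robots_policy p;
    const char *s = content;
    size_t len;

    p.may_index = 1;
    p.may_follow = 1;

    while (*s != '\0') {
        /* Skip separators, then measure the next token. */
        while (*s == ',' || isspace((unsigned char)*s))
            s++;
        len = 0;
        while (s[len] != '\0' && s[len] != ',' &&
               !isspace((unsigned char)s[len]))
            len++;

        if (token_equals(s, len, "noindex")) {
            p.may_index = 0;
        } else if (token_equals(s, len, "nofollow")) {
            p.may_follow = 0;
        } else if (token_equals(s, len, "none")) {
            p.may_index = 0;
            p.may_follow = 0;
        }
        s += len;
    }
    return p;
}
```

With this in place, the crawler would skip the "sites.dat" update when `may_index` is 0 and skip storing links when `may_follow` is 0.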

When coding this, be aware that not all sites have perfect HTML; some tags will be wrong or full of errors. Expect this function to encounter badly formed HTML.
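As an illustration of what tolerating bad HTML can mean in practice, the sketch below extracts the page title while accepting any letter case, attributes inside the tag, and a missing closing tag. The name `extract_title` is hypothetical, and stopping at the next `<` when `</title>` is absent is one possible recovery strategy, not the only one.

```c
#include <ctype.h>
#include <string.h>

/* Case-insensitive substring search; ANSI C has no strcasestr.
   needle must be given in lower case. */
static const char *find_nocase(const char *hay, const char *needle)
{
    size_t i;

    for (; *hay != '\0'; hay++) {
        for (i = 0; needle[i] != '\0'; i++)
            if (tolower((unsigned char)hay[i]) != needle[i])
                break;
        if (needle[i] == '\0')
            return hay;
    }
    return NULL;
}

/* Copies the trimmed title text into out (truncating if needed).
   Returns 1 if a non-empty title was found, 0 otherwise; out is
   always left as a valid string when out_size > 0. */
int extract_title(const char *html, char *out, size_t out_size)
{
    const char *start, *end;
    size_t n;

    if (out_size == 0)
        return 0;
    out[0] = '\0';

    start = find_nocase(html, "<title");
    if (start == NULL)
        return 0;
    start = strchr(start, '>');  /* skip any attributes in the tag */
    if (start == NULL)
        return 0;
    start++;

    end = find_nocase(start, "</title");
    if (end == NULL)
        end = strchr(start, '<');  /* no closing tag: stop at next tag */
    if (end == NULL)
        end = start + strlen(start);

    /* Trim surrounding whitespace. */
    while (start < end && isspace((unsigned char)*start))
        start++;
    while (end > start && isspace((unsigned char)end[-1]))
        end--;

    n = (size_t)(end - start);
    if (n >= out_size)
        n = out_size - 1;
    memcpy(out, start, n);
    out[n] = '\0';
    return n > 0;
}
```

For example, both `<TiTle id=x> Hello </TITLE>` and `<title>Hello<body>` yield "Hello". The same pattern (case-insensitive search plus a fallback terminator) would extend to meta tags.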

In most cases, this should behave no differently from Googlebot. However, when downloading a page it should identify itself as 'dCrawler'.
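Identifying as 'dCrawler' comes down to the User-Agent header of the HTTP request. A minimal sketch of building such a request is shown below; `build_request` is a hypothetical name, HTTP/1.0 is chosen only to keep the response simple (no chunked encoding), and the `Accept` header is merely a hint to the server, not a guarantee that only HTML comes back.

```c
#include <stdio.h>
#include <string.h>

#define USER_AGENT "dCrawler"  /* name required by the description above */

/* Writes a GET request for path on host into out.
   Returns the request length, or -1 if out is too small. */
int build_request(const char *host, const char *path,
                  char *out, size_t out_size)
{
    static const char fmt[] =
        "GET %s HTTP/1.0\r\n"
        "Host: %s\r\n"
        "User-Agent: " USER_AGENT "\r\n"
        "Accept: text/html\r\n"
        "\r\n";
    /* Upper bound: sizeof fmt still counts both "%s" and the NUL. */
    size_t needed = sizeof fmt + strlen(host) + strlen(path);

    if (needed > out_size)
        return -1;
    return sprintf(out, fmt, path, host);
}
```

The size check is done by hand because `snprintf` is not part of ANSI C (C89). The resulting buffer would be sent over a socket connected to `host` on port 80.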

Skills: C Programming

See more: web page downloader, downloader parser, yahoo first page, web page errors, errors on page, ansi c, web page download, google web page, web html, web developed, web database, web c, us web, though, page, meta robot, like page, it web, identify, html parser, gcc, figure out, common, cases, C# web

About the employer:
(3 reviews) Brantford, Canada

Project ID: #15679

11 freelancers are bidding an average of $251 for this job

DougRoyer

I have similar code now - see PM

$250 USD in 5 days
(4 reviews)
6.0
ccpplinux

Hi, crawling is our first choice. We have developed many crawlers in PHP/MySQL and are confident that we can also develop a crawler in C/C++ in a GNU/Linux environment. For demo and discussion please se…

$150 USD in 15 days
(6 reviews)
3.6
Kami

Please check PM for more details.

$300 USD in 21 days
(1 review)
4.2
ralph78fr

I have already worked on a similar project (downloading/smart parsing). I may have to tune my code, since it ran under Windows and was written in C++. Still, I have allowed 10 days in order to have time to test the app completely & carefull…

$250 USD in 10 days
(0 reviews)
0.0
bid5

We could do it.

$300 USD in 5 days
(0 reviews)
0.0
nidle

Dear Sir/Madam, if you are looking for top quality and quick turnaround, then we will be delighted to work up the required downloader for you. We are an IT company specializing in web technologies and programming.

$300 USD in 15 days
(0 reviews)
4.6
anshumesh

We are a web development, search engine optimisation, and BPO company from India. Kindly go through our URL [url removed, login to view]. We are interested in your project. Thanks. Regards, Anshu

$300 USD in 20 days
(0 reviews)
0.0
niftysoft

Hi there, Niftysoft Solution is a leading IT services company providing solutions across the globe. Niftysoft Solution is staffed by a large team of highly professional people with a strong background in the IT field and having…

$290 USD in 15 days
(0 reviews)
0.0
starsoft82

Dear sir, I will complete this program within 15 days to suit all your requirements. Thank you.

$125 USD in 15 days
(0 reviews)
3.8
inteconssoftware

We are a group of software professionals from India with expertise in ASP, ASPx, HTML, XML, Java, C, C++, VB, Oracle, SQL Server, PHP, and MySQL. Our professionals range from 1 to 20 years of experience. We are sure to…

$250 USD in 15 days
(0 reviews)
0.0
kumartinku

Dear Sir/Madam, we are a group of software engineers with expertise in web technology, Windows desktop application development, security, and mobile technologies. Recently we have developed a project in which we are pars…

$250 USD in 15 days
(0 reviews)
0.0