In progress

Simple web data extractor in PHP (cURL)

We would like a PHP file that extracts data from websites. I suppose you should use cURL, but I'm not an expert.

Steps for php:

1- The PHP script must connect to a remote MySQL server configured in a [url removed, login to view]

2- It must find the first unprocessed record (lastchange = NULL) and block it, to prevent it from being used by another PHP process.

3- It must do the work described below.

4- It must write the results to the MySQL table and unblock the record.

Process for php:

1- Visit a URL and extract from the home page the meta tags `Title` + `description` + `keywords`, plus `date_of_html`.

2- Spider the first 10 links found on the home page (internal links only, not external) to extract emails, phones, and fax numbers.
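Step 2 needs a way to tell internal links from external ones. The following is only a minimal sketch of one way to do it, not a required design. It assumes the home page HTML has already been downloaded (for example with cURL), that "internal" means "same host as the home page URL", and that relative links can be resolved naively against the site root. The function name `splitLinks` and its parameters are hypothetical.

```php
<?php
// Hypothetical helper: split the links on a page into internal (same host)
// and external ones. Assumes $baseUrl is the site's home page URL.
function splitLinks(string $html, string $baseUrl, int $maxInternal = 10): array
{
    $internal = [];
    $external = [];
    if (trim($html) === '') {
        return ['internal' => [], 'external' => []];
    }

    $baseHost = strtolower((string) parse_url($baseUrl, PHP_URL_HOST));

    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML, so collect parser warnings quietly.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = trim($a->getAttribute('href'));
        // Skip empty links, in-page anchors, and non-HTTP schemes.
        if ($href === '' || $href[0] === '#'
            || preg_match('/^(mailto|tel|javascript):/i', $href)) {
            continue;
        }
        $host = parse_url($href, PHP_URL_HOST);
        if ($host === null || $host === false) {
            // No host: treat as a relative link (naive resolution against the root).
            $internal[rtrim($baseUrl, '/') . '/' . ltrim($href, '/')] = true;
        } elseif (strtolower($host) === $baseHost) {
            $internal[$href] = true;
        } else {
            $external[$href] = true;
        }
    }

    return [
        // Array keys are used to drop duplicates; keep only the first N internal links.
        'internal' => array_slice(array_keys($internal), 0, $maxInternal),
        'external' => array_keys($external),
    ];
}
```

Note that this comparison treats `www.example.com` and `example.com` as different hosts; whether that is acceptable would need to be agreed on.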

After sorting the extracted information, update the MySQL record, unblock it, and start on a new URL from the table.

All outbound links and extra emails found during the process will be added to Mysql2.
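As a rough illustration of how emails and phone numbers could be collected from page text, here is a small sketch based on regular expressions. The function name `extractContacts` is hypothetical and the patterns are deliberately loose assumptions: they will miss some formats and produce some false positives. Telling a fax number from a phone number would require looking at nearby labels such as "Fax:", which this sketch does not attempt.

```php
<?php
// Hypothetical helper: collect email addresses and phone-like numbers from text.
function extractContacts(string $text): array
{
    // Simple email pattern; not a full RFC 5322 validator.
    preg_match_all('/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/', $text, $m);
    $emails = array_values(array_unique(array_map('strtolower', $m[0])));

    // Loose phone pattern: optional "+", then at least 7 characters that
    // start and end with a digit and may contain spaces, dots, dashes, parentheses.
    preg_match_all('/\+?\(?\d[\d\s().-]{5,}\d/', $text, $p);
    $phones = array_values(array_unique(array_map('trim', $p[0])));

    return ['emails' => $emails, 'phones' => $phones];
}
```

Lowercasing and de-duplicating the emails before inserting them matters here, since the `url` table declares a unique key on `email`.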

To avoid eating up resources, the PHP script should free memory (or something similar) after each process.

If no records are found, an alert should be sent by email to an administrator so that new records can be added to the MySQL table.

--> Mysql1 for url is as attached in .sql <--

CREATE TABLE `url` (
`codigo` int(11) NOT NULL auto_increment,
`email` varchar(50) default NULL,
`origendeldato` varchar(30) default NULL,
`url` varchar(50) NOT NULL,
`Title` varchar(250) default NULL,
`description` varchar(250) default NULL,
`keywords` varchar(250) default NULL,
`telefono` varchar(20) NOT NULL,
`fax` varchar(20) default NULL,
`pais` char(2) default NULL,
`empresa` varchar(50) default NULL,
`nombre` varchar(50) default NULL,
`rubro` varchar(20) default NULL,
`lastchange` timestamp NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
PRIMARY KEY (`codigo`),
UNIQUE KEY `base` (`base`),
UNIQUE KEY `email` (`email`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

--> Mysql2 for extraemails <--

`email` varchar(50) default NULL,

`url` varchar(50) NOT NULL,

When I said "`Title` + `description` + `keywords`", I meant the HTML meta tags.
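To make the meta tag requirement concrete, here is a minimal sketch of reading the page title and the `description` and `keywords` meta tags from an HTML string with PHP's DOM extension. It assumes the HTML was already fetched (for example with cURL); the function name `extractMeta` is hypothetical. `date_of_html` is left out because the posting does not say where that value should come from.

```php
<?php
// Hypothetical helper: read <title> plus the description and keywords
// meta tags from an HTML string.
function extractMeta(string $html): array
{
    $result = ['title' => null, 'description' => null, 'keywords' => null];
    if (trim($html) === '') {
        return $result;
    }

    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML, so collect parser warnings quietly.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $titles = $doc->getElementsByTagName('title');
    if ($titles->length > 0) {
        $result['title'] = trim($titles->item(0)->textContent);
    }

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        // Meta names are matched case-insensitively ("Description", "KEYWORDS", ...).
        $name = strtolower($meta->getAttribute('name'));
        if ($name === 'description' || $name === 'keywords') {
            $result[$name] = trim($meta->getAttribute('content'));
        }
    }

    return $result;
}
```

Values longer than 250 characters would need truncating before being stored, given the `varchar(250)` columns in the `url` table.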

Skills: Apache, C++ Programming, Data Processing, Linux, PHP


About the employer:
(13 reviews) Buenos Aires, Argentina

Project ID: #1004781

Awarded to:

SigmaVisual

We can help in your project, please check PMB and our ratings/reviews to get idea of our experience.

$100 USD in 5 days
(239 reviews)
7.8

5 freelancers are bidding an average of $132 for this job

srinichal

I can deliver this asap

$180 USD in 4 days
(92 reviews)
6.9
wildlily980

I can do this based on php

$70 USD in 2 days
(41 reviews)
5.9
dracco

see your pmb please

$50 USD in 1 day
(135 reviews)
5.8
getndone

Please check pm

$250 USD in 5 days
(37 reviews)
5.4
bobbyrock

Hello, I'm ready to do [url removed, login to view] check pm.

$60 USD in 1 day
(4 reviews)
2.9