
PHP website crawler and information extraction

$30-250 USD

In progress
Posted over 14 years ago

Paid on delivery
Hi,

I need a crawler script that reads URLs from a website and scans the websites behind them. The information I need from those websites is:

- Mister/Miss (Herr/Frau)
- Forename
- Name
- Position
- Name of organization
- Street
- Address
- Phone and fax
- Email
- Website
- Link title leading to that website

The crawler should look for this information first under "contact" and then under "disclaimer". It is also possible that the crawler finds an intro page, which it has to skip. If there are several data records on one webpage, they should be saved in the same line. The output must be a CSV or Excel file.

Because these websites are in German, the relevant words are:

- contact -> Kontakt
- disclaimer -> Impressum

Furthermore, the crawler should recognize whether there is a position description for the contact person, for example "Stadtwehrführer", "Kommandant" or "Stadtwehrleiter". A word matching "*leiter" or "*führer", for example, indicates a position. The crawler should also recognize the name of the organization; "Feuerwehr" is the indicator.

An example:

________________________________________
Verantwortlich: Feuerwehr Bitterfeld-Wolfen (name of organization)
PD-Chemiemark Areal A, Geb. 046 Ortsteil Wolfen
06766 Bitterfeld-Wolfen (postal code + city; postal codes have 5 digits in Germany)
Vertreten durch: Herr Uwe Wagner (Mister Forename Name)
Stadtwehrleiter (position)
Kontakt: Telefon+49 (0) 03494 6660564 (phone number)
E-Mail: abcd(at)[login to view URL] (needs an intelligent scan; a correct email address is the most important thing)
____________________________________________________

The list of links can be found here: [login to view URL]

Besides a CSV file with all extracted data, I need the script itself so that I can modify and tune it a little afterwards. All phone and fax numbers have to be in the same format!

If you need further information, don't hesitate to contact me.

Best regards
Sebastian

PS: I attached an example file of "FF Bad Waldsee" in "Baden-Württ.".
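The extraction rules described in the posting (the "Feuerwehr" indicator, "*leiter"/"*führer" positions, 5-digit postal codes, "(at)" e-mail obfuscation, one uniform phone format) could be sketched roughly as follows. This is only an illustration in Python, not the PHP script the posting asks for. The function names, the field list, the "+49 <digits>" target format, the line breaks in the sample text, and the domain `example.org` (standing in for the elided e-mail domain) are all assumptions. Street extraction, crawling, and the Kontakt/Impressum link lookup are left out.

```python
import csv
import io
import re

# Hypothetical sample modelled on the posting's example; line breaks and
# the domain "example.org" are assumptions, not taken from the source.
SAMPLE = """Verantwortlich:
Feuerwehr Bitterfeld-Wolfen
PD-Chemiemark Areal A, Geb. 046
Ortsteil Wolfen
06766 Bitterfeld-Wolfen
Vertreten durch:
Herr Uwe Wagner
Stadtwehrleiter
Kontakt:
Telefon+49 (0) 03494 6660564
E-Mail: abcd(at)example.org
"""

FIELDS = ["salutation", "forename", "name", "position", "organization",
          "postal_code", "city", "phone", "fax", "email"]

PHONE_CHARS = r"([+\d][\d \t()/.-]{5,})"


def deobfuscate_email(text):
    """Return the first e-mail address, undoing '(at)'/'[at]'/'(dot)' masking."""
    cleaned = re.sub(r"\s*[\(\[]\s*at\s*[\)\]]\s*", "@", text, flags=re.I)
    cleaned = re.sub(r"\s*[\(\[]\s*(?:dot|punkt)\s*[\)\]]\s*", ".", cleaned, flags=re.I)
    match = re.search(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", cleaned)
    return match.group(0) if match else ""


def normalize_phone(raw, country_code="49"):
    """Bring a German number into one assumed format: '+49 <digits>'."""
    raw = re.sub(r"\(\s*0\s*\)", "", raw).strip()   # drop the '(0)' trunk marker
    digits = re.sub(r"\D", "", raw)
    if raw.startswith("+") and digits.startswith(country_code):
        digits = digits[len(country_code):]
    elif digits.startswith("00" + country_code):
        digits = digits[2 + len(country_code):]
    national = digits.lstrip("0")                    # remove leading trunk zeros
    return f"+{country_code} {national}" if national else ""


def extract_record(text):
    """Pull the posting's fields out of one Impressum/Kontakt text block."""
    record = dict.fromkeys(FIELDS, "")
    org = re.search(r"^.*Feuerwehr.*$", text, re.M | re.I)
    if org:
        record["organization"] = org.group(0).strip()
    place = re.search(r"^(\d{5})[ \t]+(\S.*)$", text, re.M)
    if place:
        record["postal_code"], record["city"] = place.group(1), place.group(2).strip()
    person = re.search(r"^(Herr|Frau)[ \t]+(\S+)[ \t]+(\S.*)$", text, re.M)
    if person:
        record["salutation"], record["forename"] = person.group(1), person.group(2)
        record["name"] = person.group(3).strip()
    position = re.search(r"\b\w*(?:leiter|führer|kommandant)(?:in)?\b", text, re.I)
    if position:
        record["position"] = position.group(0)
    phone = re.search(r"(?:Telefon|Tel\.?|Fon)[ \t]*:?[ \t]*" + PHONE_CHARS, text, re.I)
    if phone:
        record["phone"] = normalize_phone(phone.group(1))
    fax = re.search(r"(?:Telefax|Fax)[ \t]*:?[ \t]*" + PHONE_CHARS, text, re.I)
    if fax:
        record["fax"] = normalize_phone(fax.group(1))
    record["email"] = deobfuscate_email(text)
    return record


def to_csv(records):
    """Serialise extracted records as CSV text, one record per row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv([extract_record(SAMPLE)]))
```

Real Impressum pages vary a lot (academic titles before the name, several contacts per page, numbers from other countries), so patterns like these would need tuning against the actual list of links.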
Project ID: 604513

About the project

9 proposals
Remote project
Active 14 years ago

About the client

Köln, Germany
5,0
1
Member since Jan 9, 2010
