Closed

carspider

Large-scale research needs to be done on automotive classifieds listings, spanning multiple auto classifieds websites (e.g. yahoo autos, google autos, cars.com; ~10 sites in total).

This project is the first step of that research: gathering listings from all of the auto classifieds websites and storing them in a database. Essentially, the sites need to be spidered, and the individual ads parsed and stored in a SQL database. It should also be treated as a small starter project that lets the developer get their feet wet and lets me determine whether the developer's skill set is sufficient for the larger projects to come.

The architecture below is how I envision this project being structured, although it can be changed if there are good reasons to do so:

There will be two main tables in the database:

new ads - New ads found on the listing websites that do not already exist in either the "new ads" or the "committed ads" table

committed ads - Former "new ads" that a human reviewer has judged to be valid

Based on that, I envision this project to be composed of the following modules, listed bottom-up:

DB entry backend - Interacts with the SQL database, determines whether an entry is a duplicate, and, if it is not, adds the entry to the "new ads" table. This should be a separate module, because the criteria that determine duplicates may change over time.
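
As a rough illustration of the DB entry backend, here is a minimal sketch in Python with SQLite. The posting asks for Perl and PHP and does not define any schema, so the language, the column names (`url`, `title`, `price`), the helper names, and the "same URL" duplicate rule are all assumptions made for the example. The duplicate rule is isolated in one function so it can be replaced later.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create the two tables if they are missing.

    The columns are hypothetical; the posting does not list ad fields.
    """
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS new_ads (url TEXT, title TEXT, price TEXT);
        CREATE TABLE IF NOT EXISTS committed_ads (url TEXT, title TEXT, price TEXT);
        """
    )
    return conn


def is_duplicate(conn, ad):
    """Assumed criterion: the URL already appears in either table.

    Only this function needs to change if the duplicate rule changes.
    """
    row = conn.execute(
        "SELECT 1 FROM new_ads WHERE url = ? "
        "UNION ALL SELECT 1 FROM committed_ads WHERE url = ? LIMIT 1",
        (ad["url"], ad["url"]),
    ).fetchone()
    return row is not None


def add_if_new(conn, ad):
    """Insert the ad into new_ads unless it is a duplicate.

    Returns True if a row was inserted, False otherwise.
    """
    if is_duplicate(conn, ad):
        return False
    conn.execute(
        "INSERT INTO new_ads (url, title, price) VALUES (?, ?, ?)",
        (ad["url"], ad.get("title"), ad.get("price")),
    )
    conn.commit()
    return True
```

An ad backend would call `add_if_new(conn, ad)` once per parsed ad and would not need to know how duplicates are detected.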

Ad backends - These can potentially be written using HTML::Parser. One must be written for each of the ~10 websites. Each should interpret an ad page and pass the relevant information from it to the DB entry backend.
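
To make the ad backend idea concrete, here is a hedged sketch using Python's standard `html.parser` module as a stand-in for Perl's HTML::Parser (both are event-driven parsers). The CSS class names `ad-title` and `ad-price` are invented markup; each real site would need its own mapping.

```python
from html.parser import HTMLParser


class AdParser(HTMLParser):
    """Event-driven ad page parser.

    FIELDS maps hypothetical CSS class names to ad field names. For each
    matching element, the first non-empty text node seen before the next
    end tag is stored as that field's value.
    """

    FIELDS = {"ad-title": "title", "ad-price": "price"}

    def __init__(self):
        super().__init__()
        self.ad = {}
        self._field = None  # field currently being captured, if any

    def handle_starttag(self, tag, attrs):
        field = self.FIELDS.get(dict(attrs).get("class"))
        if field:
            self._field = field

    def handle_data(self, data):
        text = data.strip()
        if self._field and text:
            self.ad[self._field] = text
            self._field = None

    def handle_endtag(self, tag):
        # Stop capturing once any element closes.
        self._field = None


def parse_ad(html, url):
    """Return a dict with the ad URL plus whatever fields were found."""
    parser = AdParser()
    parser.feed(html)
    parser.close()
    return {"url": url, **parser.ad}
```

The returned dict is the kind of record that could then be handed to the DB entry backend.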

Listing backends - These can potentially be written using HTML::Parser. One must be written for each of the ~10 websites. The role of each is to generate and traverse a listing of ads and call the corresponding ad backend for each ad. Each can be given search criteria to reduce the number of records returned by the site to something manageable. Eventually, each will be called automatically on a nightly basis with enough criteria to ensure an exhaustive search of all websites.
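
A listing backend along these lines might look like the following sketch, again in Python rather than Perl. It assumes that search criteria are passed as query-string parameters and that ad links can be recognized by a substring in their `href` (the `/ad/` marker is invented). Page fetching is injected as a `fetch` callable so the example stays free of network code. Pagination is not handled.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


def build_listing_url(base_url, criteria):
    """Encode search criteria as a query string (an assumed site convention)."""
    return base_url + "?" + urlencode(criteria)


class LinkCollector(HTMLParser):
    """Collects absolute URLs of <a href> links that contain a marker."""

    def __init__(self, base_url, marker):
        super().__init__()
        self.base_url = base_url
        self.marker = marker
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and self.marker in href:
            url = urljoin(self.base_url, href)
            if url not in self.links:  # skip repeated links on one page
                self.links.append(url)


def crawl_listing(listing_url, fetch, handle_ad, marker="/ad/"):
    """Fetch one listing page and call handle_ad(html, url) for each ad link.

    fetch is any callable mapping a URL to page HTML; handle_ad stands in
    for the site's ad backend. Returns the ad URLs in page order.
    """
    collector = LinkCollector(listing_url, marker)
    collector.feed(fetch(listing_url))
    collector.close()
    for ad_url in collector.links:
        handle_ad(fetch(ad_url), ad_url)
    return collector.links
```

A nightly job could call `crawl_listing` once per site and per set of criteria, passing that site's ad backend as `handle_ad`.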

Scraper frontend - Written in PHP. Goes through all "new ads", displaying each ad to the user and giving the user the option to either move the ad to the committed ads or mark the ad invalid.
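
The two review actions behind such a frontend can be sketched as database operations. The posting specifies PHP for this module; the sketch below uses Python with SQLite purely for illustration. The `invalid` flag column, the table columns, and the function names are all assumptions.

```python
import sqlite3


def open_review_db(path=":memory:"):
    """Create hypothetical tables; new_ads carries an assumed 'invalid' flag."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS new_ads (
            url TEXT, title TEXT, invalid INTEGER NOT NULL DEFAULT 0);
        CREATE TABLE IF NOT EXISTS committed_ads (url TEXT, title TEXT);
        """
    )
    return conn


def pending_ads(conn):
    """Ads still awaiting a human decision, as (url, title) tuples."""
    return conn.execute(
        "SELECT url, title FROM new_ads WHERE invalid = 0"
    ).fetchall()


def commit_ad(conn, url):
    """Move one ad from new_ads to committed_ads in a single transaction."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO committed_ads (url, title) "
            "SELECT url, title FROM new_ads WHERE url = ?",
            (url,),
        )
        conn.execute("DELETE FROM new_ads WHERE url = ?", (url,))
    return cur.rowcount == 1


def mark_invalid(conn, url):
    """Flag an ad as invalid; the row stays in new_ads."""
    with conn:
        cur = conn.execute(
            "UPDATE new_ads SET invalid = 1 WHERE url = ?", (url,)
        )
    return cur.rowcount == 1
```

Keeping invalid ads in the table as flagged rows, rather than deleting them, is a design choice made here so that a URL-based duplicate check would not re-add a rejected ad on the next crawl.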

Skills: Perl, PHP

About the employer:
(0 reviews) Laguna Beach, United States

Project ID: #293908

7 freelancers are bidding an average of $679 for this job

varun8211

Pl see PMB.

$750 USD in 30 days
(20 reviews)
8.0
SigmaVisual

Please check PMB.

$750 USD in 7 days
(249 reviews)
7.9
codersam

We are absolutely suitable for this project as we have already done the different type(Social Community/Networking/Dating, Mambo, Joomla, Oscommerce, Zencart, Real State, Ecommerce/Auction, Video Related, General and More

$750 USD in 15 days
(153 reviews)
7.4
phpoutsource

Please check your PM

$500 USD in 7 days
(18 reviews)
5.1
shrishtiin

Please see PM.

$750 USD in 20 days
(5 reviews)
4.4
moto117

We have similar project and we can offer you the professional services, the team with over 10+ years working experiences, established in Jul, 2004. please kindly check PMB for details, thank you.

$750 USD in 10 days
(0 reviews)
0.0
imgenexindia

Greetings! We just came across this bid request and are pretty interested in moving forward from here. Now, We have a few questions and suggestions that We'd like to put forward to cut short the initial phase of pr More

$500 USD in 4 days
(0 reviews)
0.0