Cancelled

Creation of a Blog Crawler

Project Brief: Blog Crawler Tool

To create a tool that can analyse multiple blog URLs (supplied in a .txt document or a .csv/.xls spreadsheet), extract all the outgoing blog links, record them, and return the following information.

All outgoing links pointing to other blog URLs. It is vitally important that these are blogs and not ordinary websites. The tool must not simply record every URL on the page: we want the outgoing links from the section labelled "Favourite Blogs" or "Related Blogs", not the hyperlinks embedded in the post text as deep links. The tool will need to recognise this section and select hyperlinks accordingly. These links usually appear on the first page and are replicated on the other pages of the blog. The tool should also be able to do the following:

Removal of any duplicate URLs

Inclusion of homepage URLs only, as opposed to individual blog posts

Google PageRank of each blog URL found (there are restrictions on the number of requests that can be made per day for Google PageRank scores, so the tool will have to be able to use different proxy addresses to circumvent this problem)

Technorati Authority of each blog URL (there are restrictions on the number of requests that can be made per day for Technorati Authority scores, so the tool will have to be able to use different proxy addresses to circumvent this problem)

Blog Title - This is contained in the metadata of almost every blog. As mentioned earlier, you already know how to source this data, which needs to be added to the tool. If the title isn't available in the metadata, the tool should accommodate this so that the title can still be extracted.

Blog Description - This is contained in the source code of almost every website. If it is not available in the source code, the tool should find this information in the “About” section, which is very common in blogs.

Blog Keywords - These are contained in the source code of almost every website, for SEO purposes. As per our earlier conversation, we realistically need the title and description of the blog, which can normally be found on the blog as text. We also require the keywords so that the blog can be categorised by subject. This could be achieved in one of the following ways: either the tool records all the tags from all of the posts and removes the duplicates, or it picks out the meta tags (or both).
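The requirements above (blogroll-only links, homepage URLs, de-duplication, and title/description/keywords from the metadata) could be sketched roughly as follows. This is only an illustration using the Python standard library, not a finished crawler. The names `BlogPageParser`, `to_homepage`, `unique_homepages` and `analyse_blog_html` are hypothetical, as is the list of section headings, and the sketch assumes the blogroll is introduced by an h1-h6 heading and that a blog's homepage is the root of its host. The PageRank and Technorati lookups, the "About" section fallback and post-tag collection are left out.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Assumed heading texts that introduce a blogroll section (hypothetical list).
BLOGROLL_HEADINGS = ("favourite blogs", "favorite blogs", "related blogs", "blogroll")
HEADING_TAGS = ("h1", "h2", "h3", "h4", "h5", "h6")


class BlogPageParser(HTMLParser):
    """Collects the title, meta tags, and links that follow a blogroll heading."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.blogroll_links = []
        self._in_title = False
        self._in_heading = False
        self._heading_text = ""
        self._in_blogroll = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in ("description", "keywords"):
                self.meta[name] = attrs.get("content") or ""
        elif tag in HEADING_TAGS:
            self._in_heading = True
            self._heading_text = ""
        elif tag == "a" and self._in_blogroll and attrs.get("href"):
            # Only links inside the blogroll section are kept; deep links
            # in post text are ignored.
            self.blogroll_links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in HEADING_TAGS:
            self._in_heading = False
            # Each heading either opens the blogroll section or closes it.
            self._in_blogroll = self._heading_text.strip().lower() in BLOGROLL_HEADINGS

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_heading:
            self._heading_text += data


def to_homepage(url):
    """Reduce a URL to scheme://host/ (assumes the blog lives at the host root)."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return None
    return f"{parts.scheme}://{parts.netloc.lower()}/"


def unique_homepages(links):
    """Return homepage URLs in first-seen order with duplicates removed."""
    seen = {}
    for link in links:
        home = to_homepage(link)
        if home:
            seen.setdefault(home, None)
    return list(seen)


def analyse_blog_html(html):
    """Parse one blog page and return its metadata and linked blog homepages."""
    parser = BlogPageParser()
    parser.feed(html)
    parser.close()
    keywords = [k.strip() for k in parser.meta.get("keywords", "").split(",") if k.strip()]
    return {
        "title": parser.title.strip(),
        "description": parser.meta.get("description", ""),
        "keywords": keywords,
        "blogs": unique_homepages(parser.blogroll_links),
    }
```

Reducing every link to its host root is the simplest way to satisfy "homepage URLs only", but it would mislabel blogs hosted under a path (for example a /blog/ directory), so a real tool would probably need a smarter rule.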

Tool Process

Input the proxies into an appropriately titled .txt file

Input multiple URLs into a .txt file

The blog tool goes through each input URL one by one, finds all of the blog URLs it links to, and for each URL found populates an Excel spreadsheet in the following way:
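As a rough illustration of the process steps, the sketch below reads the proxy and URL files, pairs each URL with a proxy in round-robin order, and writes one row per found URL to a CSV file that Excel can open. The function names and the idea of passing the column names in as a parameter are assumptions for illustration only; the brief's actual spreadsheet layout is not given here, and no network requests are made.

```python
import csv
import itertools
from pathlib import Path


def read_lines(path):
    """Return the non-empty, stripped lines of a text file."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def assign_proxies(urls, proxies):
    """Pair each URL with a proxy, cycling through the proxy list in order."""
    if not proxies:
        return [(url, None) for url in urls]
    return list(zip(urls, itertools.cycle(proxies)))


def write_report(rows, out_path, columns):
    """Write one row per found blog URL; `columns` is supplied by the caller."""
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=columns, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)
```

Cycling through the proxies spreads requests evenly across them, which is one simple way to stay under a per-address daily limit; it does not handle proxies that fail or get blocked.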

Skills:

See more: blog crawler script, blog crawler, you proxy google, website creation in google, to create a blog, process of website creation, out sourcing google, excel blog, could of, blogs blog, blog pages, blog information, blog crawler technorati, crawler code blogs, free blog crawler software, blog crawler code, google xls, creation of website, Create a blog website , blog blogs, website out sourcing, website crawler, seo data removal, Remover, pointing

About the employer:
(42 reviews) London, United Kingdom

Project ID: #297926

10 freelancers are bidding an average of $588 for this job

excelence

I can help you for a fair budget, thanks.

$750 USD in 0 days
(163 reviews)
6.4
omsoftware

Hello, we understand your project and we are able to do it. We have experience in crawling. Let's discuss more. Waiting for your reply. Raj

$750 USD in 15 days
(213 reviews)
5.4
pgcoding

Crawling experts are here. Please check pmb.

$700 USD in 5 days
(56 reviews)
5.1
CaaamSoftware

Please check PM.

$500 USD in 10 days
(2 reviews)
1.3
ddsuresh

I can provide the best solution for you within the given time frame. Thanks, Suresh

$300 USD in 7 days
(0 reviews)
0.0
mlani101

i am a 5 year experienced in web development project, trust me, dis proj works well, please post PM to let me [url removed, login to view]

$400 USD in 15 days
(0 reviews)
0.0
shuhongzheng

I'm good at this and can show you some script examples that closely match your current requirements.

$480 USD in 7 days
(2 reviews)
0.0
bmunjal

PreetSoft Infotech is a professional website development, hosting and design company. We can put your business on the World Wide Web, establishing a 24-hour-a-day, 365-days-a-year storefront or advertisement. Website ...

$500 USD in 10 days
(0 reviews)
0.0
dsenthilkumar

I have already built many crawlers, from a basic Site Monitor to Product Crawlers. I can complete your Blog Crawler tool to the best standard. Thanks, D SENTHILKUMAR.

$750 USD in 7 days
(0 reviews)
0.0
headvances

Hi, we have a crawler that currently tracks a few hundred thousand blogs. Please check PM.

$750 USD in 30 days
(0 reviews)
0.0