I need someone to help me build a script that lets me access the data in the Common Crawl corpus.
I have an Amazon S3 account, which you'll need to log in to and set up correctly so the script can run and access the corpus.
I have a PHP/MySQL web server running Apache, cPanel and WHM.
Here's the info to get started with Common Crawl:
[url removed, login to view]
I need to find:
- Mentions of a keyword on a website (the output will be a list of the web pages that mention it)
- Which words appear around a particular keyword on a website
- Sentiment around a keyword or phrase
- Which web pages link to a given URL
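To make the list above a bit more concrete, here is a rough sketch in Python of three of the checks, run on the text or HTML of a single page. Everything in it is an assumption for illustration: the function names, the five-word window, the single-word keyword match and the tiny positive/negative word sets are all hypothetical, and a real sentiment step would need a proper lexicon or model.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


def keyword_contexts(text, keyword, window=5):
    """Return, for each mention of a single-word keyword, the words
    found up to `window` words before and after it."""
    words = re.findall(r"\w+", text.lower())
    key = keyword.lower()
    contexts = []
    for i, word in enumerate(words):
        if word == key:
            before = words[max(0, i - window):i]
            after = words[i + 1:i + 1 + window]
            contexts.append(before + after)
    return contexts


def sentiment_score(context_words, positive, negative):
    """Naive score: positive word count minus negative word count.
    The word sets are supplied by the caller (hypothetical lexicon)."""
    pos = sum(1 for w in context_words if w in positive)
    neg = sum(1 for w in context_words if w in negative)
    return pos - neg


class LinkCollector(HTMLParser):
    """Collects the absolute URLs of all <a href="..."> links on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, href))


def links_to(html, page_url, target_url):
    """Return True if the page at `page_url` links to `target_url`."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    return target_url in collector.links
```

A page counts as a "mention" when `keyword_contexts` returns a non-empty list, so the list of web pages is just the URLs of the pages where that is the case.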
I'll need to be able to filter the results by day, week, month and year so I can build trends from the data.
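As a rough idea of how the day/week/month/year split could work, here is a small Python sketch that groups dated mentions into periods and counts them. The function names and the label formats are hypothetical. It also assumes each mention already has a date attached, which would most likely be the date the page was crawled rather than the date it was written.

```python
from collections import Counter
from datetime import date


def period_key(d, granularity):
    """Map a date to a label for the requested period."""
    if granularity == "day":
        return d.isoformat()
    if granularity == "week":
        # ISO weeks start on Monday; the ISO year can differ from d.year
        # in the first and last days of a calendar year.
        iso_year, iso_week, _ = d.isocalendar()
        return f"{iso_year}-W{iso_week:02d}"
    if granularity == "month":
        return f"{d.year}-{d.month:02d}"
    if granularity == "year":
        return str(d.year)
    raise ValueError(f"unknown granularity: {granularity}")


def trend(mention_dates, granularity):
    """Count mentions per period, e.g. to plot a trend line."""
    return Counter(period_key(d, granularity) for d in mention_dates)
```

For example, `trend(dates, "month")` gives one count per calendar month, which can be sorted by key and charted directly.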
If you look on GitHub, most of these scripts have already been written, and you can build upon them.
Also, any data we use needs to be cached on my server for quicker retrieval.
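The caching could look something like the sketch below: look up a query in a local table first, and only go back to the corpus when there is no stored result. This is only an illustration under assumptions. It uses Python's built-in SQLite as a stand-in for the MySQL database on my server, and the table name, the key format and the `compute` callback are hypothetical.

```python
import json
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) the cache table. SQLite stands in for MySQL here."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cache ("
        "key TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    return conn


def cached(conn, key, compute):
    """Return the stored result for `key`, or run `compute()` once,
    store its JSON-serialisable result, and return it."""
    row = conn.execute(
        "SELECT value FROM cache WHERE key = ?", (key,)
    ).fetchone()
    if row is not None:
        return json.loads(row[0])
    value = compute()
    conn.execute(
        "INSERT INTO cache (key, value) VALUES (?, ?)",
        (key, json.dumps(value)),
    )
    conn.commit()
    return value
```

The key would need to include everything that changes the result, for example the keyword plus the period filter, so that different queries never share a cached answer.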
Ready to start this tomorrow.
I'm [url removed, login to view] on skype.
11 freelancers are bidding an average of $644 for this job
Hi, I work towards providing reliable, relevant and robust IT solutions at the most competitive prices to my customers. I ensure 100% customer satisfaction, so let's start. Thanks