Spark data analysis (Dataset: Capital Bikeshare)


•All the bike sharing activities

•Year 2010-2016

•Over 350 stations

•Over 13,000 trips a day this past summer

Data Description:

•Duration – Duration of trip

•Start date – Includes start date and time

•End date – Includes end date and time

•Start station – Includes starting station name and number

•End station – Includes ending station name and number

•Bike # – Includes ID number of bike used for the trip

•Member Type – Lists whether user was a Registered (annual or monthly) or Casual (1 to 5 day) member. NOTE: The 3-day membership replaced the 5-day during Fall '11.

•AWS
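The fields above can be read into typed records before any cluster work starts. The following is a minimal sketch in plain Python, not a Spark job. The CSV header names, the timestamp layout in `DATE_FORMAT`, and the sample row (including the station names and bike ID) are assumptions made for illustration; the posting does not specify them. Duration is kept as raw text because the posting does not state its unit.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime
from typing import List

# Assumed timestamp layout; adjust to whatever the real export uses.
DATE_FORMAT = "%m/%d/%Y %H:%M"


@dataclass(frozen=True)
class Trip:
    duration: str          # raw text, unit not stated in the posting
    start: datetime
    end: datetime
    start_station: str     # station name and number, as one field
    end_station: str
    bike_id: str
    member_type: str       # "Registered" or "Casual"


def parse_trips(csv_text: str) -> List[Trip]:
    """Parse CSV text with the assumed header names into Trip records."""
    reader = csv.DictReader(io.StringIO(csv_text))
    trips = []
    for row in reader:
        trips.append(Trip(
            duration=row["Duration"].strip(),
            start=datetime.strptime(row["Start date"].strip(), DATE_FORMAT),
            end=datetime.strptime(row["End date"].strip(), DATE_FORMAT),
            start_station=row["Start station"].strip(),
            end_station=row["End station"].strip(),
            bike_id=row["Bike #"].strip(),
            member_type=row["Member Type"].strip(),
        ))
    return trips


# Made-up sample row, for illustration only.
SAMPLE = (
    "Duration,Start date,End date,Start station,End station,Bike #,Member Type\n"
    "0h 14m 26s,09/20/2010 11:27,09/20/2010 11:41,"
    "Station A (31000),Station B (31001),W00742,Registered\n"
)

if __name__ == "__main__":
    for trip in parse_trips(SAMPLE):
        print(trip)
```

The same field list could serve as the starting point for a table definition on the cluster.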

Project Requirement:

•Download, curate and organize the data so that you can query it when it is loaded on the Hadoop cluster

•Load it on Big Data infrastructure:

•Understand how to manage it on a cluster (AWS/VM)

•Create processes for accessing and querying the data

•Provide a set of query tools/scripts using Pig/Hive/Impala to query the data on the cluster

•A large part of the process here will be to set standard query scripts on Pig and Hive/Impala to allow the user to examine the dataset
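As an illustration of what a "standard query script" might look like, the sketch below counts trips per start station. It uses Python's built-in `sqlite3` with an in-memory table as a local stand-in for the cluster; it is not Hive or Impala, although a plain `GROUP BY` query of this kind should port with little change. The table name `trips`, its column names, and the sample rows are assumptions made for illustration.

```python
import sqlite3

# Plain SQL; table and column names are hypothetical.
TRIPS_PER_STATION_SQL = """
SELECT start_station, COUNT(*) AS trips
FROM trips
GROUP BY start_station
ORDER BY trips DESC, start_station
"""


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory table holding a few made-up trips."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE trips (start_station TEXT, end_station TEXT, member_type TEXT)"
    )
    conn.executemany(
        "INSERT INTO trips VALUES (?, ?, ?)",
        [
            ("Station A", "Station B", "Registered"),
            ("Station A", "Station C", "Casual"),
            ("Station B", "Station A", "Registered"),
        ],
    )
    return conn


def trips_per_station(conn: sqlite3.Connection):
    """Return (station, trip count) rows, busiest station first."""
    return conn.execute(TRIPS_PER_STATION_SQL).fetchall()


if __name__ == "__main__":
    for station, trips in trips_per_station(demo_connection()):
        print(station, trips)
```

Keeping the SQL in a named constant makes it easy to collect such queries into a small library that users can run against the dataset.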

Final Purpose:

Build a model that predicts the demand at a given station at a given time.
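One way to make this goal concrete is to define demand as the number of trips starting at a station within a given hour, and to use a historical average as a first baseline. The sketch below is plain Python under those assumptions; the demand definition, the grouping by weekday and hour, and the function names are illustrative choices, not requirements from the posting. A real model would likely add features and run on Spark.

```python
from collections import defaultdict
from datetime import datetime


def hourly_demand(starts):
    """Count trip starts per (station, calendar date, hour)."""
    counts = defaultdict(int)
    for station, ts in starts:
        counts[(station, ts.date(), ts.hour)] += 1
    return counts


def baseline_model(starts):
    """Mean trip starts per (station, weekday, hour).

    The mean is taken over days that had at least one trip in that slot,
    so days with zero trips are ignored and the estimate is biased upward.
    """
    totals = defaultdict(int)
    days = defaultdict(set)
    for (station, day, hour), n in hourly_demand(starts).items():
        key = (station, day.weekday(), hour)
        totals[key] += n
        days[key].add(day)
    return {key: totals[key] / len(days[key]) for key in totals}


def predict(model, station, when):
    """Look up the baseline for a station and time; 0.0 if never observed."""
    return model.get((station, when.weekday(), when.hour), 0.0)


if __name__ == "__main__":
    # Made-up trip starts: (station, start timestamp).
    starts = [
        ("Station A", datetime(2016, 6, 6, 8, 5)),
        ("Station A", datetime(2016, 6, 6, 8, 40)),
        ("Station A", datetime(2016, 6, 13, 8, 15)),
    ]
    model = baseline_model(starts)
    print(predict(model, "Station A", datetime(2016, 6, 20, 8, 0)))
```

A baseline like this gives any later model a reference point to beat.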

Skills: Amazon Web Services, Hadoop, Python, Scala, Spark


About the employer:
(2 reviews) Baltimore, United States

Project ID: #15797529

6 freelancers are bidding an average of $211 on this job


Dear Customer, my name is Yuriy Tumakha. I am interested in your project, Spark data analysis. I am a Senior Scala/Java Developer with strong problem-solving skills. Relevant Skills and Experience: Amazon Web Services, ...

$450 USD in 7 days
(6 reviews)

Hi, I have more than 3 years of experience in Hadoop technologies. Contact me for more details. Relevant Skills and Experience: Please review my profile. Proposed Milestones: $222 USD - Project fee

$222 USD in 3 days
(5 reviews)

I will provide exact and accurate work. Relevant Skills and Experience: E-commerce. Proposed Milestones: $35 USD - Charge

$35 USD in 3 days
(0 reviews)

A PhD scholar, what better can you get? You can give me this task, as I have the requisite qualifications and experience. You will be happy to see my work. Looking forward to working with you. Relevant Skills and Experience: A ...

$200 USD in 10 days
(0 reviews)

I guarantee you a first-class service, with the quality and in the time it deserves, by putting all my knowledge and years of practice at your disposal. Relevant Skills and Experience: have more than 6 years of expe ...

$188 USD in 3 days
(0 reviews)

Deliver a prediction model that predicts the demand for specific stations at a specific time, as well as a set of scripts using Pig/Hive/Impala that allow users to query data. Relevant Skills and Experience: Certifi ...

$170 USD in 7 days
(0 reviews)