Closed

End-to-End Big Data Project

Problem Statement:

Imagine you are part of a data team that wants to bring in daily data on COVID-19 tests conducted in New York State for analysis. Your team has to design a daily workflow that runs at 9:00 AM and ingests the data into the system.

API:

[login to view URL]

Following the ETL process, extract the data for each county in New York State from the above API and load it into an individual table per county in the database. Each county table should contain the following columns:

❖ Test Date

❖ New Positives

❖ Cumulative Number of Positives

❖ Total Number of Tests Performed

❖ Cumulative Number of Tests Performed

❖ Load date
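One way to picture the per-county tables is the minimal SQLite sketch below. The snake_case column names, the `county_` table prefix, the record keys (`test_date`, `new_positives`, and so on), and the helper names are all assumptions made for illustration, since the brief does not show the API response. Making `test_date` the primary key is likewise a design choice, so that re-running a day's load replaces rows instead of duplicating them.

```python
import re
import sqlite3
from datetime import date

# Assumed column names, mirroring the six columns listed in the brief.
COLUMNS_DDL = """
    test_date TEXT PRIMARY KEY,
    new_positives INTEGER,
    cumulative_positives INTEGER,
    total_tests INTEGER,
    cumulative_tests INTEGER,
    load_date TEXT
"""


def county_table_name(county):
    """Turn a county name into a safe table name."""
    slug = re.sub(r"[^a-z0-9]+", "_", county.lower()).strip("_")
    return f"county_{slug}"


def load_county(conn, county, rows, load_date=None):
    """Create the county table if needed and upsert its rows."""
    load_date = load_date or date.today().isoformat()
    table = county_table_name(county)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({COLUMNS_DDL})')
    conn.executemany(
        f'INSERT OR REPLACE INTO "{table}" VALUES (?, ?, ?, ?, ?, ?)',
        [
            (
                r["test_date"],
                r["new_positives"],
                r["cumulative_number_of_positives"],
                r["total_number_of_tests"],
                r["cumulative_number_of_tests"],
                load_date,
            )
            for r in rows
        ],
    )
    conn.commit()
    return table
```

Calling `load_county(conn, "St. Lawrence", rows)` would write to a table named `county_st_lawrence`.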

Implementation options:

1. Python scripts run by a daily cron job

a. Use a SQLite in-memory database for data storage

b. Provide one main standalone script for the daily cron job that orchestrates all the other ETL processes

c. Use a multi-threaded approach to fetch and load data for multiple counties concurrently
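Option 1 could be sketched roughly as follows, using a thread pool for the network-bound fetches and a lock around the shared in-memory connection. The function `run_etl`, the injected `fetch_county` callable, the row tuple layout, and the table naming are hypothetical; the real fetch would call the API from the brief.

```python
import re
import sqlite3
import threading
from concurrent.futures import ThreadPoolExecutor
from datetime import date

DDL = (
    "test_date TEXT PRIMARY KEY, new_positives INTEGER, "
    "cumulative_positives INTEGER, total_tests INTEGER, "
    "cumulative_tests INTEGER, load_date TEXT"
)


def run_etl(fetch_county, counties, conn, max_workers=8):
    """Fetch every county concurrently and load each into its own table.

    fetch_county(county) is assumed to return a list of 5-tuples:
    (test_date, new_positives, cumulative_positives, total_tests,
    cumulative_tests).
    """
    lock = threading.Lock()
    load_date = date.today().isoformat()

    def process(county):
        rows = fetch_county(county)  # fetches overlap across threads
        slug = re.sub(r"[^a-z0-9]+", "_", county.lower()).strip("_")
        table = f"county_{slug}"
        with lock:  # one writer at a time on the shared connection
            conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({DDL})')
            conn.executemany(
                f'INSERT OR REPLACE INTO "{table}" VALUES (?, ?, ?, ?, ?, ?)',
                [row + (load_date,) for row in rows],
            )
            conn.commit()
        return county, len(rows)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(process, counties))
```

The connection has to be opened with `sqlite3.connect(":memory:", check_same_thread=False)`, because each separate connection to `:memory:` gets its own private database and the worker threads therefore need to share one.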

2. Airflow with a daily scheduled DAG

a. Use Docker to run Airflow and a Postgres database locally

b. Define a single DAG containing all tasks needed for the end-to-end ETL process

c. Create and execute tasks dynamically and concurrently in Airflow, one per county, based on the number of counties in the response
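For option 2, a sketch along these lines may help. It uses Airflow's TaskFlow API with dynamic task mapping (`.expand()`), so the number of load tasks follows the number of counties in the response. It assumes Airflow 2.4 or later; the DAG id, the `county` record key, and the helper names are illustrative. `fetch_records` is left as a placeholder, and the Postgres write is only indicated by a comment.

```python
import re
from collections import defaultdict


def county_table_name(county):
    """Turn a county name into a safe table name."""
    slug = re.sub(r"[^a-z0-9]+", "_", county.lower()).strip("_")
    return f"county_{slug}"


def group_by_county(records):
    """Split the flat API response into one batch per county."""
    batches = defaultdict(list)
    for record in records:
        batches[record["county"]].append(record)  # "county" key is assumed
    # A list of dicts is JSON-serializable, so it can travel through XCom.
    return [{"county": c, "rows": rows} for c, rows in sorted(batches.items())]


def fetch_records():
    """Placeholder: call the API from the brief and return its JSON rows."""
    raise NotImplementedError("fill in the API call")


def build_dag():
    # Imported lazily so the helpers above can be tested without Airflow.
    import pendulum
    from airflow.decorators import dag, task

    @dag(
        dag_id="ny_covid_daily_etl",
        schedule="0 9 * * *",  # every day at 9:00 AM
        start_date=pendulum.datetime(2021, 1, 1, tz="America/New_York"),
        catchup=False,
    )
    def ny_covid_daily_etl():
        @task
        def extract():
            return group_by_county(fetch_records())

        @task
        def load(batch):
            table = county_table_name(batch["county"])
            # Write batch["rows"] into `table` in Postgres here.
            return table

        # One mapped `load` task instance per county in the response.
        load.expand(batch=extract())

    return ny_covid_daily_etl()
```

In an actual DAG file, a module-level line such as `etl_dag = build_dag()` would expose the DAG to the scheduler. For large responses, passing only county names through XCom and fetching inside each mapped task may be the lighter choice.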

Implement unit and/or integration tests for your application
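As an illustration of the kind of unit test this asks for, the sketch below tests a small, hypothetical transform function `to_row` with the standard `unittest` module. The record keys and the timestamp format are assumptions about the API response, not something the brief specifies.

```python
import unittest


def to_row(record, load_date):
    """Map one API record (assumed keys) to a table row tuple."""
    return (
        record["test_date"][:10],  # keep only the YYYY-MM-DD part
        int(record["new_positives"]),
        int(record["cumulative_number_of_positives"]),
        int(record["total_number_of_tests"]),
        int(record["cumulative_number_of_tests"]),
        load_date,
    )


class ToRowTest(unittest.TestCase):
    def test_casts_counts_and_trims_timestamp(self):
        record = {
            "test_date": "2021-01-01T00:00:00.000",
            "new_positives": "5",
            "cumulative_number_of_positives": "50",
            "total_number_of_tests": "100",
            "cumulative_number_of_tests": "1000",
        }
        self.assertEqual(
            to_row(record, "2021-01-02"),
            ("2021-01-01", 5, 50, 100, 1000, "2021-01-02"),
        )

    def test_missing_field_raises(self):
        with self.assertRaises(KeyError):
            to_row({}, "2021-01-02")
```

Saved as a module, these tests can be run with `python -m unittest`.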

Skills: Python, PySpark, Docker, Hadoop, PostgreSQL


About the employer:
(0 reviews) Los Angeles, United States

Project ID: #29026380

5 freelancers are bidding an average of $231 per hour for this job

nmogilip

Hi, I am a certified big data developer and have used PySpark in many of my applications. I feel you should use PySpark for multithreaded applications, as Spark distributes the load across different nodes and executors. If you hav...

$244 USD in 7 days
(6 reviews)
4.2
FaizMgeek

Hello there! I can help you. I have done similar data analysis for COVID before, so I can do this. I have been working with Python for the last 4 years and I am a data science guy. I am a professional web developer and have huge Experie...

$250 USD in 4 days
(5 reviews)
3.2
ashishpatel0720

Hi, I am Ashish. I am working as Software Engineer III - Data for Walmart; previously I was with Deutsche Bank. I have a total of 3 years of experience in big data, Java Spring, and competitive programming. I am just trying out...

$200 USD in 7 days
(0 reviews)
0.0
sanjayrathore556

Hello, I am a full-stack Ruby on Rails developer with 3 years of experience. I am also working on analyzing government COVID data with big data tools, and I have a team with extensive experience in Python, Spark, Hadoop, Hive, and Kafka.

$222 USD in 2 days
(0 reviews)
0.0
geoman6743

I have worked extensively with Python, ETL, databases, and Airflow in both linear and distributed environments.

$240 USD in 5 days
(0 reviews)
0.0