Clean and Standardize Plaintext Data
$100-500 USD
Paid on delivery
The US Department of Energy publishes the fuel efficiency (i.e. MPG) of all vehicles sold in the US each year. These statistics are available for 1978-2009 at [url removed, login to view]. We need to import these statistics into an application we've built. However, the files are not all formatted consistently. We are looking for someone (or a team) to go through the data for each year from 1978 to 2009 and convert it to a single consistent format.
## Deliverables
The deliverables are a set of CSV files, one for each year from 1978 to 2009, inclusive (i.e. we expect 32 CSV files). Each year's data should be saved as a CSV file with the filename [year].csv. For example, the 2004 data should be saved in 2004.csv.

Each CSV file should contain the following columns:

- YEAR
- MAKE
- MODEL
- TRANS_TYPE
- DISPLACEMENT
- CYLINDER_COUNT
- FUEL_TYPE
- CITY_MPG
- HIGHWAY_MPG
- COMBINED_MPG

Valid values for TRANS_TYPE are:

- 'A' (automatic)
- 'M' (manual)

Valid values for FUEL_TYPE are:

- 'R' (regular petrol)
- 'P' (premium petrol)
- 'D' (diesel)
- 'E' (E85 Ethanol)
- 'C' (compressed natural gas)

The TRANS_TYPE, DISPLACEMENT, and CYLINDER_COUNT fields are optional, since some of the older data files won't have them. When a data file doesn't specify a fuel type, assume it is 'R'. I have converted the 2009 data to this format as an example of what I'm looking for. That file is attached.
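To make the target format concrete, here is a minimal Python sketch of how one cleaned record could be checked against the rules in this section before being written out. It is only an illustration, not a required implementation. The function names (`normalize_row`, `output_filename`, `write_rows`) are hypothetical, the sample values are made up, and two points are assumptions that the attached 2009 file should settle: that each CSV starts with a header row, and that missing optional fields are written as empty cells.

```python
import csv
import io

# Column order as listed in this posting.
COLUMNS = ["YEAR", "MAKE", "MODEL", "TRANS_TYPE", "DISPLACEMENT",
           "CYLINDER_COUNT", "FUEL_TYPE", "CITY_MPG", "HIGHWAY_MPG",
           "COMBINED_MPG"]
OPTIONAL = {"TRANS_TYPE", "DISPLACEMENT", "CYLINDER_COUNT"}
VALID_TRANS = {"A", "M"}
VALID_FUEL = {"R", "P", "D", "E", "C"}


def normalize_row(raw):
    """Return a dict holding exactly the ten columns, or raise ValueError."""
    row = {}
    for col in COLUMNS:
        value = raw.get(col)
        row[col] = "" if value is None else str(value).strip()

    # An unspecified fuel type is assumed to be regular petrol ('R').
    row["FUEL_TYPE"] = row["FUEL_TYPE"].upper() or "R"
    row["TRANS_TYPE"] = row["TRANS_TYPE"].upper()

    if row["FUEL_TYPE"] not in VALID_FUEL:
        raise ValueError(f"invalid FUEL_TYPE: {row['FUEL_TYPE']!r}")
    if row["TRANS_TYPE"] and row["TRANS_TYPE"] not in VALID_TRANS:
        raise ValueError(f"invalid TRANS_TYPE: {row['TRANS_TYPE']!r}")

    # Assumption: every column not listed as optional must be present.
    missing = [c for c in COLUMNS if c not in OPTIONAL and not row[c]]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return row


def output_filename(year):
    """Build the [year].csv filename for a year in 1978-2009."""
    if not 1978 <= year <= 2009:
        raise ValueError(f"year out of range: {year}")
    return f"{year}.csv"


def write_rows(rows, out):
    """Write a header line (assumed) followed by the normalized rows."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for raw in rows:
        writer.writerow(normalize_row(raw))


# Made-up sample record with no transmission, engine, or fuel data.
sample = {"YEAR": 2004, "MAKE": "ExampleMake", "MODEL": "ExampleModel",
          "CITY_MPG": 20, "HIGHWAY_MPG": 28, "COMBINED_MPG": 23}
buffer = io.StringIO()
write_rows([sample], buffer)
print(output_filename(2004))
print(buffer.getvalue())
```

Running a check like this on every row makes values outside the allowed TRANS_TYPE and FUEL_TYPE codes fail loudly instead of ending up in the delivered files.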
## Platform
n/a
Project ID: #3329225