
I have a collection of English-language digital documents that first need thorough data cleaning before any further processing. Your task begins with removing duplicates, fixing inconsistent formatting, and standardising fields so the dataset is error-free and ready for analysis. All material is already in digital form, so you can work directly with the files I provide; no scanning or manual typing from paper is needed.

Once the data is pristine, selected sections will also require translation. We can confirm the target language(s) together, but your ability to deliver an accurate, context-aware translation after cleaning will be a strong advantage.

Deliverables:
• A cleaned set of English documents in their original file format, plus a summary of the issues found and the fixes applied.
• Translated version(s) of the cleaned text, clearly aligned with the source for easy cross-checking.

If you have solid experience in data cleaning tools (Excel, Google Sheets, Python scripts, OpenRefine) and a proven translation background, I’d like to hear how you would tackle both phases and your approximate turnaround time.
Project ID: 40406406
Remote project