Migration of Python app to Amazon Elastic MapReduce

Enable a Python application to run in the Amazon Elastic MapReduce environment by modifying well-documented, well-structured source code.

The original application was developed to retrieve Wikimapia information and was designed for proto-parallel processing: it can subdivide one task so that the parts run in parallel on multiple computers, and then collate the results. However, it was not developed to take advantage of Hadoop.
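The subdivide-and-collate design described above can be illustrated with a minimal sketch. It is not the application's actual code: the function names `subdivide` and `collate` and the idea of splitting a flat list of work items are assumptions made purely for illustration.

```python
from typing import Iterable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def subdivide(items: Sequence[T], parts: int) -> List[Sequence[T]]:
    """Split items into at most `parts` contiguous chunks of near-equal size."""
    if parts < 1:
        raise ValueError("parts must be >= 1")
    size, extra = divmod(len(items), parts)
    chunks: List[Sequence[T]] = []
    start = 0
    for i in range(parts):
        # The first `extra` chunks each take one additional item.
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when there are few items
            chunks.append(items[start:end])
        start = end
    return chunks


def collate(partials: Iterable[Iterable[R]]) -> List[R]:
    """Merge the per-chunk results back into one list, preserving chunk order."""
    return [result for partial in partials for result in partial]
```

Each chunk would be handed to a separate machine, and `collate` would run once all partial results are back. Under Hadoop, the framework itself takes over the splitting and the gathering.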

## Deliverables

# Preliminary Analysis:

Amazon provides an extended example of how to distribute Python processes: see "Finding Similar Items with Amazon Elastic MapReduce, Python, and Hadoop Streaming" for an idea of the desired result.

The application to adapt has fewer than 900 lines. See the attached file for the original project requirements (which were achieved superbly) and the signatures of the two components that make up the working application.

# Required Knowledge:

Familiarity with Hadoop and Amazon AWS is essential. Although Amazon Elastic MapReduce has only just launched, anyone with Hadoop experience will pick it up fast. Some experience with Hadoop Streaming is a plus.

Ease with Python is also required: some modifications to the original code will be needed, but the work will mostly consist of reorganizing the code for Hadoop processing.
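As a rough idea of what reorganizing code for Hadoop processing can look like, here is a minimal Hadoop Streaming style sketch. Hadoop Streaming passes records to a mapper and a reducer as lines on standard input, expects tab-separated key/value lines on standard output, and delivers the reducer's input sorted by key. The `key,value` record layout and the function names below are hypothetical; the real application's records are not described in this posting.

```python
import sys
from itertools import groupby
from typing import Iterable, Iterator


def mapper(lines: Iterable[str]) -> Iterator[str]:
    """Emit one tab-separated key/value line per non-empty input record."""
    for line in lines:
        record = line.strip()
        if not record:
            continue
        # Hypothetical layout: "key,value" (for example a region id and a payload).
        key, _, value = record.partition(",")
        yield f"{key}\t{value}"


def reducer(lines: Iterable[str]) -> Iterator[str]:
    """Collate the values of each key; input must already be sorted by key."""
    pairs = (line.rstrip("\n").partition("\t") for line in lines)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        values = [kv[2] for kv in group]
        yield f"{key}\t{','.join(values)}"


if __name__ == "__main__":
    # Run as `script.py map` or `script.py reduce`; do nothing otherwise.
    steps = {"map": mapper, "reduce": reducer}
    step = steps.get(sys.argv[1]) if len(sys.argv) > 1 else None
    if step is not None:
        for out in step(sys.stdin):
            print(out)
```

Keeping `mapper` and `reducer` as plain generator functions means they can be tested locally on a list of strings before being submitted as a streaming job.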

# Deliverables:

1. Modified Python scripts to run in Amazon Elastic MapReduce;

2. Documentation and working examples on how to use the scripts in Amazon Elastic MapReduce.

Skills: Amazon Web Services, Engineering, Linux, MySQL, PHP, Software Architecture, Software Testing, Web Hosting, Website Management, Website Testing


About the employer:
(2 reviews) Dubai, United Arab Emirates

Project ID: #3803076

2 freelancers are bidding an average of $446 for this job


See private message.

$850 USD in 14 days
(0 reviews)

See private message.

$42.5 USD in 14 days
(0 reviews)