To design a MapReduce based algorithm (you can pick any machine learning
algorithm you like) which can handle huge datasets using a large number of computers. In
addition, you should provide proper data analysis for your algorithm. Questions you need to
answer in your report such as:
1. What is your data processing pipeline? (graphs and words description)
2. What kind of analytics do you apply on the dataset?