EME: An automated, elastic and efficient prototype for provisioning Hadoop clusters on-demand

To enhance the Quality of Service (QoS) of MapReduce-based applications, many frameworks suggest a scale-out approach that statically adds new nodes to the cluster. Such frameworks are still expensive to acquire and do not consider the optimal usage of available resources in a dynamic manner. This paper introduces a prototype to address this issue by extending the MapReduce resource manager with dynamic provisioning and a low-cost, on-demand uplift of resource capacity. We propose an Enhanced MapReduce Environment (EME) that supports heterogeneous environments by extending Apache Hadoop to an opportunistically containerized environment, which enhances system throughput by adding underused resources to a local or cloud-based cluster. The main architectural elements of this framework are presented, as well as the requirements, challenges, and opportunities of a first prototype.

Keywords: MapReduce, Hadoop, Big Data, Cloud computing, Prototyping, Elastic Computing, Quality of Service.