MOMC: Multi-objective and Multi-constrained scheduling algorithm of many tasks in Hadoop


To access the full text documents, please follow this link: http://hdl.handle.net/2117/104957

Title:	MOMC: Multi-objective and Multi-constrained scheduling algorithm of many tasks in Hadoop
Author:	Voicu, Cristiana; Pop, Florin; Dobre, Ciprian M.; Xhafa Xhafa, Fatos
Other authors:	Universitat Politècnica de Catalunya. Departament de Ciències de la Computació
Abstract:	(c) 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.
Abstract:	Although scheduling in distributed systems has been studied for many years, platforms and job types keep changing. New algorithms are therefore needed that reflect the requirements of new applications, especially when an application is deployed in a Cloud environment. Hadoop and its extensions are among the most important frameworks for large-scale data processing in Clouds. Hadoop ships with default schedulers such as FIFO, Fair Scheduler, and Capacity Scheduler, as well as Hadoop on Demand. Each of these scheduling algorithms focuses on a single, different constraint, and it is hard to satisfy multiple constraints and pursue several objectives at the same time. After summarizing the most common schedulers and the need each one addressed when it was introduced, this paper presents MOMC, a multi-objective and multi-constrained scheduling algorithm of many tasks in Hadoop. The MOMC implementation focuses on two objectives, avoiding resource contention and achieving an optimal cluster workload, and on two constraints, deadline and budget. To compare the algorithms on different metrics, we use the Scheduling Load Simulator, which is integrated into the Hadoop framework and helps developers spend less time on testing. As a killer application that generates many tasks, we have chosen a processing task for the Million Song Dataset, a data set containing metadata for one million commercially available songs.
Abstract:	Peer Reviewed
Subject(s):	UPC subject areas::Computer science; Cloud computing; Big data; Apache Hadoop; Hadoop; MapReduce; Task scheduling; Apache Hadoop (Computer programs)
Rights:
Document type:	Article - Submitted version Conference Object
Published by:	Institute of Electrical and Electronics Engineers (IEEE)
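
The abstract names two constraints (deadline and budget) and two objectives (avoiding resource contention and balancing the cluster workload) but does not give the algorithm itself. The following is only a minimal illustrative sketch of how such a placement decision could be structured; it is not the MOMC algorithm from the paper. All names (`Node`, `Task`, `pick_node`, `schedule`), the cost model, and the earliest-deadline-first ordering are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    name: str
    speed: float             # work units per second (hypothetical model)
    cost_per_sec: float      # monetary cost per second of use (hypothetical)
    busy_until: float = 0.0  # time at which the node's queue drains


@dataclass
class Task:
    name: str
    work: float      # work units required
    deadline: float  # latest acceptable finish time, in seconds
    budget: float    # maximum cost the task may incur


def pick_node(task: Task, nodes: List[Node]) -> Optional[Node]:
    """Return a node meeting both constraints, preferring the least loaded."""
    feasible = []
    for node in nodes:
        runtime = task.work / node.speed
        finish = node.busy_until + runtime
        cost = runtime * node.cost_per_sec
        # Constraints: the task must finish in time and stay within budget.
        if finish <= task.deadline and cost <= task.budget:
            feasible.append((node.busy_until, cost, node))
    if not feasible:
        return None
    # Objectives (assumed proxy): the node whose queue drains earliest
    # spreads load and limits contention; ties are broken on cost.
    return min(feasible, key=lambda entry: (entry[0], entry[1]))[2]


def schedule(tasks: List[Task], nodes: List[Node]) -> Dict[str, Optional[str]]:
    """Place tasks in earliest-deadline-first order (an assumed policy)."""
    placement: Dict[str, Optional[str]] = {}
    for task in sorted(tasks, key=lambda t: t.deadline):
        node = pick_node(task, nodes)
        if node is None:
            placement[task.name] = None  # no node satisfies both constraints
            continue
        node.busy_until += task.work / node.speed
        placement[task.name] = node.name
    return placement
```

In this sketch the constraints act as hard filters and the objectives only rank the remaining candidates. A real scheduler could instead weight the objectives or relax a constraint when no node is feasible.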