Abstract
The amount of digital data has grown enormously, and numerous software frameworks have been developed to process it. Hadoop MapReduce is one such popular framework, which processes large data sets on commodity hardware. The job scheduler is a key component of Hadoop, responsible for assigning tasks to nodes. The existing MapReduce scheduler assigns tasks to nodes without considering node heterogeneity, workload type, or the amount of available resources. This leads to nodes being overburdened by one type of job and reduces overall throughput. In this paper, we propose a new scheduler that captures each node's resource status after every heartbeat, classifies jobs into two types, CPU bound and IO bound, and assigns each task to the node with the lower CPU/IO utilization. Experimental results show an improvement of 15–20 % on a heterogeneous cluster and around 10 % on a homogeneous cluster relative to the native Hadoop scheduler.
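The scheduling idea summarized in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation, and it does not use any Hadoop API: the names `NodeStatus`, `classify_job`, and `ResourceAwareScheduler` are hypothetical, the utilization fields are assumed to be fractions between 0 and 1, and the majority rule used to label a job is an assumption, since the abstract does not state the classification criterion.

```python
from dataclasses import dataclass


@dataclass
class NodeStatus:
    """Latest resource snapshot reported by a node (hypothetical fields)."""
    name: str
    cpu_util: float  # fraction of CPU in use, 0.0 to 1.0
    io_util: float   # fraction of I/O capacity in use, 0.0 to 1.0


def classify_job(cpu_seconds: float, io_seconds: float) -> str:
    """Label a job "cpu" or "io" from sampled task timings.

    Assumption: a simple majority rule stands in for the paper's
    criterion, which the abstract does not describe.
    """
    return "cpu" if cpu_seconds >= io_seconds else "io"


class ResourceAwareScheduler:
    def __init__(self) -> None:
        # Maps node name to its most recent NodeStatus.
        self.nodes = {}

    def on_heartbeat(self, status: NodeStatus) -> None:
        """Overwrite the stored status each time a node reports in."""
        self.nodes[status.name] = status

    def pick_node(self, job_type: str) -> str:
        """Return the node least loaded on the resource the job stresses."""
        if not self.nodes:
            raise RuntimeError("no heartbeats received yet")
        if job_type == "cpu":
            load = lambda n: n.cpu_util
        else:
            load = lambda n: n.io_util
        return min(self.nodes.values(), key=load).name


# Usage: two nodes report in, then one job of each type is placed.
scheduler = ResourceAwareScheduler()
scheduler.on_heartbeat(NodeStatus("node-a", cpu_util=0.80, io_util=0.20))
scheduler.on_heartbeat(NodeStatus("node-b", cpu_util=0.30, io_util=0.70))
print(scheduler.pick_node(classify_job(cpu_seconds=9.0, io_seconds=2.0)))
print(scheduler.pick_node(classify_job(cpu_seconds=1.0, io_seconds=6.0)))
```

In this sketch the CPU-bound job goes to the node with the lower CPU utilization and the IO-bound job to the node with the lower IO utilization, so neither node receives both kinds of pressure at once. A real scheduler would also have to account for free task slots and data locality, which are left out here.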
Copyright information
© 2015 Springer India
About this paper
Cite this paper
Mude, R.G., Betta, A., Debbarma, A. (2015). Capturing Node Resource Status and Classifying Workload for Map Reduce Resource Aware Scheduler. In: Jain, L., Patnaik, S., Ichalkaranje, N. (eds) Intelligent Computing, Communication and Devices. Advances in Intelligent Systems and Computing, vol 309. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2009-1_29
DOI: https://doi.org/10.1007/978-81-322-2009-1_29
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-2008-4
Online ISBN: 978-81-322-2009-1
eBook Packages: Engineering, Engineering (R0)