Capturing Node Resource Status and Classifying Workload for Map Reduce Resource Aware Scheduler

  • Conference paper
Intelligent Computing, Communication and Devices

Abstract

The amount of digital data has grown enormously, and numerous software frameworks have been developed to process it. Hadoop MapReduce is one such popular framework, which processes large data sets on commodity hardware. The job scheduler is a key component of Hadoop and is responsible for assigning tasks to nodes. The existing MapReduce scheduler assigns tasks to nodes without considering node heterogeneity, workload type, or the amount of available resources. This leads to a node being overburdened by one type of job and reduces overall throughput. In this paper, we propose a new scheduler that captures the node resource status after every heartbeat, classifies jobs into two types, CPU bound and IO bound, and assigns each task to the node with the lower CPU or IO utilization. The experimental results show an improvement of 15–20 % on a heterogeneous cluster and around 10 % on a homogeneous cluster with respect to the Hadoop native scheduler.
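The scheduling idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how jobs are classified or which metrics a heartbeat carries, so the class names (`NodeStatus`, `ResourceAwareScheduler`), the `free_slots` field, and the classification rule (comparing CPU time with IO wait time) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class JobType(Enum):
    CPU_BOUND = "cpu"
    IO_BOUND = "io"


@dataclass
class NodeStatus:
    """Resource snapshot reported by a node (hypothetical fields)."""
    cpu_util: float   # CPU utilization as a fraction in [0, 1]
    io_util: float    # IO utilization as a fraction in [0, 1]
    free_slots: int   # number of task slots currently free


class ResourceAwareScheduler:
    def __init__(self) -> None:
        self.nodes: Dict[str, NodeStatus] = {}

    def on_heartbeat(self, node_id: str, status: NodeStatus) -> None:
        # Overwrite the stored snapshot each time a node reports in.
        self.nodes[node_id] = status

    @staticmethod
    def classify(cpu_seconds: float, io_wait_seconds: float) -> JobType:
        # Assumed rule: a job is CPU bound if it spent at least as much
        # time computing as waiting on IO. The paper may use another test.
        if cpu_seconds >= io_wait_seconds:
            return JobType.CPU_BOUND
        return JobType.IO_BOUND

    def pick_node(self, job_type: JobType) -> Optional[str]:
        # Consider only nodes that can accept another task.
        candidates = {n: s for n, s in self.nodes.items() if s.free_slots > 0}
        if not candidates:
            return None
        if job_type is JobType.CPU_BOUND:
            return min(candidates, key=lambda n: candidates[n].cpu_util)
        return min(candidates, key=lambda n: candidates[n].io_util)
```

For example, if node "a" reports high CPU and low IO utilization while node "b" reports the opposite, `pick_node(JobType.CPU_BOUND)` selects "b" and `pick_node(JobType.IO_BOUND)` selects "a", so the two workload types are steered toward the resource each node has to spare.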

Corresponding author

Correspondence to Ravi G. Mude.

Copyright information

© 2015 Springer India

About this paper

Cite this paper

Mude, R.G., Betta, A., Debbarma, A. (2015). Capturing Node Resource Status and Classifying Workload for Map Reduce Resource Aware Scheduler. In: Jain, L., Patnaik, S., Ichalkaranje, N. (eds) Intelligent Computing, Communication and Devices. Advances in Intelligent Systems and Computing, vol 309. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2009-1_29

  • DOI: https://doi.org/10.1007/978-81-322-2009-1_29

  • Publisher Name: Springer, New Delhi

  • Print ISBN: 978-81-322-2008-4

  • Online ISBN: 978-81-322-2009-1

  • eBook Packages: Engineering, Engineering (R0)
