An Efficient MapReduce Framework for Intel MIC Cluster

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9243)

Abstract

MapReduce is a distributed programming framework for processing large-scale data sets by scaling out across clusters. However, scaling up a single node is better than a scale-out solution because it incurs less communication overhead. Because Intel MIC offers higher performance than an ordinary CPU, we propose an efficient MapReduce framework for Intel MIC clusters. Our framework provides several new features, such as a fault-tolerant mechanism for MIC management, efficient buffer management in MIC memory, and asynchronous task transfer between CPU and MIC. It can manage a large-scale MIC cluster and run applications in a MapReduce-like way. The experimental results show that our system is up to 1.35x and 6.8x faster than Hadoop on an ordinary CPU cluster.
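For readers unfamiliar with the programming model the abstract refers to, the following is a minimal single-process sketch of the generic map, shuffle, and reduce phases, using word count as the example. It is not the framework proposed in the paper: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and the MIC-specific features (fault tolerance, buffer management, asynchronous CPU-to-MIC task transfer) are not modeled.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group all intermediate values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values that share one key."""
    return key, sum(values)


def word_count(chunks):
    """Run the three phases sequentially (hypothetical driver, one process)."""
    intermediate = []
    for chunk in chunks:  # in a real cluster, each chunk goes to a separate worker
        intermediate.extend(map_phase(chunk))
    grouped = shuffle(intermediate)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["MIC cluster", "CPU cluster"]))
```

In a distributed setting the map calls run in parallel on different nodes and the shuffle step moves data across the network, which is the communication overhead the abstract contrasts with scaling up a single node.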

Author information

Corresponding author

Correspondence to Wenzhu Wang.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Wang, W., Wu, Q., Tan, Y., Zhang, Y. (2015). An Efficient MapReduce Framework for Intel MIC Cluster. In: He, X., et al. Intelligence Science and Big Data Engineering. Big Data and Machine Learning Techniques. IScIDE 2015. Lecture Notes in Computer Science, vol 9243. Springer, Cham. https://doi.org/10.1007/978-3-319-23862-3_13

  • DOI: https://doi.org/10.1007/978-3-319-23862-3_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23861-6

  • Online ISBN: 978-3-319-23862-3

  • eBook Packages: Computer Science, Computer Science (R0)
