
Parallel Computing Framework Based on MapReduce and GPU Clusters

Authors:
Chunlei Xu

State Grid Jiangsu Electric Power Co., Ltd, China, Nanjing, China

Weijin Zhuang

China Electric Power Research Institute (Nanjing), Nanjing, China


CSAE '20: Proceedings of the 4th International Conference on Computer Science and Application Engineering
October 2020, Article No.: 73, Pages 1–5
https://doi.org/10.1145/3424978.3425051

Published: 20 October 2020


ABSTRACT

In recent years, advances in hardware have rapidly improved both the computing power and the programmability of GPUs. Because they are highly parallel, GPUs are no longer limited to everyday graphics processing and are increasingly used for high-performance general-purpose computing. One of the active topics in high-performance parallel computing is MapReduce, a framework for processing massive data sets. With it, clusters of inexpensive commodity computers can provide large-scale data processing capabilities that were previously available only on expensive large servers. However, most existing MapReduce systems run on CPU clusters, where the computing performance of a single node is limited. This paper therefore proposes a parallel computing framework based on GPU clusters and MapReduce and validates its effectiveness experimentally. The experiments show that the framework can complete the work and that it achieves a significant speedup for large-scale applications.
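
For readers unfamiliar with the programming model the abstract builds on, the following is a minimal, sequential sketch of the generic map, shuffle, and reduce phases, using word counting as the example job. It is not the framework proposed in the paper: all function names here are hypothetical, everything runs on a single CPU, and no GPU is involved. In a GPU-cluster setting, the per-record work inside the map and reduce phases is the part one would expect to be offloaded to GPU kernels, but how the paper does this is not described in the abstract.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical helper names; this illustrates the generic MapReduce
# pattern only, not the authors' GPU-cluster framework.

def map_phase(records: Iterable[str],
              mapper: Callable[[str], List[Tuple[str, int]]]) -> List[Tuple[str, int]]:
    """Apply the mapper to every input record and collect all key/value pairs."""
    pairs: List[Tuple[str, int]] = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs

def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group intermediate values by key, as the shuffle step does between nodes."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)

def reduce_phase(groups: Dict[str, List[int]],
                 reducer: Callable[[str, List[int]], int]) -> Dict[str, int]:
    """Apply the reducer to each key and its list of grouped values."""
    return {key: reducer(key, values) for key, values in groups.items()}

def word_count_mapper(line: str) -> List[Tuple[str, int]]:
    """Emit (word, 1) for every whitespace-separated word, lowercased."""
    return [(word.lower(), 1) for word in line.split()]

def word_count_reducer(key: str, values: List[int]) -> int:
    """Sum the partial counts for one word."""
    return sum(values)

def run_word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Chain the three phases for the word-count job."""
    pairs = map_phase(lines, word_count_mapper)
    groups = shuffle_phase(pairs)
    return reduce_phase(groups, word_count_reducer)

if __name__ == "__main__":
    print(run_word_count(["GPU cluster", "gpu MapReduce"]))
```

Because each mapper call depends only on its own record and each reducer call only on its own key, both phases are data-parallel, which is what makes the model a natural candidate for distribution across cluster nodes and, within a node, across GPU threads.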


Index Terms

Parallel Computing Framework Based on MapReduce and GPU Clusters
1. Computing methodologies
  1. Distributed computing methodologies
    1. Distributed algorithms
      1. MapReduce algorithms
  2. Parallel computing methodologies
    1. Parallel algorithms


Published in
CSAE '20: Proceedings of the 4th International Conference on Computer Science and Application Engineering
October 2020
1038 pages
ISBN:9781450377720
DOI:10.1145/3424978
Conference Chair:
Ali Emrouznejad,
Program Chair:
Jui-Sheng Rayson Chou
Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
GPU clusters
MapReduce
Parallel computing framework
Qualifiers
- research-article
- Research
- Refereed limited
Conference

Acceptance Rates
CSAE '20 Paper Acceptance Rate: 179 of 387 submissions, 46%
Overall Acceptance Rate: 368 of 770 submissions, 48%
