A parallel Bees Algorithm implementation on GPU

https://doi.org/10.1016/j.sysarc.2013.09.007

Abstract

The Bees Algorithm is a population-based, computation-bound method inspired by the natural behavior of honey bees that finds a near-optimal solution to a search problem. Recently, many parallel swarm-based algorithms have been developed to run on the GPU (Graphics Processing Unit), so developing a parallel Bees Algorithm for the GPU has become important. In this paper, we extend the Bees Algorithm to run on CUDA (Compute Unified Device Architecture); we call the result CUBA (CUDA-based Bees Algorithm). We evaluate the performance of CUBA through experiments on a number of well-known optimization problems. The results show that CUBA significantly outperforms the standard Bees Algorithm on these problems.

Introduction

Finding an optimal solution to a search problem has become an important research question [21], [22], [23]. Swarm intelligence [4], which models the collective behavior of social animals in nature, is increasingly used to find near-optimal solutions. Swarm-based optimization algorithms (SOAs) drive a search towards the optimal solution. Various algorithms, such as Ant Colony Optimization (ACO) proposed by Marco Dorigo [1], the Genetic Algorithm (GA) [24], Particle Swarm Optimization (PSO) [3] developed by Kennedy, the Artificial Bee Colony (ABC) algorithm proposed by D. Karaboga [5], and the Bees Algorithm proposed by D.T. Pham [2], model the behavior of swarms of socially organized animals. Self-organization is a feature of such systems in which a global-level response emerges from many low-level interactions.

Among the SOAs, ACO is a non-greedy population-based algorithm that emulates the behavior of real ants. The GA is based on natural selection and genetic recombination; it efficiently exploits historical information to speculate on new search areas with improved performance. PSO is an optimization procedure based on the social behavior of groups of organisms. ABC is another optimization algorithm, inspired by the intelligent behavior of honey bee swarms. The Bees Algorithm (BA) [2] is also a population-based method for solving optimization problems and is inspired by the behavior of honey bees [2], [6]. The algorithm performs a kind of neighborhood search combined with random search and can be used for both combinatorial optimization [25], [26] and functional optimization [2]. Building on the BA, researchers have developed several real-world applications, including data mining [7], robot control [8], electronic engineering [9], job scheduling [10], e-testing [35], and task allocation [36].
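The combination of neighborhood search and random search described above can be illustrated with a minimal sequential sketch. It is not the authors' implementation: the parameter names (`n` scout bees, `m` selected sites, `e` elite sites, `nep` and `nsp` recruits, `ngh` neighborhood size) follow common descriptions of the Bees Algorithm, and the `sphere` objective, the default values, and the fixed neighborhood size are assumptions chosen only for illustration.

```python
import random


def sphere(x):
    """Toy objective (an assumption for this sketch): minimum 0 at the origin."""
    return sum(v * v for v in x)


def bees_algorithm(f, dim, lo, hi, n=20, m=5, e=2, nep=7, nsp=3,
                   ngh=0.5, iters=100, seed=0):
    """Minimize f over [lo, hi]^dim with a basic Bees Algorithm loop."""
    rng = random.Random(seed)

    def rand_point():
        return [rng.uniform(lo, hi) for _ in range(dim)]

    # Scout bees start at random positions; sites are kept sorted by fitness.
    sites = sorted((rand_point() for _ in range(n)), key=f)
    for _ in range(iters):
        new_sites = []
        # Neighborhood search: recruit bees around the m best sites, with
        # more recruits (nep) for the e elite sites than for the rest (nsp).
        for rank, site in enumerate(sites[:m]):
            recruits = nep if rank < e else nsp
            best = site
            for _ in range(recruits):
                cand = [min(hi, max(lo, v + rng.uniform(-ngh, ngh)))
                        for v in site]
                if f(cand) < f(best):
                    best = cand
            new_sites.append(best)
        # Random search: the remaining n - m bees scout new random positions.
        new_sites.extend(rand_point() for _ in range(n - m))
        sites = sorted(new_sites, key=f)
    return sites[0], f(sites[0])


best, value = bees_algorithm(sphere, dim=2, lo=-5.0, hi=5.0)
print(best, value)
```

Because each selected site is replaced only by a strictly better neighbor, the best fitness never gets worse from one iteration to the next; the random scouts are what allow the search to escape a poor region.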

Swarm-based optimization algorithms have been widely used to solve search problems, and parallelization is often applied to them to improve performance. Examples include parallel implementations of ant colony optimization [27], [28], the parallel genetic algorithm (PGA) [29], [30], parallel global optimization with the particle swarm algorithm [31], the parallel Bees Algorithm (PBA) [18], and the parallel artificial bee colony (PABC) algorithm [32]. These parallelization strategies not only avoid degrading the quality of the solutions obtained but also achieve substantial speedup. In [28], the authors discussed parallelization strategies for Ant Colony Optimization algorithms and empirically tested the simplest strategy, that of executing parallel independent runs of an algorithm. In [30], the PGA uses a mixed strategy in which subpopulations try to locate good local minima. In [31], parallel PSO performance was evaluated using two categories of optimization problems possessing multiple local minima: large-scale analytical test problems with computationally cheap function evaluations, and medium-scale biomechanical system identification problems with computationally expensive function evaluations. The authors of [32] presented a parallel version of the ABC algorithm for shared memory architectures, in which the entire colony of bees was divided equally among the available processors.

A Graphics Processing Unit (GPU) is a fast, highly parallel processor. Each multiprocessor contains many stream processors, and each stream processor is the smallest computational unit. The stream processors in a multiprocessor share a block of memory through which they can communicate with each other. The GPU can accelerate computations and applications running on the CPU by taking over the compute-intensive parts of the code. NVIDIA [11], [12] provides CUDA, a general-purpose parallel programming model, so programmers do not need to deal with the complex low-level details of the GPU. CUDA also supports many graphics programming APIs (Application Programming Interfaces), which further spares developers from low-level complexity [11], [12]. Many algorithms and applications have been implemented on the GPU to obtain better performance. Much work on parallel swarm intelligence algorithms has been done on the GPU, including Ant Colony Optimization [13], [14], [15], the Genetic Algorithm [16], [17], and Particle Swarm Optimization [33], [37]. These GPU-based implementations of swarm-based optimization algorithms have shown that the GPU can significantly improve the performance of the algorithms.

To the best of our knowledge, only the authors of [18] have adopted a Parallel Bees Algorithm (PBA), using it to simultaneously search for the location, size, and type of FACTS devices in order to enhance ATC between source and sink areas. That work targets a specific application rather than the general problem of finding a near-optimal solution to a search problem. In addition, the authors of [34] used a different hardware platform (FPGA) to implement the ABC algorithm. Therefore, no attention has yet been paid to implementing a parallel Bees Algorithm on the GPU. To significantly improve the performance of PBA, the objective of this paper is to design and implement a novel Parallel Bees Algorithm running on the GPU. We choose the CUDA framework to implement our multi-colony Bees Algorithm on the GPU, called CUBA, and design a new parallel multi-colony Bees Algorithm that achieves good efficiency. In the proposed algorithm, we group the threads within a block into several colonies. Each thread is assigned to a honey bee that searches for a solution on behalf of its colony. The proposed algorithm divides a block into different colonies by thread ID and runs the Bees Algorithm independently in each colony. We evaluate the performance of CUBA through experiments on a number of well-known optimization problems. The results show that CUBA significantly outperforms the traditional BA on these problems.
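One way to picture the division of a block into colonies by thread ID is the index arithmetic below. This is a host-side sketch under stated assumptions, not the paper's kernel: the function name `assign_colonies`, the contiguous partition by integer division, and the requirement that the block size be a multiple of the colony count are all hypothetical. In a CUDA kernel, the thread index `tid` would typically come from `threadIdx.x`.

```python
def assign_colonies(block_size, colonies_per_block):
    """Map each thread index in a block to (colony, bee-within-colony).

    Assumes a contiguous partition: threads 0..k-1 form colony 0,
    threads k..2k-1 form colony 1, and so on.
    """
    if block_size % colonies_per_block != 0:
        raise ValueError("block_size must be a multiple of colonies_per_block")
    bees_per_colony = block_size // colonies_per_block
    return [(tid // bees_per_colony, tid % bees_per_colony)
            for tid in range(block_size)]


mapping = assign_colonies(block_size=8, colonies_per_block=2)
print(mapping)
# [(0, 0), (0, 1), (0, 2), (0, 3), (1, 0), (1, 1), (1, 2), (1, 3)]
```

With a mapping of this kind, every thread can work out which colony it belongs to from its own index alone, so the colonies can run independently without a lookup table.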

The rest of the paper is organized as follows. Section 2 reviews the background of the Bees Algorithm and surveys related work on GPU-based implementations of swarm-based optimization algorithms. Section 3 describes how to parallelize the Bees Algorithm to run on the GPU. Section 4 presents and discusses the results of our experiments, including a comparison of the Bees Algorithm and the CUDA-based Bees Algorithm. Finally, Section 5 gives concluding remarks and future work.

Section snippets

Bee colony optimization

There are various natural systems (e.g., social insect colonies such as ant colonies and bee colonies) in which simple individual organisms can create systems that are able to perform highly complex tasks by dynamically interacting with each other. In general, a honey bee colony consists of three kinds of adult bees: workers, drones, and a queen. Although each member of the honey bee colony has a definite task to perform, a lot of worker bees need to cooperate to complete

Parallel Bees Algorithm on GPU

The key factor determining the speedup is the level of parallelization. In the standard Bees Algorithm, most of the computational load is in the neighbourhood search procedure. A naïve method is to take the neighbourhood search procedure as a kernel and distribute the computations in the loop of the procedure. In fact, the optimal neighbourhood size varies with the features of the function being optimized. However, if the size of the neighbourhood is not larger than the number of

Experimental results and analysis

To evaluate the performance of the proposed algorithm, we conducted experiments comparing the execution time with CUDA on the GPU against the execution time with C++ on the CPU to verify the efficiency of the algorithm. The configuration of the evaluation platform is shown in Table 1 and described as follows. We adopt an AMD Athlon(tm) II and a GeForce GTX 460 as our computation platform. The host is an AMD Athlon(tm) II with 4 cores, each with a clock rate of 3.0 GHz. The

Conclusion and future work

In this paper, we have proposed a Parallel Bees Algorithm based on CUDA. We modify the local search procedure. Because the algorithm runs on a SIMT (Single Instruction Multiple Threads) hardware architecture, we merge the two parts of the local search sites to avoid wasting the computing power of the GPU. For the same reason, we have no site abandonment procedure. In addition, we let the bees recruited to different sites maintain their own ngh, meaning that the ngh values shrink independently. We sort the colonies in the same

Acknowledgements

We are grateful for the many excellent comments and suggestions made by the anonymous referees. This work was supported in part by the National Science Council of the Republic of China under Grant no. NSC 101-2221-E-009-034-MY2.

Sheng-Kai Huang received his Master's degree from the Institute of Computer Science and Engineering of National Chiao Tung University in 2012. He is now with Telecommunication Laboratories, Chunghwa Telecom Co., Ltd. His research interests are in Parallel, Cloud and Mobile Computing.

References (40)

  • Audrey Delévacq et al., Parallel ant colony optimization on graphics processing units, Journal of Parallel and Distributed Computing (January 2013)
  • H. Mühlenbein et al., The parallel genetic algorithm as function optimizer, Parallel Computing (1991)
  • Luca Mussi et al., Evaluation of parallel particle swarm optimization algorithms within the CUDA architecture, Information Sciences (2011)
  • M. Dorigo, Optimization, Learning and Natural Algorithms, Ph.D. thesis, Politecnico di Milano, Italie,...
  • D.T. Pham, E. Koc, A. Ghanbarzadeh, S. Otri, S. Rahim, M. Zaidi, The Bees Algorithm–a novel tool for complex...
  • J. Kennedy, R. Eberhart, Particle swarm optimization, in: Proceedings of IEEE International Conference on Neural...
  • E. Bonabeau et al., Swarm Intelligence: From Natural to Artificial Systems (1999)
  • D. Karaboga et al., A powerful and efficient algorithm for numerical function optimization: artificial bee colony (ABC) algorithm, Global Optimization (2007)
  • D.T. Pham et al., The Bees Algorithm: modelling foraging behaviour to solve continuous optimization problems, Proceeding of Institute Mechanical Engineering, C: Journal of Mechanical Engineering and Science (2009)
  • D.T. Pham, S. Otri, A. Afify, M. Mahmuddin, H. Al-Jabbouli, Data clustering using the Bees Algorithm, in: Proceedings...
  • D.T. Pham, A.H. Darwish, E.E. Eldukhri, S. Otri, Using the Bees Algorithm to tune a fuzzy logic controller for a robot...
  • K. Guney et al., Bees Algorithm for design of dual-beam linear antenna arrays with digital attenuators and digital phase shifters, International Journal of RF and Microwave Computer-Aided Engineering (2008)
  • D.T. Pham, E. Koc, J.Y. Lee, J. Phrueksanant, Using the Bees Algorithm to schedule jobs for a machine, in: Proc Eighth...
  • NVIDIA CUDA Programming Guide Version 4.2: NVIDIA Corporation,...
  • NVIDIA CUDA Best Practices Guide, 4.2 edition, NVIDIA Corporation,...
  • Jianming Li, Xiangpei Hu, Zhanlong Pang, Kunming Qian, A parallel Ant colony optimization algorithm based on...
  • José M. Cecilia, José M. García, Andy Nisbet, Martyn Amos, Manuel Ujaldón, Enhancing data parallelism for Ant Colony...
  • W.B. Langdon, Graphics processing units and genetic programming: an overview, Soft Computing (August 2011)
  • Petr Pospichal et al., Parallel genetic algorithm on the CUDA architecture, Lecture Notes in Computer Science (2010)
  • A.K.R. Mohamad Idris, M.W. Mustafa, A Parallel Bees Algorithm for ATC enhancement in modern electrical network, in:...

    Guo-Heng Luo received his Master Degree from Institute of Computer Science and Engineering of National Chiao Tung University in 2009. He is now a Ph.D. student with the Institute of Computer Science and Engineering, National Chiao Tung University. His research interests are in Web 2.0, Parallel and Cloud Computing.

    Yue-Shan Chang received his PhD degree in Computer and Information Science from National Chiao Tung University in 2001. He joined the Department of Electronic Engineering of the Ming Hsing University of Science and Technology in August 1992. In August 2004, he joined the Department of Computer Science and Information Engineering, National Taipei University, Taipei County, Taiwan. Since August 2010, he has been a Professor. His research interests are in distributed systems, web service composition, information retrieval, mobile computing and grid computing.

    Shyan-Ming Yuan received his BSEE degree from National Taiwan University in 1981, his MS degree in Computer Science from the University of Maryland, Baltimore County in 1985, and his PhD degree in Computer Science from the University of Maryland, College Park in 1989. Dr. Yuan joined the Electronics Research and Service Organization, Industrial Technology Research Institute as a Research Member in October 1989. Since September 1990, he has been with the Department of Computer and Information Science, National Chiao Tung University, Hsinchu, Taiwan, first as an Associate Professor and, since June 1995, as a Professor. His current research interests include Distributed Objects, Internet Technologies, and Software System Integration. Dr. Yuan is a member of ACM and IEEE.
