Static resource allocation for heterogeneous computing environments with tasks having dependencies, priorities, deadlines, and multiple versions
Introduction
Mixed-machine heterogeneous computing (HC) environments utilize a distributed suite of different machines, interconnected with high-speed links, to perform collections of different tasks with diverse computational requirements (e.g., [18], [34], [44]). Such an environment coordinates the execution of tasks on machines within the system to exploit different capabilities among machines to attempt optimizing some performance metric [30].
Across the machines in an HC suite, the individual processors can vary by many factors, including age, CPU instruction sets, clock speeds, cache structures, cache sizes, memory sizes, and communication bandwidths. A cluster composed of different types (or models) of machines also constitutes an HC system. Alternatively, a cluster could be treated as a single machine in an HC suite. An HC system could also be part of a local area grid within an institution [16].
The act of assigning (matching) each task to a machine, ordering (scheduling) the execution of the tasks on each machine, and scheduling the communications among machines is key to coordinating and exploiting an HC system to its fullest extent. This matching and scheduling is variously referred to in the literature as mapping, resource allocation, and resource management; we will refer to this process as mapping. The general mapping problem has been shown to be NP-complete (e.g., [10], [24]). A variety of heuristic techniques can be considered for different environments [1].
This study considers a collection of task characteristics found in anticipated military environments (e.g., AICE [37], [51], HiPer-D [54], and MSHN [21]); these characteristics make the HC mapping problem quite complex. Therefore, intelligent, efficient heuristics are sought for this recalcitrant problem.
In this study, tasks may be atomic or decomposable. Atomic tasks have no internal communications. Decomposable tasks consist of two or more communicating subtasks. Subtasks have data dependencies, but can be mapped to different machines. If there are communicating subtasks within a task, inter-machine data transfers may need to be performed when multiple machines are used.
Tasks in this study have deadlines, priorities, and multiple versions. The deadlines are hard deadlines. The priority levels have associated weights that quantify their relative importance. The multiple versions of a task provide alternative ways of executing it; these alternatives are of different preferences to the system users and have different resource requirements.
The performance measure is the sum of the weighted priorities of the tasks that complete by their deadlines, adjusted based on the version of each task executed. Simulations and performance bound calculations are used to evaluate and compare several heuristic techniques in several overloaded system scenarios, i.e., scenarios where not all tasks can complete by their deadlines.
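The performance measure described above can be sketched in a few lines of code. The function and field names below, and the example numbers, are illustrative assumptions, not taken from the paper:

```python
# Illustrative sketch of the performance measure: the sum of the weighted
# priorities of tasks that finish by their hard deadlines, scaled by a
# preference factor for the task version that was run. All names and the
# example values are hypothetical.

def weighted_value(tasks):
    """tasks: list of dicts with keys
       'finish', 'deadline', 'priority_weight', 'version_factor'."""
    total = 0.0
    for t in tasks:
        if t['finish'] <= t['deadline']:   # hard deadline: a late task counts zero
            total += t['priority_weight'] * t['version_factor']
    return total

tasks = [
    {'finish': 10, 'deadline': 12, 'priority_weight': 8, 'version_factor': 1.0},
    {'finish': 15, 'deadline': 12, 'priority_weight': 4, 'version_factor': 1.0},  # misses deadline
    {'finish': 9,  'deadline': 20, 'priority_weight': 2, 'version_factor': 0.5},  # less-preferred version
]
print(weighted_value(tasks))  # 8*1.0 + 2*0.5 = 9.0
```

Note how the second task contributes nothing (hard deadline missed) and the third contributes only half of its priority weight because a less-preferred version was run.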
Heuristics developed to perform mapping are often difficult to compare because of different underlying assumptions in the original study of each heuristic [1]. In [9], eleven heuristic techniques are investigated and directly compared for static mappings in a simpler environment. The much simpler environment in the [9] study has the goal of minimizing the total execution time of a set of independent tasks. These tasks do not have deadlines, priorities, versions, or inter-subtask communications. The eleven techniques cover a wide range of methods from the resource management and optimization literature. They are: Opportunistic Load Balancing [4], [17], [18], Minimum Execution Time [4], [17], Minimum Completion Time [4], Min–min [4], [17], [24], Max-min [4], [17], [24], Duplex [4], [17], Genetic Algorithm [22], [36], [53], Simulated Annealing [29], [36], [39], Genetic Simulated Annealing [43], Tabu Search [20], [36], and A* [26], [36], [39], [42].
Based on the results of that study, three static (off-line) techniques are selected, adapted, and applied to this more complex mapping problem: multiple variations of a two phase greedy method (based on the concept of Min–min), a standard genetic algorithm (GA), and the GENITOR approach [55] (a variation of the GA). Simulation studies are used to compare these heuristics in several different overloaded scenarios, i.e., not all tasks will be able to complete by their deadlines.
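The Min–min concept underlying the two phase greedy method can be sketched for the simpler independent-task setting of [9] (no deadlines, priorities, or versions). The data structure and the example ETC values below are hypothetical:

```python
# Minimal Min-min sketch for independent tasks (the simpler setting of [9]):
# phase 1: for each unmapped task, find its minimum-completion-time machine;
# phase 2: among those task/machine pairs, map the task with the overall
# minimum completion time, update that machine's availability, and repeat.

def min_min(etc):
    """etc[t][m] = estimated time to compute task t on machine m."""
    n_tasks, n_machines = len(etc), len(etc[0])
    avail = [0.0] * n_machines          # machine-available times
    unmapped = set(range(n_tasks))
    mapping = {}
    while unmapped:
        # phase 1: best machine (by completion time) for each unmapped task
        best = {t: min(range(n_machines), key=lambda m: avail[m] + etc[t][m])
                for t in unmapped}
        # phase 2: the task whose minimum completion time is smallest overall
        t = min(unmapped, key=lambda t: avail[best[t]] + etc[t][best[t]])
        m = best[t]
        avail[m] += etc[t][m]
        mapping[t] = m
        unmapped.remove(t)
    return mapping, max(avail)          # mapping and makespan

etc = [[4.0, 8.0], [3.0, 2.0], [3.0, 6.0]]   # hypothetical 3 tasks x 2 machines
mapping, makespan = min_min(etc)
```

Running this toy case maps task 1 to machine 1 first (completion time 2.0), then tasks 2 and 0 to machine 0, giving a makespan of 7.0.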
This research considers a type of environment, which may be found in certain military and commercial situations, where the information needed to execute tasks arrives at predetermined times, e.g., when satellites are positioned appropriately to generate images of a given region or when scheduled weather updates become available [49], [50]. This leads to several questions, such as: (a) whether a given heterogeneous suite of machines can complete all of its tasks by their deadlines, and (b) if machine failures leave the system overloaded, which tasks (and which version of each of these tasks) to execute, based on their priorities. Because the arrival times of the needed data are known a priori, the times at which the tasks can begin to execute (i.e., the tasks arrive) are also known a priori. This research focuses on part (b), and designs heuristics that select tasks and their versions, based on priorities and deadlines, to maximize the performance measure stated earlier. By conducting simulation studies with different subsets of the intended full machine suite, we can determine how much performance degradation results from any particular machine failure. The question in (a) can be answered using the same heuristic techniques when all machines are considered operational.
The research in our paper makes the following contributions:
- •
A new HC paradigm is used. Tasks have deadlines, priorities, multiple versions, and may have communicating subtasks. Multiple overloaded scenarios are considered, i.e., not all tasks meet their deadline.
- •
Methods for performance bound calculations are proposed to evaluate the performances of different heuristic techniques.
- •
Several heuristics are developed, adapted, and applied to this version of the mapping problem.
- •
Customized chromosome structures and operations are developed for the GA and GENITOR.
- •
The results show that GENITOR finds the best solutions, compared to the GA and the two phase greedy approach; however, the two phase greedy technique runs in less time.
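To make the contribution about customized chromosome structures concrete, one plausible encoding for this problem is a gene per task holding a (machine, version) assignment, with None for a dropped task. This is an illustrative sketch only; the chromosome structures and operators actually developed for the GA and GENITOR differ in detail:

```python
# Hypothetical chromosome encoding for the mapping problem (illustrative
# only, not the authors' actual structures): each gene assigns one task a
# (machine, version) pair, or None if the task is left unmapped.
import random

random.seed(0)
N_TASKS, N_MACHINES, N_VERSIONS = 5, 3, 2

def random_chromosome():
    """Random individual; some tasks may be left unmapped (None)."""
    return [(random.randrange(N_MACHINES), random.randrange(N_VERSIONS))
            if random.random() < 0.8 else None
            for _ in range(N_TASKS)]

def mutate(chrom, rate=0.2):
    """Reassign a fresh machine/version to a random subset of genes."""
    return [(random.randrange(N_MACHINES), random.randrange(N_VERSIONS))
            if random.random() < rate else g
            for g in chrom]

def crossover(a, b):
    """Single-point crossover on the task index."""
    cut = random.randrange(1, N_TASKS)
    return a[:cut] + b[cut:]

parent1, parent2 = random_chromosome(), random_chromosome()
child = mutate(crossover(parent1, parent2))
```

A fitness function for such a chromosome would evaluate the resulting schedule against the deadline/priority/version measure defined earlier.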
The remainder of this paper is structured as follows. Section 2 describes the details of the HC environment. The mapping heuristics are defined in Section 3. In Section 4, the results from the simulation studies are examined. Section 5 examines some of the literature related to this work. Section 6 summarizes this research.
Section snippets
Static mapping heuristics
Two different types of mapping are static and dynamic [1]. Static mapping is performed when tasks are mapped in an off-line planning phase, e.g., planning the mapping for tomorrow in a production environment (e.g., [9]). Static mapping techniques take a fixed set of applications, a fixed set of machines, and a fixed set of application and machine attributes as inputs and generate a single, fixed mapping (e.g., [1], [9], [13], [14], [32], [34], [41]). Dynamic mapping is performed when tasks are mapped in an on-line fashion, e.g., as they arrive at the system.
Greedy mapping heuristics
As a baseline for comparison, consider a greedy FIFO technique, referred to as the Minimum Current Fitness (MCF) technique. MCF considers m-tasks for mapping in ascending order of arrival time; it maps each m-task to the machine that can complete the best (lowest) version possible by that task's deadline. If no machine/version combination can complete the m-task before its deadline, the task is not mapped. Version coherency for a decomposable task is strictly enforced; that is, the version selected when the first subtask of a task is mapped is used for all of that task's subtasks.
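The MCF idea for atomic tasks can be sketched as follows. The data layout, version numbering, and example values are hypothetical assumptions for illustration:

```python
# Sketch of the Minimum Current Fitness (MCF) baseline for atomic tasks:
# consider tasks in ascending arrival order; for each, try versions from
# best (lowest number) to worst, and map the task to the machine that
# completes the best feasible version by the deadline; otherwise leave it
# unmapped. The task/machine data below are hypothetical.

def mcf(tasks, n_machines):
    """tasks: list of dicts with 'arrival', 'deadline', and
       'etc' = {version: [execution time on each machine]},
       where lower version numbers are more preferred."""
    avail = [0.0] * n_machines
    mapping = {}
    for t in sorted(range(len(tasks)), key=lambda j: tasks[j]['arrival']):
        task = tasks[t]
        for version in sorted(task['etc']):        # best version first
            # completion time per machine; a task cannot start before arrival
            finish = [max(avail[m], task['arrival']) + task['etc'][version][m]
                      for m in range(n_machines)]
            k = min(range(n_machines), key=lambda m: finish[m])
            if finish[k] <= task['deadline']:
                avail[k] = finish[k]
                mapping[t] = (k, version)
                break                              # best feasible version found
        # if no version/machine pair meets the deadline, the task stays unmapped
    return mapping

tasks = [
    {'arrival': 0, 'deadline': 5, 'etc': {0: [4, 6], 1: [2, 3]}},
    {'arrival': 1, 'deadline': 6, 'etc': {0: [3, 2]}},
]
result = mcf(tasks, n_machines=2)
```

In this toy case, task 0 fits its best version (version 0) on machine 0 by its deadline, and task 1 then maps to machine 1.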
Results from simulation studies
Three different HC mapping test case scenarios are examined: (1) highly-weighted priorities and high arrival rate, (2) lightly-weighted priorities and moderate arrival rate, and (3) lightly-weighted priorities and high arrival rate. Each result reported is the average of 50 different trials (i.e., matrices), with m-tasks, machines, and versions. The 95% confidence interval for each heuristic is shown at the top of each bar [25], and GENITOR is abbreviated as GEN.
The average
Related work
The work in [31] investigates dynamic resource allocation for tasks with priorities and multiple deadlines in an HC environment and compares the performance of mapping heuristics to two static mapping schemes, simulated annealing and a genetic algorithm. The problem differs from ours because the tasks in [31] (a) are independent (i.e., there is no inter-task communication), (b) have soft deadlines instead of one hard deadline, and (c) do not have multiple versions.
In [38], the authors propose a
Summary
This paper presents a new paradigm for an HC environment, where tasks have deadlines, priorities, multiple versions, and communicating subtasks. Two upper bounds, multiple variations of the two phase greedy technique (TPF), and two kinds of genetic algorithms are implemented and compared.
It is shown that the TPF approach performs very well, achieving 59% to 76% of the upper bound. The generational GA approach, seeded with TPF, improves these mappings by only 2% to 3%, achieving 61% to 78% of the upper bound. The GENITOR
Acknowledgments
The authors thank Shoukat Ali, Luis Diego Briceño, David Leon Janovy, Jong-Kook Kim, Loretta Vandenberg, Samee U. Khan, and Darrell Whitley for their comments. A preliminary version of portions of this paper was presented at the 16th International Parallel and Distributed Processing Symposium.
This research was supported in part by the NSF under grant CNS-0615170, by the DARPA/ITO Quorum Program under GSA subcontract number GS09K99BH0250, by a Purdue University Dean of Engineering Donnan
References (56)
- et al., Characterizing resource allocation heuristics for heterogeneous computing systems
- et al., Task allocation for maximizing reliability of distributed systems: A simulated annealing approach, Journal of Parallel and Distributed Computing (2006)
- et al., A comparison of eleven static heuristics for mapping a class of independent tasks onto heterogeneous distributed computing systems, Journal of Parallel and Distributed Computing (2001)
- et al., An integrated technique for task matching and scheduling onto distributed heterogeneous computing systems, Journal of Parallel and Distributed Computing (2002)
- et al., Scheduling of a meta-task with QoS requirements in heterogeneous computing systems, Journal of Parallel and Distributed Computing (2006)
- et al., Dynamically mapping tasks with priorities and multiple deadlines in a heterogeneous environment, Journal of Parallel and Distributed Computing (2007)
- et al., A semi-static approach to mapping dynamic iterative tasks onto heterogeneous computing systems, Journal of Parallel and Distributed Computing (2006)
- et al., A dynamic and reliability-driven scheduling algorithm for parallel real-time jobs executing on heterogeneous clusters, Journal of Parallel and Distributed Computing (2005)
- et al., Task matching and scheduling in heterogeneous computing environments using a genetic-algorithm-based approach, Parallel Evolutionary Computing, Journal of Parallel and Distributed Computing (1997)
- et al., Measuring the robustness of a resource allocation, IEEE Transactions on Parallel and Distributed Systems (2004)
- Representing task and machine heterogeneities for heterogeneous computing systems, Tamkang Journal of Science and Engineering
- The Grid: Blueprint for a New Computing Infrastructure
- Heterogeneous processing, IEEE Computer
- Distributed heterogeneous supercomputing management system, IEEE Computer
- Tabu Search
- Adaptation in Natural and Artificial Systems
- Heuristic algorithms for scheduling independent tasks on nonidentical processors, Journal of the ACM
Tracy Braun (CISSP, CCNA) received a Ph.D. in Electrical and Computer Engineering from Purdue University in 2001. Dr. Braun also received a Master’s Degree in Electrical and Computer Engineering from Purdue University in 1997. Dr. Braun graduated with Honors and High Distinction from the University of Iowa in 1995, with a Bachelor of Science degree in Electrical and Computer Engineering. Dr. Braun’s current research interests include computer security, information assurance, and reverse engineering.
Howard Jay Siegel was appointed the Abell Endowed Chair Distinguished Professor of Electrical and Computer Engineering at Colorado State University (CSU) in 2001, where he is also a Professor of Computer Science. He is the Director of the CSU Information Science and Technology Center (ISTeC), a university-wide organization for promoting, facilitating, and enhancing CSU's research, education, and outreach activities pertaining to the design and innovative application of computer, communication, and information systems. From 1976 to 2001, he was a professor at Purdue University. Prof. Siegel is a Fellow of the IEEE and a Fellow of the ACM. He received two B.S. degrees from the Massachusetts Institute of Technology (MIT), and the M.A., M.S.E., and Ph.D. degrees from Princeton University. He has co-authored over 340 technical papers. His research interests include heterogeneous parallel and distributed computing, parallel algorithms, and parallel machine interconnection networks. He was a Co-editor-in-Chief of the Journal of Parallel and Distributed Computing, and was on the Editorial Boards of both the IEEE Transactions on Parallel and Distributed Systems and the IEEE Transactions on Computers. He has been an international keynote speaker and tutorial lecturer, and has consulted for industry and government. For more information, please see www.engr.colostate.edu/~hj.
Anthony A. Maciejewski received the BSEE, MS, and PhD degrees from Ohio State University in 1982, 1984, and 1987. From 1988 to 2001, he was a professor of Electrical and Computer Engineering at Purdue University, West Lafayette. He is currently the Department Head of Electrical and Computer Engineering at Colorado State University. He is a Fellow of the IEEE. A complete vita is available at: www.engr.colostate.edu/~aam.
Ye Hong is pursuing his Ph.D. degree in Electrical and Computer Engineering at Colorado State University. He received his Master degree in Computer Science from Tsinghua University in 2006, and his Bachelor degree from Tsinghua University in 2002. His current research interests include parallel and distributed computing, heterogeneous computing, robust computer systems, and performance evaluation.