Abstract

Owing to budget limitations, small organizations cannot afford high-end supercomputers to solve highly complex tasks. P2P (peer-to-peer) grid computing is now used to break a complex task into subtasks that are solved on different grid resources. Workflows are used to represent these complex tasks. Finishing such a task in a P2P grid requires scheduling the subtasks of the workflow in an optimized manner, and several factors play a part in scheduling decisions. Genetic algorithms are well suited to scheduling DAG (directed acyclic graph) based tasks because they take multiple criteria into consideration while scheduling. In this paper, we propose a precedence level based genetic algorithm (PLBGSA), which yields schedules for workflows in a decentralized fashion. PLBGSA is compared with existing genetic algorithm based scheduling techniques. Fault tolerance is a desirable trait of a P2P grid scheduling algorithm because grid resources are unreliable, and PLBGSA handles faults efficiently.

1. Introduction

As the complexity of computational problems increases continuously, efficient use of computational resources becomes vital. Complex tasks cause performance bottlenecks throughout the technical arena. Organizations around the world use high-end computational devices, servers, and supercomputers to handle complex tasks. However, not all organizations are able to purchase such devices because of budget constraints. Grid computing has emerged as a way to solve highly complex tasks [1, 2]. A grid utilizes existing heterogeneous computational devices spread across multiple geographical locations [3]. This unification of computational resources yields a manifold increase in computational capabilities. Initially, researchers used scheduling algorithms based on a central scheduler to solve complex problems [4]. These techniques were effective in scheduling [5] complex tasks; however, they have many limitations. Failure of the central scheduler causes collapse of the entire grid [6], and the limited capabilities of the central scheduler give rise to scalability issues. Policies that vary from company to company and political issues also make the existence of a central scheduler problematic [6].

A metascheduler deals with the limitations of a central scheduler to some extent [7]. In metascheduling, every cluster has its own scheduler. DAG based tasks [8] are scheduled over the most capable cluster. A problem with global task scheduling arises when no cluster is capable of executing a complex computational task. The drawback of a metascheduler is that it cannot execute gigantic tasks using minuscule clusters and single computational resources spread across various geographical domains. P2P technologies [9] are effective enough to act as a decentralized grid scheduler. Decentralization makes the grid robust against grid node failures. Moreover, the structure of a P2P grid does not cause scalability issues and other bottlenecks. Further, complex problems are solved efficiently using genetic algorithms, and P2P [10] grids also use genetic algorithms to obtain good results [11]. Initially, to get results quickly, researchers scheduled independent gigantic tasks [11], generated on a single grid node, over P2P grid resources [12]. Parallel execution of such tasks over various P2P resources produced results quickly. DAG based tasks [13] require extra precision when scheduled over a grid. Intertask dependencies make it harder to schedule the subtasks of a DAG based task as efficiently as independent tasks. Researchers in [14] have used a genetic algorithm to schedule subtasks of a DAG based task [15]. The authors of [14] applied a genetic algorithm to find the schedule for all subtasks of a DAG based task in one go. The probability of finding nearly optimal results in this way decreases when the subtasks of the DAG are divided across various precedence levels.

Our approach applies a genetic algorithm to obtain the schedule for the subtasks of one precedence level at a time. If a precedence level contains a single subtask, we schedule that subtask on the P2P grid resource that returns results fastest. In this way, the subtasks of a DAG based task are scheduled over P2P grid resources one precedence level after another. The probability of finding a nearly optimal schedule is higher with the approach adopted in this paper.

The rest of the paper is organized as follows. The literature review, together with the background of genetic algorithms and their use in DAG based task scheduling, is given in Section 2. We propose a fault tolerant precedence level based genetic scheduling algorithm for P2P grid in Section 3. In Section 4, we present and discuss simulation results. Conclusion and future scope of work are discussed in Section 5. The symbols used throughout the paper are listed in the Abbreviations.

2. Literature Review of Decentralized Scheduling Techniques Using Genetic Algorithm to Schedule Tasks

Holland first described the genetic algorithm in 1975. In the last decade, genetic algorithms have been used by various researchers to schedule tasks over a grid. Both independent and interdependent tasks have been scheduled using genetic algorithms. Estimation of distribution algorithms (EDAs) are a newer class of evolutionary algorithms. In EDAs, promising schedules are obtained by means of a probabilistic model. EDAs give better schedules than the evolutionary algorithms mentioned in [16]. In Section 2.1, we describe some notable decentralized scheduling techniques that use a genetic algorithm to schedule independent tasks. Section 2.2 highlights a decentralized scheduling algorithm that employs a genetic algorithm to schedule subtasks of a workflow.

2.1. Decentralized Scheduling Techniques Using Genetic Algorithm to Schedule Independent Tasks

One of the notable papers that applied a genetic algorithm to schedule independent tasks over heterogeneous resources in a fully decentralized fashion is [17], which proposed scheduling applications using genetic algorithms (SAGA). In SAGA, computational nodes can join and leave the system dynamically. The system uses lookup services to work as a decentralized scheduler. SAGA emphasizes splitting the task sets and heterogeneous resources into subparts, and the algorithm is run on each subpart. In SAGA, the user first puts forward a scheduling request. The grid monitoring service (MonALISA) [18] supplies the monitoring data. A near optimal schedule is then obtained from the monitoring data and the scheduling request, and execution services execute this near optimal schedule. A discovery service is provided in SAGA to handle failures and incorporate new computational resources. Schedule and task information of executed jobs are provided as feedback to the user. This algorithm decreases the number of generations that a genetic algorithm requires to yield an efficient schedule.

That decentralized grid scheduling is achievable using P2P techniques was shown in [11]. In [11], after authorization and authentication checks, any grid node can issue a job submission query. Cyclone is recursively accessed to find nodes. is a parameter of the algorithm. Nodes which decrease optimization function give the first schedule. It is infeasible to investigate all possible permutations of nodes out of total . Only in two cases ( is exceptionally small or ), we can investigate all permutations. Thus, a genetic algorithm is used for the selection process to obtain nearly optimal schedules. Like SAGA, this algorithm can schedule only independent tasks.

2.2. Decentralized Scheduling Techniques Using Genetic Algorithm to Schedule Subtasks of DAG Based Task

In a grid environment, a genetic algorithm [19] was used to schedule [20] a DAG based task in [14]. Scheduling the subtasks of a DAG based task over a grid is an NP-hard problem. Genetic algorithms and other stochastic search algorithms are therefore used to obtain a near optimal schedule of the subtasks on grid nodes.

The DAG based workflow that we chose for implementing the algorithm of [14] is shown in Figure 1(a). The DAG based task is divided into 10 subtasks, which are grouped into four precedence levels. Once all subtasks of the previous precedence level have returned results to the origin node, the subtasks at the next precedence level start executing. Subtasks at the same precedence level are executed in parallel on separate grid nodes. The virtual network topology that we used to simulate [14] is shown in Figure 1(b). The DAG based task is generated at the origin node (). We used a genetic algorithm to schedule subtasks of over grid nodes , , , and .
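The grouping of subtasks into precedence levels can be sketched as follows. This is a minimal illustration, not code from [14] or from this paper: the function name `precedence_levels` and the edge list `EDGES` are hypothetical, and the edges are an invented 10-subtask DAG with four levels rather than the actual edges of Figure 1(a). A subtask without predecessors is placed at level 1, and any other subtask is placed one level above its highest predecessor.

```python
from collections import defaultdict, deque

def precedence_levels(num_subtasks, edges):
    """Group subtasks into precedence levels.

    A subtask with no predecessor is at level 1; otherwise its level is
    one more than the highest level among its predecessors.
    """
    successors = defaultdict(list)
    indegree = [0] * num_subtasks
    for parent, child in edges:
        successors[parent].append(child)
        indegree[child] += 1

    level = [1] * num_subtasks
    queue = deque(t for t in range(num_subtasks) if indegree[t] == 0)
    visited = 0
    while queue:                      # Kahn-style topological traversal
        t = queue.popleft()
        visited += 1
        for child in successors[t]:
            level[child] = max(level[child], level[t] + 1)
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    if visited != num_subtasks:
        raise ValueError("the task graph contains a cycle")

    grouped = defaultdict(list)
    for t, lvl in enumerate(level):
        grouped[lvl].append(t)
    return dict(sorted(grouped.items()))

# Hypothetical 10-subtask DAG with four levels (not the edges of Figure 1(a)).
EDGES = ([(0, c) for c in range(1, 6)]
         + [(1, 6), (2, 6), (3, 7), (4, 8), (5, 8)]
         + [(p, 9) for p in (6, 7, 8)])
LEVELS = precedence_levels(10, EDGES)
```

With these assumed edges, subtask 0 forms level 1, subtasks 1 to 5 form level 2, subtasks 6 to 8 form level 3, and subtask 9 forms level 4.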

Figure 2 represents how [14] used genetic algorithm to schedule subtasks of DAG based task . Initial population of schedules is produced in [14] by arbitrarily assigning every subtask to a grid node. Offspring for the next generation are chosen from an initial population using roulette wheel selection method. Authors have applied genetic operators on these shortlisted schedules to obtain the rest of population for given generation. Genetic operators used are crossover and mutation. When stagnation in population arises, then the probability of mutation increases. From population, the best DNA representing schedule for subtasks of is chosen.
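The genetic operators named above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the implementation of [14]: a DNA is assumed to be a list holding one node index per subtask, selection weights are assumed to be the reciprocal of the finish time, crossover is assumed to be one-point, and the probabilities `base` and `boosted` as well as the stagnation `window` are hypothetical values.

```python
import random

def roulette_select(population, finish_times, k, rng):
    """Pick k schedules with probability proportional to 1 / finish time."""
    weights = [1.0 / t for t in finish_times]
    return rng.choices(population, weights=weights, k=k)

def crossover(a, b, rng):
    """One-point crossover of two DNAs (lists of node indices)."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def mutate(dna, num_nodes, rng):
    """Reassign one randomly chosen subtask to a random node."""
    child = list(dna)
    child[rng.randrange(len(child))] = rng.randrange(num_nodes)
    return child

def mutation_probability(best_history, base=0.05, boosted=0.30, window=3):
    """Raise the mutation probability when the best finish time stagnates."""
    recent = best_history[-window:]
    stagnant = len(recent) == window and max(recent) - min(recent) < 1e-9
    return boosted if stagnant else base
```

Passing an explicit `random.Random` instance as `rng` keeps runs reproducible, which is convenient when comparing schedules across repeated runs.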

We know subtasks of are divided into precedence levels. Subtasks at the same precedence level can be executed in parallel on different grid nodes. A prerequisite for subtasks at level to start execution is that subtasks at level have finished and returned results. This sequence is followed in the DNA representation of the schedule. The subtask at the first level is the first subtask at the top of the DNA. If more than one subtask is assigned to a node, then the subtask which comes first in the DNA is executed first. Figure 3 explains the sequence in which subtasks of assigned to the same node are executed. In Figure 3, subtasks at the same precedence level have the same color. Subtasks of level 3 are shown in orange. Level 3 subtasks start executing once the subtasks at level 2 have finished and delivered results. Level 2 subtasks are shown as green boxes and execute in parallel on different nodes. If a grid node leaves the grid, all subtasks of have to be rescheduled again using [14].

In P2P grid, nodes can leave freely. Hence, our algorithm exploits precedence level approach to handle node failures.

3. Fault Tolerant Genetic Algorithm Based Decentralized Scheduling Technique for P2P Grid

In this paper, we propose a decentralized scheduling technique for P2P grid, which uses a precedence level based genetic algorithm (PLBGSA) to schedule subtasks of DAG based task . In a DAG based task, subtasks have intertask dependencies. In addition to scheduling subtasks over various grid nodes, we have to find the associated computation and communication cost [21]. We store in all P2P nodes a list mentioned in [22]. This list is modified whenever scheduling happens. Accordingly, neighbors also modify their list . In [23], the authors put forward the concept of workload (computing field) of a subtask over any P2P grid resource. It is given in (1) as follows: where represents the total number of processing elements present in a P2P grid node. MIPSPr gives the number of million instructions per second that a single processing element can process. is the size in million instructions of a waiting subtask in the task queue of length on the grid node. Communication cost [22] is the time to send a subtask from one node to another, explained by (2) as follows: In the above equation, is window size and is round trip time between nodes and . Size of subtask in Kb is represented by . The scheduling of a subtask also depends on the time consumed to finish subtasks at the previous precedence level and to return results to the origin node. Previous algorithms have used a DNA containing details of all subtasks of DAG based task . The genetic algorithm was applied using an initial population of randomly generated DNAs. The task shown in Figure 1(a) is divided into precedence levels. In our proposed approach, when a precedence level contains only one subtask, we need not apply the genetic algorithm to that subtask. Instead, we calculate the finish time of the subtask on all available P2P grid nodes and schedule it on the node which gives the fastest result. As shown in Figure 4, single subtask is scheduled without using the genetic algorithm. This scheduling value for is stored in list . This value is taken as a prerequisite to schedule subtasks at the next level.
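The placement of a single subtask can be illustrated with a short sketch. The exact forms of (1) and (2) are not reproduced in this text, so both formulas below are assumptions that merely use the quantities the paragraph names: workload is taken as the queued million instructions divided by the aggregate MIPS of the node, and communication cost as the round trip time multiplied by the number of windows needed to send the subtask. The class `Node`, its field names, and the rule that a subtask starts once the queue is drained and the transfer is complete are likewise hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    cores: int                 # number of processing elements
    mips_per_core: float       # MIPS of a single processing element
    window_kb: float           # window size in Kb
    rtt_s: float               # round trip time to the origin node, seconds
    queue_mi: list = field(default_factory=list)  # sizes of waiting subtasks

def workload(node):
    """Assumed form of (1): queued work divided by aggregate node speed."""
    return sum(node.queue_mi) / (node.cores * node.mips_per_core)

def transfer_time(node, size_kb):
    """Assumed form of (2): one round trip per window of data sent."""
    return node.rtt_s * (size_kb / node.window_kb)

def finish_time(node, size_mi, size_kb):
    """Assumed rule: start after queue drain and transfer, then execute."""
    start = max(workload(node), transfer_time(node, size_kb))
    return start + size_mi / (node.cores * node.mips_per_core)

def place_single(nodes, size_mi, size_kb):
    """Choose the node that finishes a lone subtask earliest."""
    return min(nodes, key=lambda n: finish_time(n, size_mi, size_kb))

# Hypothetical nodes and subtask sizes, for illustration only.
busy = Node("a", cores=2, mips_per_core=100.0, window_kb=10.0, rtt_s=0.1,
            queue_mi=[400.0])
idle = Node("b", cores=1, mips_per_core=100.0, window_kb=10.0, rtt_s=0.1)
chosen = place_single([busy, idle], size_mi=100.0, size_kb=20.0)
```

In this example the faster node is already loaded, so the slower but idle node is chosen; no genetic search is needed because only as many candidates exist as there are nodes.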

On the other hand, if a precedence level contains more than one subtask, we use a genetic algorithm to find a good schedule. As shown in Figure 4, five subtasks are present at precedence level 2. A schedule to finish these five subtasks is represented by a DNA. such DNAs are randomly generated for the initial generation. The such generations are generated by applying genetic operators to the shortlisted DNAs of the previous generation. Crossover and mutation are the genetic operators used in this paper. The roulette wheel selection technique is used to shortlist DNAs from all DNAs present at any generation. From all these generations, we select the DNA that finishes the subtasks fastest. We schedule using the best schedule among all generations. Values are stored accordingly in list . Again, we apply the genetic algorithm to the subtasks at level 3 and find a good schedule. Scheduling is performed according to this schedule and list is updated. Level 4, like level 1, contains a single subtask. Hence, is scheduled on the node that gives results fastest. In this way all subtasks of DAG based task are scheduled.

Algorithm for PLBGSA is shown in Figure 5. In this algorithm, first we arrange all subtasks in priority based task sequence . Assign level value 1. Choose all subtasks present at level from . If only one subtask is present at level , then single subtask is assigned to a node having minimum . is workload after subtask is assigned to node . Range of is 0 to , where represents the number of P2P grid nodes available for scheduling. If more than one subtask is present at level , we use genetic algorithm to schedule all subtasks on set . While applying genetic algorithm, first we generate an initial population of DNAs. To obtain single DNA we use and assign all to set represents a set of subtasks present at level from . Then we calculate for DNA. Finish time value of is made equal to . Further, generations are obtained by applying genetic operators on the previous generation. will be calculated by scheduling subtasks at level 2 one by one in sequence in which they are found in . Consider When we schedule subtask, workload of the node on which subtask is scheduled will also vary. This new workload will be as follows: Here, is the most efficient factor and will be the greatest of these three values. First value is the old workload on P2P grid node . Second is transport time to send a task from one node to another. is the third value which gives time when all subtasks at previous level will be finished and had returned results. An origin node where task is generated will use these values to make a scheduling decision. However, entities in list are changed when we have found the best schedule using genetic algorithm. We shortlist schedules from an initial population by applying roulette wheel selection method (). This way second generation’s first schedules will be obtained from predecessors. We choose two schedules , from these schedules and apply , genetic operators. Two new schedules will be obtained by this method. 
In this manner new schedules for second generation are obtained. is a mutation operator which will be applied more often if stagnation in schedules occurs. We shortlist from and schedule according to . Update list according to . Similarly, we calculate for all levels and update list accordingly for all levels. Finally, at level having value represents finish time for task . The schedule obtained using this algorithm is better than the algorithm presented in [14]. Our proposed algorithm is depicted in Table 1.
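The level-by-level procedure can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm of Figure 5: each node is described by a hypothetical tuple `(speed in MIPS, window size in Kb, round trip time in seconds)`, each subtask by `(size in MI, size in Kb)`, the transfer time is assumed to be the round trip time multiplied by the number of windows sent, the time to return results to the origin node is ignored, and all GA parameters are invented defaults. The one element taken directly from the text is that a subtask starts at the greatest of three values: the old workload of the node, the transfer time, and the finish time of the previous level.

```python
import random

def level_finish_time(dna, tasks, nodes, ready, prev_done):
    """Finish time of one level for a DNA (one node index per subtask)."""
    ready = list(ready)                       # copy of per-node workloads
    for (size_mi, size_kb), n in zip(tasks, dna):
        speed, window_kb, rtt = nodes[n]
        transfer = rtt * size_kb / window_kb  # assumed communication cost
        start = max(ready[n], transfer, prev_done)  # greatest of three values
        ready[n] = start + size_mi / speed
    return max(ready[n] for n in dna), ready

def ga_for_level(tasks, nodes, ready, prev_done, rng,
                 pop_size=20, generations=10, keep=10, p_mut=0.1):
    """Genetic search over the schedules of a single precedence level."""
    def cost(dna):
        return level_finish_time(dna, tasks, nodes, ready, prev_done)[0]

    pop = [[rng.randrange(len(nodes)) for _ in tasks] for _ in range(pop_size)]
    best = min(pop, key=cost)
    for _ in range(generations):
        weights = [1.0 / cost(d) for d in pop]          # roulette wheel
        parents = rng.choices(pop, weights=weights, k=keep)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(tasks))          # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < p_mut:                    # mutation
                child[rng.randrange(len(child))] = rng.randrange(len(nodes))
            children.append(child)
        pop = parents + children
        best = min(pop + [best], key=cost)              # best over all generations
    return best

def schedule_dag(levels, nodes, rng):
    """Schedule a DAG one precedence level at a time."""
    ready = [0.0] * len(nodes)
    prev_done = 0.0
    plan = []
    for tasks in levels:
        if len(tasks) == 1:   # lone subtask: try every node, no GA needed
            dna = min(([n] for n in range(len(nodes))),
                      key=lambda d: level_finish_time(
                          d, tasks, nodes, ready, prev_done)[0])
        else:
            dna = ga_for_level(tasks, nodes, ready, prev_done, rng)
        prev_done, ready = level_finish_time(dna, tasks, nodes, ready, prev_done)
        plan.append(dna)
    return plan, prev_done
```

Because each call to `ga_for_level` searches only the assignments of one level, the workloads stored in `ready` after a level act as the fixed starting point for the next level, which mirrors the role of the stored list in the description above.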

In Table 1, cost is the statement that takes steps to execute and step executes times. Hence, we find that, in the worst case, the running time of the above scheduling algorithm is which on simplifying gives The running time of the algorithm is the sum of running times for each statement executed. We can express the above equation in the form of for constants , and that again depends on statement costs ; it is thus a quadratic function of , that is, .

The concept of fault tolerance is also introduced in our algorithm. The fault tolerance [24] mechanism used in this approach is a modified version of the fault tolerance [25, 26] mechanism of our previous work [22]. Two components are present on all P2P grid nodes to handle failure situations: a notification generator and a notification receiver. These components send or receive three types of messages: the heartbeat message, the task completion message, and the task failure message. If no message is received before the periodic time expires, then a task failure message is generated automatically at the node. When a failure message is generated, we reschedule subtasks at level . Further, levels beyond are again rescheduled, accordingly.
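The failure detection and rescheduling rule can be sketched as follows. This is a minimal illustration, assuming a simple timeout check and a schedule stored as one list of node indices per level; the function names, the 0-based level indices, and the message label `"task_failure"` are hypothetical and do not come from [22].

```python
def check_node(last_message_time, now, period):
    """Return 'task_failure' if no message arrived within the period."""
    return "task_failure" if now - last_message_time > period else None

def levels_to_reschedule(plan, failed_node, completed_levels):
    """Levels (0-based) to reschedule after a node failure.

    Rescheduling starts at the first unfinished level that uses the failed
    node and includes every level beyond it; finished levels are kept.
    """
    for lvl in range(completed_levels, len(plan)):
        if failed_node in plan[lvl]:
            return list(range(lvl, len(plan)))
    return []

# Hypothetical schedule: one list of node indices per precedence level.
PLAN = [[1], [0, 1, 2, 3, 0], [2, 3, 1], [0]]
```

For example, if node 2 fails after the first level has completed, levels 1 to 3 are rescheduled while the stored result of level 0 is reused.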

In this manner, we achieve two goals by applying the genetic algorithm separately to each level instead of applying it to all levels at once. Firstly, we obtain a better schedule. Secondly, the fault tolerance approach becomes applicable, which is important because P2P grid nodes are unreliable. Simulation results and discussion in support of PLBGSA are given in the next section.

4. Simulation Results and Discussion

Virtual network topology followed in this paper is shown in Figure 1(b). node is the origin node where DAG based task is generated. Nodes set represents P2P grid nodes and the number of cores in P2P grid nodes is , respectively. Window size for each node is . Round trip time for P2P grid nodes is . Computation capacity in million instructions per second for each P2P grid node is .

DAG based task consists of 10 subtasks and sizes of each subtask in million instructions are , respectively. Size in Kb for each subtask is . Origin node of all subtasks is the same (). Subtask level is also visible in Figure 1(a).

When random scheduling is used to schedule subtasks of , the finish time of is 17.551 seconds, as shown in Figure 6. We then use the genetic algorithm of [14] to schedule DAG based task . Results are obtained for task in 12.019 seconds; the detailed scheduling of all subtasks is shown in Figure 7. When we use the precedence level based genetic algorithm for decentralized scheduling of task on P2P grid nodes, we get results in 8.973 seconds, as shown in Figure 8.

After running both the proposed algorithm and the algorithm of [14] 10 times, the schedules obtained by our technique are always better, as shown in Figure 9. Here, the number of generations was equal. In Figure 10, the algorithm of [14] is run with , and generations. However, even with the increase in generations, the schedules obtained in 10 runs do not become much shorter. Figure 11 compares the schedules obtained in 10 runs of the algorithm of [14] and of our algorithm. Reference [14] is run with generations and our algorithm is run with only generations. Even so, the schedules of the previous algorithm are not as good as those of our proposed algorithm, as Figure 11 demonstrates.

Figure 12 shows that average finish time of our algorithm with only generations is much better than the average finish time of [14], even when the number of generations is increased to . Also, waiting time for subtasks at all levels is decreased with our algorithm as shown in Figure 13.

Node utilization for random scheduling is shown in Figure 14. When we schedule subtasks using the GA proposed in [14], node 2 sits idle throughout all precedence levels, as shown in Figure 15. Nodes are utilized more uniformly in PLBGSA, as shown in Figure 16.

If distinct P2P grid nodes are to be arranged along subtasks of task , where repetition of nodes is allowed, then the total number of ways of doing this scheduling is ; here . To find out the best schedule among all possible schedules is a very exhaustive task. Hence, using a genetic algorithm we generate number of schedules out of possible schedules. Finally, we select the fastest schedule among these schedules. Now the probability of finding the best schedules in randomly selected schedules is as follows: On solving the right-hand side of (7), we get Our approach first schedules tasks at level 2 then at level 3 and henceforth up to th level (second last level) of DAG based task . Subtasks at level 2 are . Hence the total number of ways of doing scheduling is (here ). Now the probability of finding best schedules in randomly selected schedules is as follows: Level 3 contains subtasks; hence, the total number of ways of doing scheduling is (here ). Consider Similarly, at level , Now, since , therefore.

From (8)–(11), Similarly, Hence, the probability of getting a better schedule using our approach is higher. Moreover, because we store the results after scheduling all subtasks at each precedence level, we can incorporate fault tolerance in our approach. If we schedule using [14] and some node fails, then we have to schedule all subtasks present at all levels again, as shown in Figure 17. PLBGSA, however, applies the genetic algorithm to the subtasks of each level separately; hence, we reschedule only the subtasks at the level where the node failure happened and the subtasks beyond that level. This way we obtain results much faster, as shown in Figure 18.
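The size argument behind this comparison can be checked numerically. The sketch below is an illustration under stated assumptions and does not reproduce equations (7) to (11): it assumes four grid nodes, uses the level sizes of five and three subtasks described earlier for levels 2 and 3, and models a search as independent uniform draws from the schedule space, which is a simplification of what a genetic algorithm does. The function names are hypothetical.

```python
def search_space(num_nodes, num_subtasks):
    """Number of ways to assign subtasks to nodes when nodes may repeat."""
    return num_nodes ** num_subtasks

def hit_probability(space, samples):
    """Chance that at least one of `samples` independent uniform draws
    equals the single best schedule in a space of the given size."""
    return 1.0 - (1.0 - 1.0 / space) ** samples

NODES = 4                                    # assumed number of grid nodes
whole = search_space(NODES, 10)              # all 10 subtasks at once
per_level = [search_space(NODES, m) for m in (5, 3)]   # levels 2 and 3
```

Under these assumptions the whole-DAG space holds 1,048,576 schedules, whereas levels 2 and 3 hold 1,024 and 64 schedules, so the same number of sampled schedules covers a far larger fraction of each per-level space.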

5. Conclusion and Future Scope of Work

We have applied a genetic algorithm at every precedence level to schedule subtasks on P2P grid nodes. PLBGSA is better and more efficient than the algorithm proposed by Pop et al. [14]. The probability of finding a good schedule is higher than in the previous works, and P2P grid resources are utilized more uniformly with PLBGSA. Further, a fault detection and recovery mechanism is proposed in PLBGSA, and this mechanism yields good results. We obtain a near optimal schedule with a reduced number of generations in PLBGSA. As future work, other optimization heuristics can be applied using precedence level based scheduling for P2P grid. We also plan to incorporate a task duplication technique before applying genetic scheduling at each precedence level.

Abbreviations

PLBGSA:Fault tolerant precedence level based genetic scheduling algorithm for P2P grid
:Single subtask present on level
:Workload of task on node
:Number of P2P grid nodes available
:Random number generator
:Time required to finish all subtasks of level
:Number of subtasks present at level
:Set of subtasks present at level
:Roulette wheel selection method
:Range of schedules
:Individual schedule to finish set of subtasks at level
:th schedule to finish set of subtasks at level
and :Pair of schedules shortlisted for genetic operations from a defined range in previous schedules
:Schedule having the smallest from all schedules for level
:Crossover genetic operator
:Mutation genetic operator
:Highest level present in DAG based task
:Total number of generations
:Generation number
:Size of generation
:Number of subtasks at level
:Node failure flag
:Node to which subtask is assigned in any schedule
:Load on node of subtask at level
:Workload after task at level is added to any node
:Old workload on node
:Time required in sending subtask at level from node to node
:Factor having maximum magnitude among 3 factors
:Time at which results will be returned for all subtasks at previous level
:Number of DNAs selected from previous generation.