Innovative Applications of O.R.
Heuristics for the stochastic dynamic task-resource allocation problem with retry opportunities

https://doi.org/10.1016/j.ejor.2017.09.006

Highlights

  • A stochastic multi-period task-resource allocation problem is studied.

  • A constructive heuristic is introduced.

  • Forward dynamic programming and evolutionary algorithms are proposed.

  • Computational performances of algorithms are illustrated.

Abstract

This paper deals with a stochastic multi-period task-resource allocation problem. A team of agents with a set of resources is to be deployed on a multi-period mission with the goal of successfully completing as many tasks as possible. The success probability of an agent assigned to a task depends on the resources available to the agent. Unsuccessful tasks can be tried again in later periods. While the problem can in principle be solved by dynamic programming, in practice this is computationally prohibitive except for tiny problem sizes. To tackle larger problems as well, we propose a construction heuristic that assigns agents and resources to tasks sequentially, based on the estimated marginal utility. Based on this heuristic, we furthermore propose various Approximate Dynamic Programming approaches and an Evolutionary Algorithm. All suggested approaches are empirically compared on a number of randomly generated problem instances. We show that the construction heuristic is very fast and provides good results. For even better results, at the expense of longer computational time, Approximate Dynamic Programming seems a suitable alternative.

Introduction

Resource management is one of the fundamental problems in operations research and involves dynamically assigning tasks and allocating discrete resources over time. The task assignment problem is about deciding which agent should perform which task and at what time, whereas the resource allocation problem determines the level and type of resources to be used for attempting each task. The resources are consumed for the accomplishment of the tasks and include for instance materials, energy, ammunition, and man hours.

The task assignment problem (e.g. Alighanbari & How, 2008) and resource allocation problem (e.g. Angalakudati, Balwani, Calzada, Chatterjee, Perakis, Raad, Uichanco, 2014, Ernst, Jiang, Krishnamoorthy, 2006, Nwozo, Nkeki, 2012, Tharumarajah, 2001, Zhen, 2015) have been extensively studied independently in the literature and have various applications in production planning, freight transportation, job scheduling and financial planning. In particular, dynamic multi-resource allocation models have been developed for infrastructure management (Pekka & Ahti, 2009), project scheduling (Wiesemann, Kuhn & Rustem, 2012), service capacity management (Liu & Truong, 2013) and healthcare (Chao, Liu & Zheng, 2003).

There are also real-life applications in team management and multi-agent planning in military missions such as reconnaissance and surveillance of unmanned vehicles; for instance, see (Gulpinar et al., 2010, Chandler, Pachter, Rasmussen, Schumacher, 2002, Hong, Gordon, 2015, Koenig, Tovey, Zheng, Sungur, 2007, Nygard, Chandler, Pachter, 2001, Samuel, Guikema, 2012, Tovey, Lagoudakis, Jain, Koenig, 2005, Zheng, Koenig, 2009). For these cases, task assignment and resource allocation decisions cannot be made in isolation for successful mission planning and effective team deployment (de Weerdt & Clement, 2009). In a multi-agent environment, the coordination of tasks and resources results in real-time problems that cannot be solved in polynomial time (Bellingham, Tillerson, Richards & How, 2003). As Haslum and Geffner (2014) pointed out, the optimal solution is not tractable for the scheduling of tasks using both renewable and consumable resources. In general, the dynamic and discrete nature of task assignment and resource allocation decisions increases the problem complexity; thus, the underlying optimization problems are NP-hard; for instance, see Farias and Roy (2006). Moreover, for real-life problems, uncertainty due to noisy data and unexpected events needs to be taken into account during the modelling stage.

In some problems, the accomplishment of tasks cannot be assumed for sure since agents may be unsuccessful. Ahuja, Kumar, Jha and Orlin (2007) assumed that multiple weapons can be assigned to a single target and considered a static environment as all decisions are made at the beginning. They proposed exact branch-and-bound algorithms for the network flow counterpart of the problem and also used a neighborhood search based heuristic. Alighanbari and How (2008) extended the weapon task assignment problem where the targets can be shot down by the agents. They formulated the problem using a Markov decision process and introduced one-step lookahead heuristics to compare with the performance of mixed integer linear programming. Chen, Xin, Peng, Dou and Zhang (2009) introduced an asset-based dynamic weapon-target assignment optimization model subject to capability, strategy, resource and engagement feasibility constraints. In order to solve this problem, evolutionary algorithms were developed. Wacholder (1989) applied neural networks to solve a weapon-task assignment problem efficiently within a static setting. Deng, Yu and Wang (2013) also applied genetic algorithms to solve an integer programming formulation of a static task assignment problem where heterogeneous unmanned aerial vehicles can cooperate to accomplish multiple tasks to minimize total mission time. Recently, Davis, Robbins and Lunday (2017) considered an asset-based defensive variant of the dynamic weapon-task assignment problem. They applied an approximate dynamic programming approach to find an optimal fire control policy for a defensive missile system.

Stochastic programming approaches have been developed to take into account uncertain parameters. For instance, Murphey (2000) developed a two-stage nonlinear integer stochastic programming formulation of the dynamic weapon assignment problem where the numbers and locations of targets are unknown a priori. Although the model allows arrivals of new tasks over time, it does not take into account observations of past allocation outcomes. A cutting plane optimization method was proposed to solve the underlying integer program. Castanon and Wohletz (2002) also studied a dynamic task-resource allocation problem where unsuccessful tasks are assigned further resources over two stages. The outcome of the first stage resource allocation is observed before making the second stage allocations to multiple tasks. They formulated the problem as a two-stage stochastic control problem and introduced an approximation for admissible control space to be used in a model-predictive control algorithm. Ahner and Parson (2015) considered a dynamic programming formulation of the two-stage stochastic programming problem. They assumed that the number of target arrivals in the first stage is known, but in the second stage, it is assumed to be stochastic and following a known distribution. They solved the approximate weapon-task assignment problem using an adaptive dynamic programming method.

It is well known that dynamic problems with large state spaces and action sets suffer from the curse of dimensionality. Therefore, different tractable approaches have been proposed and efficient heuristics and approximation algorithms have been developed to solve the task assignment and resource allocation problem (e.g. Calinescu, Chakrabarti, Karloff and Rabani, 2002 and Bertsimas, Gupta & Lulli, 2014). Among those approaches, the simulation optimization and approximate dynamic programming methods are worth mentioning. In particular, approximate dynamic programming based algorithms have been introduced for the dynamic task assignment problem with various applications in transportation (Powell, 1996, Spivey, Powell, 2004, Spivey, Powell, 2003) and dynamic resource allocation (Farias, Roy, 2006, Powell, Shapiro, Simão, 2002, Powell, Topaloglu, 2006). Powell et al. (2002) applied an adaptive dynamic programming algorithm to the resource allocation problem. A constructive rule based heuristic (Xin, Chen, Peng, Dou & Zhang, 2011) and ant colony optimization (Pendharkar, 2015) are other methods applied to dynamic task assignment and resource allocation problems.

In this paper, we consider a stochastic multi-period task-resource allocation problem where a team of agents and a pool of different kinds of resources need to be deployed on a given set of tasks. The underlying task-resource allocation problem is formulated using a Markov decision process. At the beginning of each period, based on the current system state, the agents are allocated some of the resources and assigned to specific tasks. An agent’s success probability for a task depends on the resources employed. The overall goal is to maximize the utility of the successfully completed tasks by the end of the planning horizon. Our contribution in this paper is threefold:

  • First, we model the joint task-resource allocation problem by taking into account opportunities to retry the same task in the future. If an agent has not been successful in completing the task, this task may be re-tried at a later period. In this way, the team of agents may increase the overall performance and utilize available resources efficiently over time. To the best of our knowledge, reallocation of tasks and resources during the planning horizon has not been considered in previous resource management applications developed in the literature.

  • Secondly, we introduce a new constructive heuristic that assigns agents and resources to tasks sequentially using the estimated marginal utility in view of retry option. Based on this heuristic, we furthermore propose forward (approximate) dynamic programming algorithms and an evolutionary algorithm. We introduce various value function approximations based on single and multiple features.

  • Thirdly, we perform computational experiments to evaluate the performance of all approaches. All suggested approaches are empirically compared on a number of randomly generated problem instances. The numerical results show that the construction heuristic is very fast and provides good results. For even better results, at the expense of longer computational time, approximate dynamic programming seems a suitable alternative.
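The idea of assigning agents and resources sequentially by estimated marginal utility can be illustrated with a minimal Python sketch for a single period without the retry option. This is not the authors' heuristic: the function names, the single resource type, the success model p(k) = 1 - (1 - q)^k and the example numbers are all assumptions made purely for illustration.

```python
from typing import Dict, List


def success_prob(q: float, k: int) -> float:
    """Hypothetical success model: each of k resource units succeeds
    independently with probability q (an assumption, not from the paper)."""
    return 1.0 - (1.0 - q) ** k


def greedy_allocate(utilities: List[float], q: List[float],
                    n_agents: int, n_resources: int) -> Dict[int, int]:
    """Hand out resource units one at a time, each to the task with the
    largest marginal gain in expected utility.

    A task needs one (homogeneous) agent the first time it receives a unit,
    so at most n_agents distinct tasks are attempted.
    Returns a mapping {task index: units allocated}.
    """
    alloc: Dict[int, int] = {}
    for _ in range(n_resources):
        best_task, best_gain = None, 0.0
        for j, (u, qj) in enumerate(zip(utilities, q)):
            k = alloc.get(j, 0)
            if k == 0 and len(alloc) >= n_agents:
                continue  # no free agent left to start a new task
            gain = u * (success_prob(qj, k + 1) - success_prob(qj, k))
            if gain > best_gain:
                best_task, best_gain = j, gain
        if best_task is None:
            break  # no unit can improve the expected utility
        alloc[best_task] = alloc.get(best_task, 0) + 1
    return alloc


def expected_utility(utilities: List[float], q: List[float],
                     alloc: Dict[int, int]) -> float:
    """Expected utility of a single-period allocation."""
    return sum(utilities[j] * success_prob(q[j], k) for j, k in alloc.items())


if __name__ == "__main__":
    u, q = [10.0, 6.0, 4.0], [0.5, 0.5, 0.5]
    alloc = greedy_allocate(u, q, n_agents=2, n_resources=3)
    print(alloc, expected_utility(u, q, alloc))
```

With these assumed numbers, the first unit goes to task 0 (gain 5), the second to task 1 (gain 3), and the third back to task 0 (gain 2.5), because both agents are already in use and task 2 cannot be started.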

The rest of the paper is organized as follows. Section 2 presents the formulation of the underlying task-resource allocation problem. In Section 3, we introduce the heuristic approach and explain basic steps using an illustrative example. Section 4 briefly describes the simulation based approaches in view of different value function approximations. The computational experiments and results are reported in Section 5. Section 6 concludes the paper and points out some ideas for future work.

Section snippets

Problem statement

In this section, we first describe the dynamic stochastic task-resource allocation problem and then provide a dynamic programming formulation via a Markov Decision Process (MDP). The MDP model assumes that the decision-making process has the Markovian property and applies a dynamic programming principle to solve the underlying planning problem.
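To make the dynamic programming view concrete, the following Python sketch solves a toy version of the problem exactly by recursion over states of the form (period, remaining resource units, set of still-open tasks). It is only a sketch under strong assumptions that are not taken from the paper: a single resource type, enough agents for every open task, and the hypothetical success model p(k) = 1 - (1 - q)^k. The function name `solve_exact` is likewise illustrative.

```python
from functools import lru_cache
from itertools import product
from typing import Sequence, Tuple


def solve_exact(utilities: Sequence[float], q: Sequence[float],
                horizon: int, resources: int) -> float:
    """Maximum expected utility of completed tasks for a toy instance."""

    def p(j: int, k: int) -> float:
        # Hypothetical success probability of task j when given k units.
        return 1.0 - (1.0 - q[j]) ** k

    @lru_cache(maxsize=None)
    def value(t: int, r: int, open_tasks: Tuple[int, ...]) -> float:
        if t == horizon or r == 0 or not open_tasks:
            return 0.0
        best = 0.0
        # Enumerate every split of at most r units over the open tasks.
        for alloc in product(range(r + 1), repeat=len(open_tasks)):
            used = sum(alloc)
            if used > r:
                continue
            total = 0.0
            # Enumerate success/failure outcomes of all open tasks.
            for outcome in product((True, False), repeat=len(open_tasks)):
                prob, reward, still_open = 1.0, 0.0, []
                for j, k, ok in zip(open_tasks, alloc, outcome):
                    pj = p(j, k)
                    prob *= pj if ok else 1.0 - pj
                    if ok:
                        reward += utilities[j]
                    else:
                        still_open.append(j)  # failed tasks can be retried
                if prob > 0.0:
                    total += prob * (reward + value(t + 1, r - used,
                                                    tuple(still_open)))
            best = max(best, total)
        return best

    return value(0, resources, tuple(range(len(utilities))))


if __name__ == "__main__":
    print(solve_exact([10.0, 1.0], [0.5, 0.5], horizon=1, resources=2))
    print(solve_exact([10.0, 1.0], [0.5, 0.5], horizon=2, resources=2))
```

For this assumed instance the single-period value is 7.5 (both units on the high-utility task), while two periods give 7.75: one unit is tried first, and the second is redirected to the other task if the first attempt succeeds. The nested enumeration of allocations and outcomes also suggests why exact solution becomes impractical beyond very small instances.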

The dynamic stochastic task-resource allocation problem consists of a team of agents to be deployed to accomplish a given set of tasks using the

A heuristic for task-resource allocation with retry option

In this section, we will develop a construction heuristic for the dynamic stochastic task-resource allocation problem. We will start with a simple heuristic that ignores the possibility that agents may fail and that a task may be attempted again. We then extend the heuristic to take into account the retry option. Note that given all agents are homogeneous, the heuristic does not specify which agent is assigned a particular task, but just ensures that the number of agents used in each period is

Simulation-based dynamic programming approach

In order to solve the dynamic task-resource allocation problem, we also consider simulation based stochastic dynamic programming methods (namely forward dynamic programming (FDP) and approximate dynamic programming (ADP)) as alternative approaches to the heuristic introduced in Section 3. In this section, we provide a brief introduction to FDP and ADP with various value function approximation methods; the interested reader is referred to Powell (2011) and Topaloglu and Powell (2005) for further

Computational experiments

In this section, we first describe the design and data structure used for numerical experiments and then present the computational results of different approaches studied for the dynamic task-resource allocation problem.

Concluding remarks

In this paper, we investigated the dynamic task-resource allocation problem with retry opportunities for failed attempts. This problem has various real world applications, including in military where one has to allocate troops and weapons to combat missions over time, or repair service, where one has to allocate engineers and equipment to repair tasks. We proposed exact and approximate approaches for solving this problem, including novel heuristic algorithms that attempt to take into account

References (42)

  • D.A. Castanon et al.

    Model predictive control for dynamic unreliable resource allocation

    Proceedings of the 41st IEEE conference on decision and control

    (2002)
  • P. Chandler et al.

    Multiple task assignment for a UAV team

    Proceedings of the AIAA guidance, navigation, and control conference

    (2002)
  • X. Chao et al.

    Resource allocation in multisite service systems with intersite customer flows

    Management Science

    (2003)
  • J. Chen et al.

    Evolutionary decision-making for the dynamic weapon-target assignment problem

    Science in China Series F: Information Sciences

    (2009)
  • A.E. Eiben et al.

    Introduction to Evolutionary Computing

    (2007)
  • A. Ernst et al.

    Exact solutions to task allocation problems

    Management Science

    (2006)
  • N. Gulpinar et al.

    Robust team decision making under uncertainty

    International Journal of Applied Decision Sciences

    (2010)
  • P. Haslum et al.

    Heuristic planning with time and resources

    Proceedings of the European conference on planning

    (2014)
  • S.A. Hong et al.

    Decomposition-based optimal market-based planning for multi-agent systems with shared resources

    Proceedings of the workshop and conference proceedings

    (2015)
  • S. Koenig et al.

    Sequential bundle-bid single-sale auction algorithms for decentralized control

    Proceedings of the international joint conference on artificial intelligence

    (2007)
  • N. Liu et al.

    Multi-resource allocation scheduling in dynamic environments

    Manufacturing and Service Operations Management

    (2013)
Cited by (24)

    • A policy gradient approach to solving dynamic assignment problem for on-site service delivery

      2023, Transportation Research Part E: Logistics and Transportation Review
    • Stochastic optimization for vaccine and testing kit allocation for the COVID-19 pandemic

      2023, European Journal of Operational Research
      Citation Excerpt :

      There are various other stochastic optimization approaches for resource allocation problems throughout the literature. Gülpınar, Çanakoğlu, & Branke (2018) proposes an approximate dynamic programming algorithm for assigning a limited number of resources to as many tasks as possible. Creemers (2019) solves a preemptive stochastic resource constrained scheduling problem by restructuring the state space to efficiently solve a stochastic dynamic program via lookup tables.

    • Two-stage hybrid heuristic search algorithm for novel weapon target assignment problems

      2021, Computers and Industrial Engineering
      Citation Excerpt :

      To reduce the complexity of the problem, many variants of the WTA problem have been studied. Gulpinar (2018) assumed that all weapons are identical and designed an MMR algorithm to obtain the optimal solution. Bogdanowicz (2009) considered the scenario that the number of weapons equals the number of targets and each target could be assigned only one weapon.

    • An approximate dynamic programming approach for comparing firing policies in a networked air defense environment

      2020, Computers and Operations Research
      Citation Excerpt :

      Their problem instance of interest is small enough to find an exact solution to the MDP model, and the authors also investigate the quality of ADP approaches. Gulpinar et al. (2018) formulate a stochastic dynamic task-resource allocation problem with retry opportunities, which generalizes many variants of the dynamic WTAP. The authors develop and test a constructive heuristic that sequentially assigns resources (e.g., interceptors) to tasks (e.g., incoming missiles).

    • The Weapon-Target Assignment Problem

      2019, Computers and Operations Research
      Citation Excerpt :

      Often, WTA works are cited for their modeling or solution techniques, as they are applicable in many assignment problems with quantifiable rewards or costs and limited resources. Gülpınar et al. (2018) framed their model and solution technique for a dynamic resource allocation problem on much of the same literature that is outlined in Sections 2–4 of this survey. Çetin and Esen (2006) model and solve a media allocation problem with an objective function which, if Vj is the audience type value, pij is the probability that audience j views advertisement i, and decision variable xij is the number of advertisements of type i to assign to audience j, is the formulation S1.
