Heuristics for the stochastic dynamic task-resource allocation problem with retry opportunities

doi:10.1016/j.ejor.2017.09.006

European Journal of Operational Research

Volume 266, Issue 1, 1 April 2018, Pages 291-303

https://doi.org/10.1016/j.ejor.2017.09.006

Highlights

• A stochastic multi-period task-resource allocation problem is studied.
• A constructive heuristic is introduced.
• Forward dynamic programming and evolutionary algorithms are proposed.
• Computational performances of algorithms are illustrated.

Abstract

This paper deals with a stochastic multi-period task-resource allocation problem. A team of agents with a set of resources is to be deployed on a multi-period mission with the goal of successfully completing as many tasks as possible. The success probability of an agent assigned to a task depends on the resources available to the agent. Unsuccessful tasks can be retried in later periods. While the problem can in principle be solved by dynamic programming, in practice this is computationally prohibitive except for tiny problem sizes. To tackle larger problems as well, we propose a construction heuristic that assigns agents and resources to tasks sequentially, based on the estimated marginal utility. Building on this heuristic, we furthermore propose various Approximate Dynamic Programming approaches and an Evolutionary Algorithm. All suggested approaches are empirically compared on a number of randomly generated problem instances. We show that the construction heuristic is very fast and provides good results. For even better results, at the expense of longer computational time, Approximate Dynamic Programming seems a suitable alternative.

Introduction

Resource management is one of the fundamental problems in operations research and involves dynamically assigning tasks and allocating discrete resources over time. The task assignment problem concerns deciding which agent should perform which task and at what time, whereas the resource allocation problem determines the level and type of resources to be used for attempting each task. The resources are consumed in accomplishing the tasks and include, for instance, materials, energy, ammunition, and man-hours.

The task assignment problem (e.g. Alighanbari & How, 2008) and resource allocation problem (e.g. Angalakudati, Balwani, Calzada, Chatterjee, Perakis, Raad & Uichanco, 2014; Ernst, Jiang & Krishnamoorthy, 2006; Nwozo & Nkeki, 2012; Tharumarajah, 2001; Zhen, 2015) have been extensively studied independently in the literature and have various applications in production planning, freight transportation, job scheduling and financial planning. In particular, dynamic multi-resource allocation models have been developed for infrastructure management (Pekka & Ahti, 2009), project scheduling (Wiesemann, Kuhn & Rustem, 2012), service capacity management (Liu & Truong, 2013) and healthcare (Chao, Liu & Zheng, 2003).

There are also real-life applications in team management and multi-agent planning in military missions such as reconnaissance and surveillance by unmanned vehicles; for instance, see (Gulpinar et al., 2010; Chandler, Pachter, Rasmussen & Schumacher, 2002; Hong & Gordon, 2015; Koenig, Tovey, Zheng & Sungur, 2007; Nygard, Chandler & Pachter, 2001; Samuel & Guikema, 2012; Tovey, Lagoudakis, Jain & Koenig, 2005; Zheng & Koenig, 2009). For these cases, task assignment and resource allocation decisions cannot be made in isolation if mission planning is to be successful and team deployment effective (de Weerdt & Clement, 2009). In a multi-agent environment, the coordination of tasks and resources results in real-time problems that cannot be solved in polynomial time (Bellingham, Tillerson, Richards & How, 2003). As Haslum and Geffner (2014) pointed out, the optimal solution is not tractable for the scheduling of tasks using both renewable and consumable resources. In general, the dynamic and discrete nature of task assignment and resource allocation decisions increases the problem complexity; thus, the underlying optimization problems are NP-hard; for instance, see Farias and Roy (2006). Moreover, for real-life problems, uncertainty due to noisy data and unexpected events needs to be taken into account during the modelling stage.

In some problems, the accomplishment of tasks cannot be assumed for sure since agents may be unsuccessful. Ahuja, Kumar, Jha and Orlin (2007) assumed that multiple weapons can be assigned to a single target and considered a static environment as all decisions are made at the beginning. They proposed exact branch-and-bound algorithms for the network flow counterpart of the problem and also used a neighborhood search based heuristic. Alighanbari and How (2008) extended the weapon task assignment problem where the targets can be shot down by the agents. They formulated the problem using a Markov decision process and introduced one-step lookahead heuristics to compare with the performance of mixed integer linear programming. Chen, Xin, Peng, Dou and Zhang (2009) introduced an asset-based dynamic weapon-target assignment optimization model subject to capability, strategy, resource and engagement feasibility constraints. In order to solve this problem, evolutionary algorithms were developed. Wacholder (1989) applied neural networks to solve a weapon-task assignment problem efficiently within a static setting. Deng, Yu and Wang (2013) also applied genetic algorithms to solve an integer programming formulation of a static task assignment problem where heterogeneous unmanned aerial vehicles can cooperate to accomplish multiple tasks to minimize total mission time. Recently, Davis, Robbins and Lunday (2017) considered an asset-based defensive variant of the dynamic weapon-task assignment problem. They applied an approximate dynamic programming approach to find an optimal fire control policy for a defensive missile system.

Stochastic programming approaches have been developed to take into account uncertain parameters. For instance, Murphey (2000) developed a two-stage nonlinear integer stochastic programming formulation of the dynamic weapon assignment problem where the numbers and locations of targets are unknown a priori. Although the model allows arrivals of new tasks over time, it does not take into account observations of past allocation outcomes. A cutting plane optimization method was proposed to solve the underlying integer program. Castanon and Wohletz (2002) also studied a dynamic task-resource allocation problem where unsuccessful tasks are assigned further resources over two stages. The outcome of the first stage resource allocation is observed before making the second stage allocations to multiple tasks. They formulated the problem as a two-stage stochastic control problem and introduced an approximation for admissible control space to be used in a model-predictive control algorithm. Ahner and Parson (2015) considered a dynamic programming formulation of the two-stage stochastic programming problem. They assumed that the number of target arrivals in the first stage is known, but in the second stage, it is assumed to be stochastic and following a known distribution. They solved the approximate weapon-task assignment problem using an adaptive dynamic programming method.

It is well known that dynamic problems with large state spaces and action sets suffer from the curse of dimensionality. Therefore, different tractable approaches have been proposed and efficient heuristics and approximation algorithms have been developed to solve the task assignment and resource allocation problem (e.g. Calinescu, Chakrabarti, Karloff & Rabani, 2002; Bertsimas, Gupta & Lulli, 2014). Among those approaches, simulation optimization and approximate dynamic programming methods are worth mentioning. In particular, approximate dynamic programming based algorithms have been introduced for the dynamic task assignment problem with various applications in transportation (Powell, 1996; Spivey & Powell, 2003; Spivey & Powell, 2004) and dynamic resource allocation (Farias & Roy, 2006; Powell, Shapiro & Simão, 2002; Powell & Topaloglu, 2006). Powell et al. (2002) applied an adaptive dynamic programming algorithm to the resource allocation problem. A constructive rule-based heuristic (Xin, Chen, Peng, Dou & Zhang, 2011) and ant colony optimization (Pendharkar, 2015) are other approaches applied to dynamic task assignment and resource allocation problems.

In this paper, we consider a stochastic multi-period task-resource allocation problem where a team of agents and a pool of different kinds of resources need to be deployed on a given set of tasks. The underlying task-resource allocation problem is formulated using a Markov decision process. At the beginning of each period, based on the current system state, the agents are allocated some of the resources and assigned to specific tasks. An agent’s success probability for a task depends on the resources employed. The overall goal is to maximize the utility of the successfully completed tasks by the end of the planning horizon. Our contribution in this paper is threefold:

• First, we model the joint task-resource allocation problem by taking into account opportunities to retry the same task in the future. If an agent has not been successful in completing the task, this task may be retried in a later period. In this way, the team of agents may increase the overall performance and utilize available resources efficiently over time. To the best of our knowledge, reallocation of tasks and resources during the planning horizon has not been considered in previous resource management applications developed in the literature.
• Secondly, we introduce a new constructive heuristic that assigns agents and resources to tasks sequentially using the estimated marginal utility in view of the retry option. Based on this heuristic, we furthermore propose forward (approximate) dynamic programming algorithms and an evolutionary algorithm. We introduce various value function approximations based on single and multiple features.
• Thirdly, we perform computational experiments to evaluate the performance of all approaches. All suggested approaches are empirically compared on a number of randomly generated problem instances. The numerical results show that the construction heuristic is very fast and provides good results. For even better results, at the expense of longer computational time, approximate dynamic programming seems a suitable alternative.
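
To make the idea of sequential assignment by marginal utility concrete, the following is a minimal single-period sketch. It is not the heuristic of the paper: it ignores the retry option, treats resources as one homogeneous type, and assumes that each attempted task occupies exactly one agent. The function names and the success probability p(r) = r / (r + 1) are hypothetical choices made purely for illustration.

```python
from typing import Callable, Dict, List


def greedy_allocation(
    utilities: List[float],
    n_agents: int,
    n_resources: int,
    success_prob: Callable[[int], float],
) -> Dict[int, int]:
    """Assign resource units one at a time to the task with the largest
    marginal expected utility. Returns {task index: resource units}.

    Assumption (illustrative only): every task that receives at least one
    unit occupies exactly one agent for the period.
    """
    alloc: Dict[int, int] = {}
    agents_left, resources_left = n_agents, n_resources
    while resources_left > 0:
        best_task, best_gain = None, 0.0
        for task, utility in enumerate(utilities):
            units = alloc.get(task, 0)
            if units == 0 and agents_left == 0:
                continue  # starting a new task would need a free agent
            # Expected gain from giving this task one more resource unit.
            gain = utility * (success_prob(units + 1) - success_prob(units))
            if gain > best_gain:
                best_task, best_gain = task, gain
        if best_task is None:
            break  # no assignment improves the expected utility
        if best_task not in alloc:
            agents_left -= 1
        alloc[best_task] = alloc.get(best_task, 0) + 1
        resources_left -= 1
    return alloc


def expected_utility(
    utilities: List[float],
    alloc: Dict[int, int],
    success_prob: Callable[[int], float],
) -> float:
    """Expected utility of completed tasks under a single-period allocation."""
    return sum(utilities[t] * success_prob(r) for t, r in alloc.items())


def diminishing_returns(r: int) -> float:
    """Hypothetical success probability with diminishing returns in r."""
    return r / (r + 1)


if __name__ == "__main__":
    plan = greedy_allocation([10.0, 6.0, 3.0], 2, 4, diminishing_returns)
    print(plan)  # {0: 2, 1: 2}
```

With two agents and four resource units, the sketch first opens the two most valuable tasks and then spends the remaining units where the marginal gain is highest; the third task is never attempted because no agent is free.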

The rest of the paper is organized as follows. Section 2 presents the formulation of the underlying task-resource allocation problem. In Section 3, we introduce the heuristic approach and explain basic steps using an illustrative example. Section 4 briefly describes the simulation based approaches in view of different value function approximations. The computational experiments and results are reported in Section 5. Section 6 concludes the paper and points out some ideas for future work.

Section snippets

Problem statement

In this section, we first describe the dynamic stochastic task-resource allocation problem and then provide a dynamic programming formulation via a Markov Decision Process (MDP). The MDP model assumes that the decision-making process has the Markovian property and applies a dynamic programming principle to solve the underlying planning problem.
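
As a rough illustration of how the dynamic programming principle handles retries, the sketch below solves a deliberately tiny special case by backward recursion: one task, one resource type, and a state consisting only of the period and the remaining resource units. This is an assumption-laden toy, not the formulation of the paper, whose state also has to track the set of unfinished tasks and all resource types. The recursion used is V_t(R) = max over r of p(r) * u + (1 - p(r)) * V_{t+1}(R - r), where the function name and the success probability p(r) = r / (r + 1) are hypothetical.

```python
from functools import lru_cache
from typing import Callable, Tuple


def solve_single_task(
    utility: float,
    horizon: int,
    resources: int,
    success_prob: Callable[[int], float],
) -> Tuple[float, int]:
    """Return (optimal expected utility, resource units to use in period 0)
    for one task that may be retried in later periods after a failure."""

    @lru_cache(maxsize=None)
    def value(t: int, r_left: int) -> Tuple[float, int]:
        # No periods or no resources left: nothing more can be gained.
        if t == horizon or r_left == 0:
            return 0.0, 0
        best_val, best_r = 0.0, 0
        for r in range(r_left + 1):
            p = success_prob(r)
            future, _ = value(t + 1, r_left - r)
            # Succeed now with probability p, otherwise retry later.
            v = p * utility + (1.0 - p) * future
            if v > best_val + 1e-12:  # ties keep the smaller r
                best_val, best_r = v, r
        return best_val, best_r

    return value(0, resources)


if __name__ == "__main__":
    # Hypothetical success probability with diminishing returns.
    val, first_units = solve_single_task(1.0, 2, 2, lambda r: r / (r + 1))
    print(val, first_units)  # 0.75 1
```

In this toy instance, using one unit per period (success probability 0.75 overall) beats spending both units at once (about 0.667), which is the kind of trade-off the retry option introduces. The number of states grows rapidly once several tasks and resource types are tracked, which is consistent with the abstract's remark that exact dynamic programming is only practical for tiny problem sizes.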

The dynamic stochastic task-resource allocation problem consists of a team of agents to be deployed to accomplish a given set of tasks using the

A heuristic for task-resource allocation with retry option

In this section, we will develop a construction heuristic for the dynamic stochastic task-resource allocation problem. We will start with a simple heuristic that ignores the possibility that agents may fail and that a task may be attempted again. We then extend the heuristic to take into account the retry option. Note that given all agents are homogeneous, the heuristic does not specify which agent is assigned a particular task, but just ensures that the number of agents used in each period is

Simulation-based dynamic programming approach

In order to solve the dynamic task-resource allocation problem, we also consider simulation based stochastic dynamic programming methods (namely forward dynamic programming (FDP) and approximate dynamic programming (ADP)) as alternative approaches to the heuristic introduced in Section 3. In this section, we provide a brief introduction to FDP and ADP with various value function approximation methods; the interested reader is referred to Powell (2011) and Topaloglu and Powell (2005) for further

Computational experiments

In this section, we first describe the design and data structure used for numerical experiments and then present the computational results of different approaches studied for the dynamic task-resource allocation problem.

Concluding remarks

In this paper, we investigated the dynamic task-resource allocation problem with retry opportunities for failed attempts. This problem has various real-world applications, including military operations, where one has to allocate troops and weapons to combat missions over time, and repair services, where one has to allocate engineers and equipment to repair tasks. We proposed exact and approximate approaches for solving this problem, including novel heuristic algorithms that attempt to take into account

References (42)

D. Bertsimas et al. Dynamic resource allocation: a flexible and tractable modeling framework. European Journal of Operational Research (2014)
M.T. Davis et al. Approximate dynamic programming for missile defense interceptor fire control. European Journal of Operational Research (2017)
Q. Deng et al. Cooperative task assignment of multiple heterogeneous unmanned aerial vehicles using a modified genetic algorithm with multi-type genes. Chinese Journal of Aeronautics (2013)
V. Farias et al. Approximation algorithms for dynamic resource allocation. Operations Research Letters (2006)
D.K. Ahner et al. Optimal multi-stage allocation of weapons to targets using adaptive dynamic programming. Optimization Letters (2015)
R.K. Ahuja et al. Exact and heuristic algorithms for the weapon-target assignment problem. Operations Research (2007)
M. Alighanbari et al. A robust approach to the UAV task assignment problem. International Journal of Robust and Nonlinear Control (2008)
M. Angalakudati et al. Business analytics for flexible resource allocation under random emergencies. Management Science (2014)
Bellingham, J., Tillerson, M., Richards, A., How, J. P., (2003). Multi-task allocation and path planning for...
Calinescu, G., Chakrabarti, A., Karloff, H., Rabani, Y., (2002). Improved approximation algorithms for resource...
D.A. Castanon et al. Model predictive control for dynamic unreliable resource allocation. Proceedings of the 41st IEEE conference on decision and control (2002)
P. Chandler et al. Multiple task assignment for a UAV team. Proceedings of the AIAA guidance, navigation, and control conference (2002)
X. Chao et al. Resource allocation in multisite service systems with intersite customer flows. Management Science (2003)
J. Chen et al. Evolutionary decision-making for the dynamic weapon-target assignment problem. Science in China Series F: Information Sciences (2009)
A.E. Eiben et al. Introduction to Evolutionary Computing (2007)
A. Ernst et al. Exact solutions to task allocation problems. Management Science (2006)
N. Gulpinar et al. Robust team decision making under uncertainty. International Journal of Applied Decision Sciences (2010)
P. Haslum et al. Heuristic planning with time and resources. Proceedings of the European conference on planning (2014)
S.A. Hong et al. Decomposition-based optimal market-based planning for multi-agent systems with shared resources. Proceedings of the workshop and conference proceedings (2015)
S. Koenig et al. Sequential bundle-bid single-sale auction algorithms for decentralized control. Proceedings of the international joint conference on artificial intelligence (2007)
N. Liu et al. Multi-resource allocation scheduling in dynamic environments. Manufacturing and Service Operations Management (2013)

Innovative Applications of O.R.
Heuristics for the stochastic dynamic task-resource allocation problem with retry opportunities