Foraging theory for autonomous vehicle speed choice

doi:10.1016/j.engappai.2008.10.017

Engineering Applications of Artificial Intelligence

Volume 22, Issue 3, April 2009, Pages 482-489


Abstract

We consider the optimal control design of an abstract autonomous vehicle (AAV). The AAV searches an area for tasks that are detected with a probability that depends on vehicle speed, and each detected task can be processed or ignored. Both searching and processing are costly, but processing also returns rewards that quantify designer preferences. We generalize results from the analysis of animal foraging behavior to model the AAV. Then, using a performance metric common in behavioral ecology, we explicitly find the optimal speed and task processing choice policy for a version of the AAV problem. Finally, in simulation, we show how parameter estimation can be used to determine the optimal controller online when the density of task types is unknown.

Introduction

Consider a vehicle that searches through a territory for tasks, and let the probability of task detection depend on vehicle speed (e.g., sensor bandwidth or hysteresis requirements may prevent detections at high search speeds). On detecting a task, the vehicle may choose to process it. Both searching and processing are costly (e.g., monetary cost of depleted vehicle fuel), and search costs depend on search speed. However, the designer rewards the vehicle for processing. Hence, good vehicle control policies will set search speed and determine tasks to process in a way that balances rewards and costs.

Our vehicle description fits a variety of robotics, queueing, and other engineering applications well (e.g., Quijano et al., 2006; Passino, 2002, 2005), and so we refer to it as an abstract autonomous vehicle (AAV). The AAV is a generalization of the solitary forager from optimal foraging theory (Stephens and Krebs, 1986). Optimal foraging theorists assume that natural selection results in behaviors that optimize Darwinian fitness, and so they use analysis of proximate fitness functions to explain observed behaviors in nature. Lately, methods from optimal foraging theory have been used to analyze human behavior in nature (e.g., Rowcliffe et al., 2004) and on the Internet (Pirolli and Card, 1999; Pirolli, 2005, 2007). Following the example of Andrews et al. (2004, 2007), we apply similar methods to AAV policy design.

Early foraging theory by Schoener (1971) postulates that optimal behaviors are found on a continuum with foraging time minimization and net energetic intake maximization at opposite ends. These strategies allow time for other activities (e.g., predator avoidance and reproduction) while preventing starvation. Later, Pyke et al. (1977) argue that lifetime energetic rate maximization is consistent with this idea, and Charnov and Orians (1973) develop several rate maximization models that continue to be used today (Charnov, 1976a, 1976b; Stephens and Krebs, 1986). As discussed by Houston and McNamara (1999), rate maximization is equivalent to minimizing the opportunity cost of each foraging decision. In our context, foraging is equivalent to task processing, and so net energetic intake is equivalent to net reward gained. Hence, we develop AAV policies that maximize long-term net rate of reward.

The present work combines and extends work by Andrews et al. (2004), which extends foraging theory to engineering, and Gendron and Staddon (1983), which modifies standard foraging models to include speed effects. Our AAV model subsumes those models while also including more general search and processing costs. Under mild assumptions on detection probabilities, we give an analytical solution for the optimal AAV behavior in an environment with an arbitrary number of task types. Additionally, we use computer simulation to show how online parameter estimation leads to convergence to the optimal AAV policy. While this more general AAV model can be useful in ecological analysis, we focus on its application to engineering design.

The remainder of this paper is organized as follows. In Sections 2 and 3, the foraging-based AAV model is presented. Analytical optimization results are given in Section 4, and these results are validated using results from a fixed-wing airborne vehicle simulation in Section 5. Finally, in Section 6, we make some concluding remarks and suggestions for future work.

Section snippets

Simple AAV model

First, we present an AAV model that does not explicitly depend on search speed. This model is based upon the prey model described by Stephens and Krebs (1986). The solitary AAV moves through an environment searching for tasks drawn from a known set of task types. Tasks from each type are found according to a Poisson process (i.e., the AAV faces a merged Poisson process). When a task is encountered, the AAV decides whether to process or ignore it. Processing each task takes time but rewards the
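As a rough illustration of the prey model that this section builds on, the following Python sketch computes a long-term net rate of reward for a chosen task pool and selects the pool with the classic profitability-ranking rule. The rate expression (expected net reward per unit search time, minus a search cost rate, divided by one plus expected processing time per unit search time) is the standard prey-model form, assumed here rather than taken from the truncated text. The names `net_rate`, `optimal_pool`, `gains`, `times`, and `search_cost` are hypothetical, and `gains` is assumed to already be net of processing cost.

```python
from typing import List, Sequence, Tuple


def net_rate(rates: Sequence[float], gains: Sequence[float],
             times: Sequence[float], pool: Sequence[int],
             search_cost: float = 0.0) -> float:
    """Long-term net reward per unit time when only types in `pool` are processed.

    rates[i]: Poisson encounter rate of type i per unit search time (assumed known).
    gains[i]: net reward for processing one task of type i (assumption).
    times[i]: processing time for one task of type i, must be positive.
    """
    numerator = sum(rates[i] * gains[i] for i in pool) - search_cost
    denominator = 1.0 + sum(rates[i] * times[i] for i in pool)
    return numerator / denominator


def optimal_pool(rates: Sequence[float], gains: Sequence[float],
                 times: Sequence[float],
                 search_cost: float = 0.0) -> Tuple[List[int], float]:
    """Rank types by profitability gains[i]/times[i] and add them greedily.

    A type is added only while its profitability exceeds the rate obtained
    from the types already in the pool.
    """
    order = sorted(range(len(rates)),
                   key=lambda i: gains[i] / times[i], reverse=True)
    pool: List[int] = []
    best = net_rate(rates, gains, times, pool, search_cost)
    for i in order:
        if gains[i] / times[i] <= best:
            break  # every lower-ranked type would lower the rate as well
        pool.append(i)
        best = net_rate(rates, gains, times, pool, search_cost)
    return pool, best
```

For example, `optimal_pool([0.5, 1.0, 2.0], [10.0, 4.0, 1.0], [2.0, 2.0, 2.0], search_cost=0.5)` keeps only the most profitable type, because the second type's profitability (2.0) is below the rate already achieved by the first type alone (2.25).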

Speed-dependent AAV model

We assume that the AAV speed is described with the following parameters:

• $u_{\min} \in \{x \in \mathbb{R} : x \geq 0\}$ : the minimum possible AAV speed given in length units per time units,
• $u_{\max} \in \{x \in \mathbb{R} : x \geq u_{\min}\}$ : the maximum possible AAV speed given in length units per time units, and
• $u \in [u_{\min}, u_{\max}]$ : constant speed of the AAV given in length units per time units.

In the following, we enhance the simple AAV model by integrating the speed $u$ into existing AAV parameters.

Optimization of net rate over speed and task choice

Stephens and Krebs (1986) show that the optimal task pool is determined by the encounter rates of different task types in that pool. Because encounter rate varies with speed, the optimal task pool will vary with speed. However, because Eqs. (12) through (17) are functions of the task pool, the optimal AAV speed varies with the task pool. So, solving for optimal speed and optimal task pool cannot be done separately; these two problems must be solved simultaneously.
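The coupling between speed and task pool can be illustrated numerically. The Python sketch below is not the analytical solution referred to in this section; it is a brute-force grid search over speed that recomputes the best task pool at every candidate speed. The functional forms are assumptions made only for illustration: an exponentially decaying detection probability `detection_prob`, a quadratic search cost `search_cost`, and encounter rates equal to density times speed times detection probability. All function and parameter names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def detection_prob(u: float, k: float = 0.5) -> float:
    """Assumed detection probability that decays with speed u."""
    return math.exp(-k * u)


def search_cost(u: float, c0: float = 0.1, c1: float = 0.05) -> float:
    """Assumed search cost per unit time that grows with speed u."""
    return c0 + c1 * u * u


def pool_and_rate(rates: Sequence[float], gains: Sequence[float],
                  times: Sequence[float], cost: float) -> Tuple[List[int], float]:
    """Best task pool and net rate for fixed encounter rates (ranking rule)."""
    order = sorted(range(len(rates)),
                   key=lambda i: gains[i] / times[i], reverse=True)
    num, den = -cost, 1.0
    pool: List[int] = []
    for i in order:
        if gains[i] / times[i] <= num / den:
            break  # adding this or any lower-ranked type would not help
        pool.append(i)
        num += rates[i] * gains[i]
        den += rates[i] * times[i]
    return pool, num / den


def best_speed_and_pool(densities: Sequence[float], gains: Sequence[float],
                        times: Sequence[float], u_min: float, u_max: float,
                        steps: int = 200) -> Tuple[float, List[int], float]:
    """Grid search over speed; the task pool is re-optimized at each speed."""
    best = None
    for n in range(steps + 1):
        u = u_min + (u_max - u_min) * n / steps
        # Assumed speed-dependent encounter rate for each task type.
        rates = [d * u * detection_prob(u) for d in densities]
        pool, rate = pool_and_rate(rates, gains, times, search_cost(u))
        if best is None or rate > best[2]:
            best = (u, pool, rate)
    return best
```

Because the pool is recomputed inside the speed loop, the search treats speed and task pool as one joint decision rather than fixing either first. A finer grid (larger `steps`) trades run time for resolution in the speed estimate.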

Simulation

In a real scenario, an AAV may have limited information about its environment. Above, it is assumed that the task-type densities (i.e., the encounter rates) are known a priori. Here, the AAV estimates these densities online. Based on the estimates, the explicit solution for the optimal speed and task pool from Section 4 is used by the AAV. We show that this approach quickly converges upon the optimal task-type and speed choice for the environment, which validates the theory and demonstrates
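One simple way to estimate encounter rates online can be sketched as follows. The estimator used in the simulation is not given in this excerpt, so the Python code below falls back on the standard maximum-likelihood estimate for a Poisson process rate: the number of encounters of each type divided by the total search time. The class `EncounterRateEstimator` and the helper `simulate` are hypothetical names, and the simulated environment is a merged Poisson process with fixed true rates.

```python
import random
from typing import List, Sequence


class EncounterRateEstimator:
    """Running maximum-likelihood estimate of per-type Poisson encounter rates."""

    def __init__(self, n_types: int) -> None:
        self.counts = [0] * n_types
        self.search_time = 0.0

    def record(self, task_type: int, time_since_last: float) -> None:
        """Register one encounter after `time_since_last` units of searching."""
        self.search_time += time_since_last
        self.counts[task_type] += 1

    def rates(self) -> List[float]:
        """Current estimates: encounters of each type per unit search time."""
        if self.search_time == 0.0:
            return [0.0] * len(self.counts)
        return [c / self.search_time for c in self.counts]


def simulate(true_rates: Sequence[float], n_encounters: int,
             seed: int = 0) -> List[float]:
    """Draw encounters from a merged Poisson process and return the estimates."""
    rng = random.Random(seed)
    estimator = EncounterRateEstimator(len(true_rates))
    total = sum(true_rates)
    for _ in range(n_encounters):
        # Inter-encounter search time of the merged process is exponential.
        dt = rng.expovariate(total)
        # Each encounter is of type i with probability true_rates[i] / total.
        k = rng.choices(range(len(true_rates)), weights=true_rates)[0]
        estimator.record(k, dt)
    return estimator.rates()
```

In an online controller, the current output of `rates()` would be fed to the speed and task-pool optimization after each encounter, so the policy is refined as the estimates settle.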

Conclusions

Methods inspired by foraging theory can be used in the design of decision-making strategies for autonomous vehicles. We have shown that the classic prey model can be enhanced for engineering applications to include the impact of speed-dependent sensor limitations as well as speed-dependent fuel cost. The enriched prey model can be used to predict the optimal task-type choice policy and speed. When there is limited information (e.g., when the density of tasks is unknown) the information can be

Acknowledgments

We thank Professor Thomas A. Waite of the OSU Department of Evolution, Ecology, and Organismal Biology, for teaching us about behavioral ecology and for stimulating discussions on the use of that theory in engineering applications. We also appreciate the helpful comments of an anonymous reviewer.

References (21)

Charnov, E., 1976. Optimal foraging: the marginal value theorem. Theoretical Population Biology.
Andrews, B.W., Passino, K.M., Waite, T.A., 2004. Foraging theory for decision-making system design: task-type choice....
Andrews, B.W., Passino, K.M., Waite, T.A., 2007. Social foraging theory for robust multiagent system design. IEEE...
Bertsekas, D.P., 1995. Nonlinear Programming.
Charnov, E., 1976. Optimal foraging: attack strategy of a mantid. American Naturalist.
Charnov, E., Orians, G.H., 1973. Optimal foraging: some theoretical explorations. Ph.D. Thesis, University of...
Dubins, L.E., 1957. On curves of minimal length with a constraint on average curvature and with prescribed initial and terminal positions and tangents. American Journal of Mathematics.
Gendron, R.P., 1982. The foraging behavior of bobwhite quail searching for cryptic prey. Ph.D. Thesis, Duke...
Gendron, R.P., et al., 1983. Searching for cryptic prey: the effect of search rate. American Naturalist.
Houston, A., et al., 1999. Models of Adaptive Behavior.

There are more references available in the full text version of this article.


¹: Supported in part by the AFRL/AFOSR Collaborative Center of Control Science (Grant F33615-01-2-3154).
