Survey Paper
Grammar‐based autonomous discovery of abstractions for evolution of complex multi‐agent behaviours

https://doi.org/10.1016/j.swevo.2022.101106

Abstract

This paper presents a grammar-based evolutionary approach that facilitates autonomous discovery of abstractions for learning complex collective behaviours through manageable sub-models. We propose modifications to the genome structure of the evolutionary model and to the grammar syntax to represent abstractions in separate partitions of a genome. Two learning architectures, based on parallel and incremental learning, are proposed to derive abstractions automatically. Evaluations conducted in three different complex task environments indicate that the proposed approach, with either architecture, surpasses the performance of generic grammar-based evolutionary models by automatically identifying appropriate abstractions and generating more complex rule structures. The evolutionary process shows further performance improvements when scaffolded environments are used to train the models in increasingly complex settings across several stages. The results indicate that incorporating grammatical evolution with techniques to autonomously discover abstractions can facilitate solving complex problems of agent systems in real-world domains.

Introduction

Multi-agent systems (MASs) are often recognised as models that help study and solve complex and intricate problems in diverse application domains. Artificial intelligence (AI) techniques such as reinforcement learning (RL) [1] and evolutionary computing [2] are commonly used to make such systems efficient and effective in addressing challenging real-world problems. However, the increasing complexity of these problems has rendered many multi-agent models infeasible in application domains, owing to sub-optimal solutions and the prolonged learning times needed to design and train the models [3]. Further, such AI systems are often affected by modelling errors and by the dynamics, disturbances, and uncertainties of real-world environments [4].

As a means of addressing this challenge, techniques have been used to modularise the learning process into manageable sub-components. Approaches such as incremental learning [5], transfer learning [6], and iterative learning control [7] are used not only in the multi-agent domain but also in other applied and theoretical domains such as robotics, batch processing, and periodic disturbance rejection problems [8], [9], [10]. These techniques utilise abstractions to support approaching a problem through multiple intermediary steps instead of tackling the problem as a whole. An abstraction is a compact representation of knowledge about a task that helps in learning a related, more complex task [11]. The idea is inspired by the way humans solve problems: we exploit concepts and knowledge acquired in a simpler situation when learning tasks in a similar but more complex setting. In the context of multi-agent behaviour learning, we define an abstraction as a self-contained modular representation derived from the task or the environment associated with a problem. It is a lossy compression of an originally complex representation, and multiple such simpler compressed components combine to form the complete solution to the problem.

However, existing models that utilise such abstractions rely on assumed knowledge of the problem at hand. That is, the abstractions or intermediary steps of learning are decided manually, and the agents are forced to learn behaviours addressing a pre-determined set of sub-problems. This approach does not give agents the flexibility to explore other routes that may be more efficient in reaching the final goal but are not easily comprehensible to human cognition. Although manually configured abstractions can help an artificial learner approach a challenging task, they may not lead to optimal behaviours or may take longer to reach the final goal [12].

Our work focuses on this limitation of existing AI models that utilise abstractions. We identify that, given enough flexibility over the learning process, an automated learning mechanism has the potential to derive its own set of abstractions. These abstractions may not be immediately evident to human comprehension, but they can deliver the expected level of performance. Therefore, this paper proposes an alternative technique, an automated evolutionary computing mechanism, to derive the representations of sub-behaviours needed to achieve complex tasks.

We adopt a novel grammar-based evolutionary model with modifications to its genome structure to support the autonomous evolution of abstractions. Grammatical Evolution (GE) [13] is an evolutionary mechanism inspired by genetic programming. It adopts a genome encoding mechanism that represents behaviour rule structures in binary form, which can be mapped to the behaviour syntax using a grammar. We incorporate GE in an automated mechanism that distributes the learning load across more manageable sub-components. In contrast to existing models, we investigate techniques that give more flexibility over the process to the learning algorithm itself, reducing the bias associated with assumed knowledge of the modules and allowing the evolutionary model to autonomously decide the number and functions of the abstractions used in reaching the original complex task. In doing so, we propose modifications to the original GE model: a genome partitioning technique to facilitate evolution of abstractions in separate partitions; two novel learning architectures based on parallel and incremental learning; and abstraction derivation techniques based on decomposition of the behaviour and environment into simpler components.
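To make the GE genotype-to-phenotype mapping and the proposed genome partitioning concrete, the sketch below shows one way such a mapping could look in Python. It is illustrative only: the toy behaviour grammar, the integer codon representation, the wrapping limit, and the equal-size partitioning are assumptions made for this example and are not the grammar or implementation used in the paper.

```python
# Minimal sketch of grammar-based genotype-to-phenotype mapping with a
# partitioned genome (illustrative; grammar, codon width, and partitioning
# scheme are assumptions, not the authors' implementation).

# A toy behaviour grammar: each non-terminal maps to alternative expansions.
GRAMMAR = {
    "<behaviour>": [["<rule>"], ["<rule>", "<behaviour>"]],
    "<rule>":      [["if", "<condition>", "then", "<action>"]],
    "<condition>": [["obstacle_ahead"], ["neighbour_near"], ["at_subgoal"]],
    "<action>":    [["move_forward"], ["turn_left"], ["turn_right"], ["broadcast"]],
}

def map_genome(codons, start="<behaviour>", max_wraps=2):
    """Map integer codons to a rule string using the grammar
    (standard GE mapping: codon mod number-of-choices picks a production)."""
    output, stack, i, wraps = [], [start], 0, 0
    while stack:
        symbol = stack.pop(0)
        if symbol not in GRAMMAR:          # terminal symbol: emit it
            output.append(symbol)
            continue
        if i >= len(codons):               # wrap the genome if codons run out
            if wraps >= max_wraps:
                return None                # invalid individual
            i, wraps = 0, wraps + 1
        choices = GRAMMAR[symbol]
        expansion = choices[codons[i] % len(choices)]
        i += 1
        stack = list(expansion) + stack
    return " ".join(output)

def map_partitioned_genome(genome, n_partitions):
    """Split one genome into equal partitions and map each partition to its
    own sub-behaviour (one abstraction per partition)."""
    size = len(genome) // n_partitions
    parts = [genome[k * size:(k + 1) * size] for k in range(n_partitions)]
    return [map_genome(p) for p in parts]

# Example: three partitions, each encoding one abstraction's rules.
genome = [4, 3, 2, 1, 6, 0, 1, 2, 2, 5, 0, 3]
print(map_partitioned_genome(genome, n_partitions=3))
```

In this sketch each partition is mapped independently, so every partition can encode the rules of one abstraction while the grammar constrains all partitions to produce syntactically valid behaviour rules.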

The following contributions are elaborated in the subsequent sections of the paper, which describe the techniques and the evolutionary procedure employed in addressing these issues:

1. A grammar-based evolutionary model that can autonomously evolve abstractions within a MAS is proposed to achieve complex goals.

2. Abstraction derivation techniques are proposed based on two factors: decomposing the agent behaviour into multiple components (behaviour decomposition); and decomposing the environment that the agents behave in (environmental scaffolding).

3. Two learning architectures are investigated for evolving abstractions: incremental learning (learning abstractions incrementally) and parallel learning (learning all abstractions simultaneously); a sketch contrasting the two appears after this list.

4. The performance of evolved behaviours is compared against generic evolutionary models, where abstractions are inspired by handcrafted subgoals rather than being automatically derived, in three different complex task settings.
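As a rough illustration of contribution 3 (with contribution 2's subgoal-based fitness assumed to be available as given functions), the following Python sketch contrasts the two architectures. The evolution step, mutation, and fitness placeholders are invented for the example and do not reproduce the paper's algorithm.

```python
import random

def evolve_one_generation(population, fitness_fn):
    """Placeholder evolutionary step: score the genomes, keep the better half,
    and refill the population with mutated copies of the survivors."""
    scored = sorted(population, key=fitness_fn, reverse=True)
    survivors = scored[: len(scored) // 2]
    children = [[codon ^ random.getrandbits(1) for codon in parent]
                for parent in survivors]
    return survivors + children

def parallel_learning(populations, subgoal_fitnesses, generations):
    """Parallel architecture: every abstraction's partition population evolves
    in every generation, each against its own subgoal fitness."""
    for _ in range(generations):
        populations = [evolve_one_generation(pop, fit)
                       for pop, fit in zip(populations, subgoal_fitnesses)]
    return populations

def incremental_learning(populations, subgoal_fitnesses, generations):
    """Incremental architecture: abstractions are learnt one after another,
    and earlier partitions are frozen once their stage finishes."""
    per_stage = generations // len(populations)
    for stage, fit in enumerate(subgoal_fitnesses):
        for _ in range(per_stage):
            populations[stage] = evolve_one_generation(populations[stage], fit)
    return populations

# Example usage with two abstractions, toy genomes, and placeholder fitnesses
# (purely for demonstration).
populations = [[[random.randrange(16) for _ in range(8)] for _ in range(10)]
               for _ in range(2)]
subgoal_fitnesses = [lambda g: -sum(g), lambda g: sum(g)]
evolved = parallel_learning(populations, subgoal_fitnesses, generations=20)
```

The parallel variant applies selection pressure to every abstraction's partition in each generation, while the incremental variant freezes earlier abstractions once their stage ends, which is the form that pairs naturally with staged, scaffolded environments.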

We conduct experiments with three different complex tasks designed to evaluate the abstraction learning model. The proposed evolutionary model that autonomously evolves abstractions is evaluated against multiple criteria: (1) comparing the performance of the model with and without restrictions on the number of abstractions it can automatically generate; (2) comparing the performance against generic evolutionary models and against models where genome partitioning is used but without autonomous evolution; (3) evaluating the contribution of the different abstraction derivation techniques based on behaviour and environment decomposition; and (4) evaluating the complexity of the derived behaviours under different conditions.

The experimental evaluations demonstrate that the proposed GE model, incorporated with the novel learning architectures, has the potential to autonomously identify suitable behaviour modules that meet complex task requirements with higher performance than generic evolutionary models. Both parallel and incremental learning architectures achieved similar results, demonstrating equal competency in autonomously evolving behaviours to solve complex tasks. The models demonstrated higher performance when the learning process was supported by environmental scaffolding, as they tend to explore a wider range of rule structures with diverse and higher complexity levels when scaffolding is used. Further, the rule structures evolved with the proposed autonomous abstraction techniques are more complex than those of the generic models that do not use abstractions. This indicates that the proposed model has the capacity to generate more intricate behaviours that can address multifaceted problems encountered in real-world domains.

The rest of the paper is organised as follows. Section 2 reviews the existing literature related to the use of abstraction learning and grammar-based evolution. Section 3 introduces the problem statement that is addressed within this paper with regard to learning complex multi-agent behaviours through abstractions. The derived grammar-based evolutionary framework is presented in Section 4 and the parallel and incremental learning architectures that are used with the evolutionary framework are introduced in Section 5. Section 6 illustrates the experimental setups and the evaluation results. Finally, Section 7 concludes the paper with a discussion of the results and possible future directions.

Section snippets

Related work

This section elaborates on the limitations of existing AI techniques that motivated the autonomous derivation of abstractions and the adaptation of a grammatical evolution based learning model.

Problem statement

In this section we introduce the specific problem that is addressed within this paper. We use the term goal to refer to the state that the agent system is required to be in to achieve a successful outcome. Subgoals are intermediate targets that should be achieved by the agent system in order to reach the ultimate goal state. The goal and subgoals are out of the control of agents; rather, they are part of the environment they are performing in. A sub-problem refers to an abstraction (a simpler

Proposed grammar‐based evolutionary framework

Figure 1 illustrates how the autonomously discovered abstractions are utilised in generating complex multi-agent behaviours.

If a direct evolutionary approach were used, it would evolve the behaviour rules in a single genome targeting the global objective function F(Ψ_G) for the goal state G in environment E_τ. But with the proposed autonomous abstraction discovery approach, the evolutionary process is given the flexibility to decompose the global behaviour among multiple rule components by
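As a rough illustration of this contrast (the composition operator and subgoal symbols below are assumed notation for the sketch, not the paper's formal definition), the single-genome and partitioned formulations can be written as:

```latex
% Direct approach: one genome optimises the global objective directly.
\max_{\Psi_G} \; F(\Psi_G \mid E_\tau)

% Partitioned approach: evolution chooses k sub-behaviours (abstractions)
% \Psi_{g_1}, \ldots, \Psi_{g_k}, each evolved in its own genome partition,
% whose composition (denoted \oplus here) addresses the goal state G.
\max_{\Psi_{g_1}, \ldots, \Psi_{g_k}} \;
  F\!\left(\Psi_{g_1} \oplus \cdots \oplus \Psi_{g_k} \,\middle|\, E_\tau\right)
```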

Proposed learning architectures

The evolutionary algorithm discussed in the previous section is implemented with two abstraction learning architectures: parallel learning and incremental learning. Two techniques are used to guide the evolutionary process: one based on subgoals identified through behavioural decomposition; and one based on both subgoals and increasingly complex environments, identified through behavioural decomposition and environmental scaffolding. When behavioural decomposition is used, the fitness function is

Experimental evaluations

The experimental evaluations are focused on three main areas of interest: the performance of the proposed learning architectures in automatically discovering abstractions; the impact and implications of the automatically derived modules and the use of behavioural decomposition and environmental scaffoldings to support the evolutionary process; and the complexity of the rule structures evolved.

Discussion and conclusion

This paper explored a novel mechanism for autonomous discovery of abstractions based on a grammar-based evolutionary approach. In contrast to other multi-agent automation techniques, GE has the unique ability to evolve complete rule structures reducing human intervention in the rule generation process. A combination of this ability and the proposed modifications to the evolutionary model and proposed abstraction learning architectures has shown potential to autonomously identify suitable

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (51)

  • S.J. Pan et al., A survey on transfer learning, IEEE Trans Knowl Data Eng (2010)
  • S. Arimoto et al., Bettering operation of robots by learning, J Robot Syst (1984)
  • H. Tao et al., Robust PD-type iterative learning control for discrete systems with multiple time-delays subjected to polytopic uncertainty and restricted frequency-domain, Multidimens Syst Signal Process (2021)
  • R.W. Longman, Iterative learning control and repetitive control for engineering practice, Int J Control (2000)
  • E.A. Mcgovern et al., Autonomous discovery of temporal abstractions from interaction with an environment (2002)
  • A.L. Christensen et al., Incremental evolution of robot controllers for a highly integrated task
  • C. Ryan et al., Grammatical evolution: Evolving programs for an arbitrary language
  • O. Buffet et al., Incremental reinforcement learning for designing multi-agent systems, Proceedings of the Fifth International Conference on Autonomous Agents (2001)
  • J. Xu et al., Learning multi-agent coordination for enhancing target coverage in directional sensor networks, Adv Neural Inf Process Syst (2020)
  • A. Ma et al., Hierarchical reinforcement learning via dynamic subspace search for multi-agent planning, Auton Robots (2020)
  • H.T. Nguyen et al., A hierarchical deep deterministic policy gradients for swarm navigation, 2019 11th International Conference on Knowledge and Systems Engineering (KSE) (2019)
  • M.A. Montes de Oca et al., Incremental social learning in particle swarms, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) (2011)
  • T. Shibata et al., Sensor-based behavior using a neural network for incremental learning in family mobile robot system, Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN'94) (1994)
  • Y. Bengio et al., Curriculum learning, Proceedings of the 26th Annual International Conference on Machine Learning (2009)
  • J. Pugh et al., Parallel learning in heterogeneous multi-robot swarms, 2007 IEEE Congress on Evolutionary Computation (2007)