Introduction

The present work proposes a model based on Markov Logic Networks (MLNs) [16] for representing emergency situations involving smoke and fire on offshore petroleum platforms. The model is tested for two important situations, FIRE and EVACUATE. In the FIRE situation, a fire, indicated by smoke, is observed at some place on the platform, and all workers need to muster at their primary muster station. In the EVACUATE situation, the fire has escalated so that some escape routes to the primary muster station are blocked, and all personnel need to muster at the lifeboat or alternative muster station. The purpose of this work is to provide a model that can be used by a software agent so that the agent can exhibit human-like situation awareness (SA). Such agents can subsequently be used, for example, in training simulators to enrich trainees’ experience by showing them various scenarios in which the agent recognizes different situations and makes decisions accordingly. A participant can learn from the agent what information is important in a given scenario for correct SA.

Representing the emergency response of agents operating in a virtual environment (VE) is a challenging and active research area. Emergencies on board can arise from several factors, among which accidents rank highest [28]. The Cullen Report [10] following the Piper Alpha disaster clearly recommends that operators perform a risk assessment of the ingress of smoke or gas into the accommodation areas. Klein [31] notes that VE training is important for the crew in many respects, for example because trainees get opportunities to learn from and about each other as a team, and to learn about the cues that unfold in an evolving training scenario. Thus, a VE has an essential role as a training environment, and agents are important elements of VE fidelity [36].

Situations are highly structured parts of the world that span a limited space and time, and people talk about them using language. They are composed of objects having properties such that the objects stand in relations with one another [4]. An agent’s world can be considered as a collection of situations, and the agent should be able to discriminate among them. Devlin [14] extends Barwise and Perry’s Situation Theory [3, 5] and proposes a representation using a concept called infon, which is an informational item of the form “objects a1,…, an do/do not stand in the relation P”. A situation, formally, is then some part of the world that is supported by a set of infons.

This work considers SA as a phenomenon that refers to the flow of information [13] from a situation to a subject such that the subject can reason about the situation. Endsley’s [17] model of human SA describes this information flow as a process with three successive levels. Level-1 begins when a person starts perceiving information as environmental cues. This part of Endsley’s SA model directly resembles acquiring information about the presence of the objects a1, …, an for developing the relevant infons in a situation. Level-2 in Endsley’s model explains that the person should be able to extract meaning from what has already been perceived. Level-3 of the model says that the meaning of cues should enable a person to anticipate what may happen in the near future. Kokar et al. [32] developed an ontology, called the situation theory ontology (STO), that defines semantics for situation theory by including a meta-class describing the types of things (individuals, their properties, and the relations among them) that constitute a situation as a type, in accord with Barwise and Devlin’s situation semantics. Inference on the available facts (infons), together with background knowledge about the objects and their relations within the ontological framework, not only supports level-2 of Endsley’s SA model but also gives the potential to achieve level-3 SA. For example, only if an agent knows that a fire lit in an oil container should not be put out with water can the agent prevent somebody from doing so. For that, the agent should project the current information about the position of the fire and the water source approaching the oil container into a future state using a rule that exploits some predicate like fireEscalates(oil, water). STO satisfies many characteristics of Endsley’s SA model, and it was implemented in the Web Ontology Language (OWL) using the full profile (OWL-Full). However, OWL was revised in 2009, and support for OWL-Full, which is required to fulfill the theoretical requirements of Barwise and Devlin’s approach to situation modeling, is no longer available, so STO is difficult to use as a platform for modeling SA.

The concept of context in the literature related to artificial intelligence (AI) is similar to that of the situation in the SA literature. Sowa [60, 61] uses conceptual graphs (CGs) to represent contexts or situations. CGs are an extension of Peirce’s existential graphs (c. 1882) with features taken from the semantic networks of AI and linguistics. CGs are bipartite graphs in which boxes represent concepts and circles represent relations. As a simple example, the situation “Cat is on mat” can be represented in a CG using the linear notation [Cat] → (On) → [Mat], where Cat and Mat are two concepts (each for one object/individual in the real world) related to each other by the relation On. Sowa [61] and Akman and Surav [1] treat context and situation as the same notion. Kokar et al. [32] report that contexts (situations) in AI are dealt with using predicates such as ist(c, p), meaning that the proposition p holds true in the context c.

Predicates are the building blocks of systems based on First-Order Logic (FOL). CGs are computationally equivalent to FOL [61]. Rules in FOL are hard constraints, in the sense that a world is considered possible only if every rule holds in it. This is contrary to situations in real life. A rule like “smoking causes cancer” is always valid in FOL, so an agent that smokes certainly has cancer. That is not how the real world works: there, rules are violated, and a violation merely limits the frequency of cases in which the rule is observed to hold.

Domingos and Lowd [15] argue that treating FOL rules as hard constraints limits progress in AI research, and they offer a method to turn them into soft rules using MLNs. Soft rules are formed by assigning weights to FOL rules in an MLN. A weight determines how strongly the rule constrains the possible worlds: the higher the weight, the closer the rule comes to a hard constraint. The present work uses MLNs to construct a model of situations in emergency scenarios, particularly those arising on offshore petroleum platforms. The purpose is to create software agents for training in VEs, where an agent exploits environmental cues to understand different emergency situations. In this way, the agent can be given the ability to build a repertoire of the situations it observes. Such agents can be expected to make experience-based decisions when exposed to emergencies in a solo or group training environment. Applications of such agent models can be found in many fields, including pilot behavior modeling during midair encounters [24], game programming, and so on.
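
As a concrete illustration, using the smoking rule from the preceding paragraph rather than a rule from the emergency model developed later, the hard FOL rule

$$\forall x\;{\text{Smokes}}\left( x \right) \Rightarrow {\text{Cancer}}\left( x \right)$$

becomes in an MLN the weighted pair (Smokes(x) ⇒ Cancer(x), w). Other things being equal, a world that violates n groundings of the formula is \(e^{wn}\) times less probable than one that violates none, so the weighted rule merely biases the distribution over possible worlds instead of eliminating the violating worlds outright.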

Being aware of a situation is not merely the outcome of a typical feature-matching mechanism, as some authors suggest [43]. Awareness helps in categorizing things according to certain common grounds. In other words, recognizing a situation should mean, first, modeling the situation using a knowledge representation schema and, second, devising a mechanism whereby inference can be performed on the stored knowledge to extract new knowledge. Since MLNs support inference, even on incomplete data, the resulting model of SA has some resemblance to Endsley’s SA model. Moreover, because MLNs allow conflicting rules, they are a more natural choice for modeling situations in which cues at different times and places can take on different meanings.

Social agents can interact with human participants during an emergency egress scenario to form a group-training situation, learn from the human responses, and then guide other computing modules in evaluating those responses. Participants can also learn from these agents how to respond in a scenario. The use of such agents in training exercises reduces the need for a large number of real people in large-scale group training [40]. Also, rehearsals with agents are more effective than rehearsals with human counterparts because of the consistent, usually scripted, agent behavior. A more realistic approach is to replace the scripted agent behavior with more natural, human-like behavior so that a participant can trust the agent’s responses and may consider it a colleague rather than a robot. The works in [11, 12] focus on route learning for agents and propose a model in which an agent can exhibit behavior similar to that of a human participant while learning a new escape route. Risks associated with human responses during an evolving emergency are assessed in [42]. The authors assert that hazards (such as fires and smoke), weather conditions, malfunctioning equipment, and inadequate emergency preparedness, such as that related to the recognition of platform alarms, are important factors that affect the human response. Musharraf et al. [38] propose a methodology to account for individual differences in agent modeling for emergency response training. Modeling SA for such agents is yet another important area that has potential implications for the way agents make decisions in evolving emergencies. Chowdhury [8] explores various situations that occur on offshore rigs, platforms, and installations, and explains how fire and evacuation situations are indicated on different platforms.

“Previous works” section describes some recent work in situation awareness. “A method to model situation awareness” section describes the proposed methodology to model SA based on MLNs. “Case studies: SA during offshore emergency scenarios” section describes a case study and the experimental results that serve to assess the validity of the proposed model. “Results and discussion” section contains a discussion of the results, and “Conclusions” section presents concluding remarks and future directions.

Previous works

With the increasing demand for intelligent systems, ranging from smart cars to smart homes, situation recognition has become a focal point of research because of its importance in enabling artificial intelligence. Récopé et al. [52] attempt to discover the reasons for interindividual differences in volleyball players’ defensive behavior during identical situations. The authors raise an important question: “Might other dimensions of situation assessment, which have so far not been studied to any great extent, be involved?” Based on an experiment involving two volleyball teams, the authors conclude that an individual’s activity is governed by a specific norm that organizes, orients, and enhances understanding of the actions as a coherent totality. In other words, there is a subconscious sensemaking that individuals use to determine the relevance of cues corresponding to different situations.

To assess network security within the Internet of Things (IoT), Xu et al. [70] propose an ontology-based SA model. Again, ontological knowledge helps identify concepts and relations in order to understand what type of situation is currently being observed. An IoT security situation is described by employing knowledge about the context, attack, vulnerability, and network flow. A model of how SA spreads among agents in a multiagent system is presented in [6]. Nasar and Jaffry [41] study this work [6] and extend it, using Agent Based Modeling (ABM) and Population Based Modeling (PBM) techniques, by incorporating trust into the SA model; the resulting agents’ beliefs and decisions about the environment are shown to be affected by their trust in other agents. Johnson et al. [27] address the issue of decreased SA when the flight control mode changes from automatic to manual. The authors propose a cognitive model based on a “perceive-think-decide-do” scheme that estimates the effects of a change in the flight mode on operator behavior. The primary contribution of the proposed model is an attention executive module, which is responsible for detecting changes in attention on specific control loops based on changes in priorities. The authors of [30] develop a model that processes social media posts by clustering consistent posts so that a user can gain better insights by reading the different views (or world views) that the system generates. This approach is not specifically aimed at modeling SA for agents; however, people can better assess a situation described through posts by reading the world views generated for those posts on Twitter or any other social media platform that exploits the proposed technique.

Yang et al. [71] develop a probabilistic model that lets a robot take on a role that would otherwise have been fulfilled by a human in the same situation. Situations are classified as easy, medium, or hard. The model takes 2D and 3D images as input; the robot first determines its role and then decides on actions according to that role and the situation recognized through the images. Roles are recognized by fusing the results of two indicators, distance-based inference (DBI) and knowledge-based inference (KBI). DBI uses the relative distance between humans and mission-critical objects to determine the probability of a possible role. KBI uses a Bayesian network that integrates human actions and object existence to determine a possible role. The final role is determined by fusing DBI and KBI using an information entropy measure. The actions of the person detected as the target, because he or she is carrying the mission-critical object, are a major contributor to changes in the situation. Situation levels are determined from the target person’s actions (moving, stationary) and the relative positions of several mission-related entities at some time t using a Bayesian network. Actions are decided based on the situation level and the inferred role. The proposed approach is robust in recognizing roles because of the fusion of different inference results, but it is useful only if the situations to be encountered are of fundamentally the same type, so that they can be classified as easy, medium, or hard. For example, what would a robot do if the situation is complex, as in an offshore emergency where the environment is cluttered with many objects, crew, alarms, exit signs, announcements, and so on? In such conditions, different situations are possible, and classifying a situation as easy, medium, or hard seems an idealistic assumption. Hu et al. [24] developed a model based on the recognition-primed decision model for predicting pilot behavior during midair encounters. Features extracted from the environment are compared with the stored attributes of situations, and an already encoded situation is retrieved using a Bayesian classifier as a similarity criterion.

Naderpour et al. [39] developed a cognition-driven SA support system for safety-critical environments using Bayesian networks. The system consists of four major components that deal with (1) receiving cues from the environment, (2) assessing the situation based on a dynamic Bayesian network and a fuzzy risk estimation method, (3) recovering from a situation by advising measures to reduce its risk, and (4) providing an interface for better interaction with people. Another study [59] categorizes maritime anomalies, such as the speeding of a vessel, according to the levels of the JDL data fusion model [35]. Szczerbak et al. [63] use conceptual graphs to represent ordinary real-world situations and introduce a method to reason about similar situations. Liu et al. [34] propose a three-layer information fusion model for event recognition in a smart space: sensory data are collected in the first layer, context is represented as an MLN in the second layer, and the third layer maps the contextual information of the second layer to the corresponding events. To fuse uncertain knowledge and evidence, Snidaro et al. [58] develop an MLN-based SA model for maritime events.

Gayathri et al. [19] use an MLN to develop an ontology that can be used to recognize activities in smart homes. The purpose is to detect an abnormal activity (or situation) and inform the remote caretaker. Using a technique called Event Pattern Activity Modeling [20], observations collected through sensors are parsed into concepts in an ontology, and the relevant description logic rules are generated. These rules are then converted into FOL equivalents, and weights are assigned to the FOL rules to develop the MLN-based activity model. Given the observations from the sensors, the MLN activity model can be used to suggest different interpretations of the observed data in a probabilistic sense. The use of MLNs enables the representation of cyclic dependencies among rules, which is a major advantage of MLNs over Bayesian networks.

A method to model situation awareness

Let S be a countable set and ℘(S) the set of all subsets of S, where the points of S are sites, each of which can be either empty or occupied by an object (such as a formula in a logical framework, or a particle as it appears in the statistical mechanics literature). The sites of S can be represented by binary variables X1, X2, …, Xn. A subset Λ ∈ ℘(S) is regarded as describing a situation when the points of Λ are occupied and the points of S − Λ are not. The elements of ℘(S) are sometimes called configurations. The set S, representing the sites, may have some additional structure. As sites are connected, S can be considered as forming an undirected graph G [48], so the points of S are the vertices of some finite graph G(S, E), where E is the set of edges. The present work involves modeling a probability measure (defined in the following subsections), restricted to the sample space Σ = {0, 1}^S, having a kind of spatial Markov property given in terms of the neighbor relations of G [22], called a Markov random field [25, 29, 45].

Definition

G(S, E) is countable and contains no multiple edges or loops. If x, y ∈ S and there is an edge of the graph G between x and y, then x and y are considered neighbors of each other [48]. Formally, the function f: S × S → {0, 1} is given by

$$f\left( {x, y} \right) = \left\{ {\begin{array}{ll} {1} & {{\text{if}}\;x\;{\text{and}}\;y\;{\text{are}}\;{\text{neighbors}},} \\ {0} & {{\text{otherwise}}} \\ \end{array} } \right.$$
(1)

Definition

If Λ ∈ ℘(S) then the boundary ∂Λ ∈ ℘(S) is defined as:

$$\partial \Lambda = \{y \in S -\Lambda | f\left( {x, y} \right) = 1,\; {\text{for}}\;{\text{some}}\;x \in \Lambda \}$$
(2)
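
As a minimal illustration of Eqs. (1) and (2), the neighbor function and the boundary can be written in Python as follows; the sites and edges below are arbitrary toy values, not part of the emergency model developed later.

S = {"x1", "x2", "x3", "x4"}                    # sites (vertices of G)
E = {("x1", "x2"), ("x2", "x3"), ("x3", "x4")}  # undirected edges; no loops or multiple edges

def f(x, y):
    # Eq. (1): 1 if x and y are neighbors in G, 0 otherwise
    return 1 if (x, y) in E or (y, x) in E else 0

def boundary(lam):
    # Eq. (2): sites outside lam that have at least one neighbor inside lam
    return {y for y in S - lam if any(f(x, y) == 1 for x in lam)}

print(boundary({"x1", "x2"}))                   # {'x3'}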

A Markov network (MN) is composed of G and a set of potential functions ϕk. G has a node for each variable, and the MN has a potential function for each clique in G. A potential function is a non-negative real-valued function of the configuration, or state, of the variables in the corresponding clique. The joint distribution of the variables X1, X2, …, Xn can then be written as a product of clique potentials, which captures the influence of a site, i.e., a variable, on its neighbors [50]:

$$P\left( {X = x} \right) = \frac{1}{Z}\mathop \prod \limits_{k} \phi_{k} \left( {x_{\left[ k \right]} } \right)$$
(3)

where x[k] is the configuration of the kth clique, i.e., the values of the variables in the kth clique, and Z is the partition function for normalization, \(Z = \mathop \sum \nolimits_{x \in \Omega } \mathop \prod \nolimits_{k} \phi_{k} \left( {x_{\left[ k \right]} } \right)\).
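
The following short Python sketch computes Eq. (3) for a toy configuration of three binary sites and two cliques; the potential function is invented purely for illustration and has nothing to do with the emergency model.

from itertools import product

variables = ("X1", "X2", "X3")
cliques = [("X1", "X2"), ("X2", "X3")]

def phi(values):
    # a toy non-negative potential that favors agreement within a clique
    return 2.0 if len(set(values)) == 1 else 1.0

def unnormalized(x):
    # product of clique potentials, i.e., Eq. (3) before dividing by Z
    p = 1.0
    for clique in cliques:
        p *= phi(tuple(x[v] for v in clique))
    return p

states = [dict(zip(variables, vals)) for vals in product((0, 1), repeat=len(variables))]
Z = sum(unnormalized(x) for x in states)        # partition function over {0, 1}^S
print(unnormalized({"X1": 1, "X2": 1, "X3": 1}) / Z)   # P(X = (1, 1, 1)) = 4/18 for this toy model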

Markov Logic Network

Because a random variable assigned a value can be considered a proposition [23], Domingos and Richardson [16] define an MLN by first considering the variables to be rules/formulas in FOL. Unlike in FOL, a formula in an MLN is assigned a weight (a real number), not just a Boolean true or false. Formally, an MLN L is defined as a set of pairs (Fi, wi), where the Fi are FOL formulas and the wi are the weights assigned to them.

If C = {c1, c2, …, c|C|} is the set of constants used to ground the predicates (yielding the facts), then L induces a Markov network ML,C such that the probability distribution over possible worlds x is given by:

$$P\left( {X = x} \right) = \frac{1}{Z}\exp \left( {\mathop \sum \limits_{i} w_{i} n_{i} \left( x \right)} \right) = \frac{1}{Z}\mathop \prod \limits_{i} \phi_{i} \left( {x_{\left[ i \right]} } \right)^{{n_{i} \left( x \right)}}$$
(4)

where ni(x) is the number of true groundings of Fi in x, x[i] is the state or configuration (i.e., the truth assignments) of the predicates in Fi, and \(\phi_{i} \left( {x_{\left[ i \right]} } \right) = e^{{w_{i} }}\).
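
To make Eq. (4) concrete, the following Python sketch enumerates the possible worlds of a tiny MLN with a single weighted formula, Smokes(x) ⇒ Cancer(x), over two constants; this toy knowledge base echoes the smoking example used earlier and is not the emergency model of Table 2.

import math
from itertools import product

constants = ("A", "B")
w = 1.5                                   # weight of the single formula Smokes(x) => Cancer(x)

def n_true_groundings(world):
    # n_i(x): number of constants for which the grounded implication holds in world x
    return sum(1 for c in constants
               if (not world[("Smokes", c)]) or world[("Cancer", c)])

atoms = [(p, c) for p in ("Smokes", "Cancer") for c in constants]
worlds = [dict(zip(atoms, vals)) for vals in product((False, True), repeat=len(atoms))]
Z = sum(math.exp(w * n_true_groundings(x)) for x in worlds)   # partition function

def P(world):
    # Eq. (4): probability of a possible world under this single-formula MLN
    return math.exp(w * n_true_groundings(world)) / Z

x = {("Smokes", "A"): True, ("Cancer", "A"): False,           # constant A violates the rule
     ("Smokes", "B"): False, ("Cancer", "B"): False}
print(P(x))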

The FIRE and EVACUATE emergency situations

Fire and evacuation are among the most important types of emergencies that occur on offshore petroleum installations [62]. Chowdhury [8] describes various emergencies, such as fire/blowout, evacuation, and H2S release, and the types of alarms used on different offshore rigs. A fire may erupt for many reasons, such as a gas release near an ignition source or an electrical spark near a fuel line. Explosions also result in fires. In any case, if a fire event occurs, a fire alarm is raised, and people on board must leave their work and report to their designated muster station, which is usually their primary muster station. This type of situation is called a FIRE situation, and it ends when an all-clear alarm sounds, which means that the fire has been taken care of and people can return to their duties. If a FIRE situation escalates, meaning that the fire spreads and blocks various paths so that personnel safety could be further compromised, an EVACUATE situation may come into effect, and this new situation is communicated to people by another alarm, different from the fire alarm. In the EVACUATE situation, people must report to their designated secondary muster station, the lifeboat station, from where the final evacuation of the platform can proceed.

Knowledge representation of emergency situations

An interesting aspect of modeling a situation is identifying the factors that lead to the situation of interest. Typically, a situation involves preconditions or events, some of which are observable and some of which are not directly visible [58]. Since MLNs are based on FOL rules, the basic methodology described in [15, 16], and followed here, requires developing FOL rules, then assigning the weights, and finally performing the required inference. Nonetheless, there is no straightforward way of writing FOL rules for a knowledge domain. Writing FOL rules requires experience and thorough domain knowledge. The developed FOL rules must also fulfill some criteria of acceptance. For example, a rule like “smoking causes cancer” has been given serious attention among medical practitioners [9] since the constitution of a study group in 1957 [56]. The group was appointed by several institutes, including the National Cancer Institute, and it concluded, after considering the scientific evidence, that cigarette smoking is a causative factor in the rapid increase in the incidence of human epidermoid carcinoma of the lung.

Figure 1 shows the proposed methodology, which applies the basic steps of constructing an MLN iteratively so that each rule can be judged against some heuristic criteria of acceptance, for example by learning the rule weights from empirical data using a learning algorithm [53] and then checking whether the weights make sense. If many of the rules come out negatively weighted, such a knowledge base will have little practical value, and one must look into the training samples and/or the rules themselves. In the former case, it is possible that the training sample includes little evidence of cases where the rules were successful. In the latter case, it is possible that the rules were not constructed correctly with regard to the specification of the predicates, their connection through logical connectives, and the consequent they imply. In short, one must go back and update the rules and/or the training–testing data sample, as shown in Fig. 1, until the desired results are obtained. The choice of learning algorithm is also a point to consider. Since discriminative learning does not model dependencies between inputs within the training sample, it often produces better results than generative learning techniques [53]. Using the testing samples as evidence, the probability that a query predicate holds is estimated by employing an inference mechanism such as the MC-SAT algorithm [46].

Fig. 1 The proposed methodology to develop a situationally aware agent model based on MLN

Table 1 lists the variables studied in this work for SA about the situations discussed in “The FIRE and EVACUATE emergency situations” section: the FIRE situation, which requires all personnel to move to the primary muster station, and the EVACUATE situation, in which the fire escalates into a larger fire that obstructs the primary escape route leading to the primary muster station, thereby necessitating re-routing to the alternative (lifeboat) muster station. A set of FOL rules is proposed in Table 2 so that an agent can recognize these situations in the way a human counterpart recognizes them. The preconditions (the antecedents of the FOL rules) used here are common among experts and have been suggested in earlier studies [8, 18, 49, 55, 57, 62, 64–66, 68]. The query predicates determine the probability of recognizing alarms, of having a FIRE situation, of having an EVACUATE situation, and of having some (unknown) situation given the evidence predicates.

Table 1 Variable/predicate names and description
Table 2 The FOL rules constituting the knowledge base for basic emergency preparedness

Reasoning

The variability in the emergency alarm systems and indicators used at different offshore installations is a source of confusion when a real emergency occurs, especially for personnel who frequently move from one platform to another to perform special tasks. Alarm recognition is considered a major contributor to awareness of the emergency type [8]. Different alarms mean different situations requiring different courses of action by the personnel on board. The scope of the present work is limited to SA and does not extend to finding a suitable course of action in case of an emergency. Recognition of an alarm is something that cannot be observed directly unless the person is asked, so further factors that indicate that an alarm has been recognized are required. An alarm cannot be recognized if it is not heard, and listening requires attention to the alarm signal [51]. Emergency alarm signals are so loud that it is hard not to hear them, but that does not mean that people will always recognize which situation the present alarm is for. An agent can exploit rule #1 in Table 2 to express the behavior of not recognizing an alarm if, for any reason, such as the inertial tendency of people to keep doing what they are doing [69], the agent does not listen to it. Several studies [49, 65] show that people do not automatically start evacuating a building or moving to a muster location when they hear alarms unless they are trained to do so, and there are other factors or cues that lead them to act as needed in that situation.
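
The exact formulation of rule #1 is the one given in Table 2; read together with the ground atoms reported later (e.g., L(P1G1, PAPA, t1) and R(P1G1, GPA, t0)), it can plausibly be read as the hard constraint

$$\neg {\text{L}}\left( {ag,al,t} \right) \Rightarrow \neg {\text{R}}\left( {ag,al,t} \right)$$

i.e., an alarm that is not listened to during an interval cannot be recognized during that interval.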

Rule #2 uses two more factors, besides just listening, to frame the conclusion of recognizing an alarm. The first factor reflects a person’s ability to develop the intention of moving to the required muster station. The required muster station is referred to by the variable mloc, which takes values from the set {MESSHALL, LIFEBOAT}. The literature shows that intention is an important cognitive state that affects one’s ability to participate in a decision-making process [7, 64]. Intention is modeled here as a predicate HITR that takes the value true if the agent develops the intention to move to mloc during a time interval t. An agent’s intention can be inferred by observing which route is taken immediately after listening to the alarm. The agent can also be delayed in developing the intention to reach mloc and may require other cues to build up this intention. Therefore, to know whether an alarm is recognized without the help of other cues, such as observing smoke, it is necessary to know when the agent develops the intention of moving to the required muster station after listening to an alarm. HITR is used in conjunction with the predicate BST, which ensures that the intention of moving to the muster location is developed before seeing a threat; if an agent sees a threat, it would be unclear whether its intention of moving to mloc is due to the threat or to the alarm. The probability of recognizing the alarm is determined by the conjunction of the three predicates; if any of the antecedent predicates fails, the chances of recognizing the alarm are reduced.
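
Under the same hedged reading (the authoritative formula and argument order are those of Table 2), rule #2 combines the three antecedents described above, roughly as

$${\text{L}}\left( {ag,al,t} \right) \wedge {\text{HITR}}\left( {ag,mloc,t} \right) \wedge {\text{BST}}\left( {ag,al,t} \right) \Rightarrow {\text{R}}\left( {ag,al,t} \right)$$

so that listening to the alarm, intending to move to the required muster station, and doing so before seeing a threat jointly raise the probability that the alarm is recognized.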

The variable ST (see Table 1) is used to indicate that the agent observes a threat. An agent that sees a threat (such as smoke or a blowout) is highly likely to discover the type of emergency involved (FIRE or EVACUATE). Rules #3 and #4 say that an agent will be aware of ‘some’ emergency if it merely listens to an alarm or observes a threat.

Public address (PA) announcements are also important cues for learning the details of a developing situation [8, 18, 62, 68]. PAs are verbal announcements with clear words detailing the situation, including the location of a threat or hazard, what actions are needed, and what areas are affected. The agent can take advantage of the PA to learn about a developing emergency. However, this requires focusing on the words of the PA. The literature on distraction explains how people get distracted in different situations. Tutolo [66] says that children’s ability to listen without being distracted improves with age. Inattention to available information has been studied for the offshore drilling environment in [57]. The authors discuss other factors, such as stress, that influence the focus of attention by producing a narrowing or tunneling effect, so that under some stressors a person is left focusing on only a limited number of cues. Tversky and Kahneman [67] call this cognitive tunnel vision. The predicate HFO is true when the agent has focus on a PA being uttered. An agent that is engaged in any activity other than what is communicated in the PA is defined to have no focus, whereas one that suspends its current engagements and begins performing the actions according to the PA is considered to have focused on the PA. Similarly, if an agent, while moving, suddenly changes its course because of instructions given in the PA a moment before, this is also considered a clear sign of responding to the PA. In general, such gestures can be observed to determine whether an agent is focused on an ongoing PA. The predicate FPA is used to represent the requirement of following the PA. If HFO is true but FPA is false, it means that, although the agent focused on the PA’s words, it is confused or lacks an understanding of the situation and is therefore unable to follow the PA. Rule #5 is a disjunction of three different rules: the first determines SA about the emergency based on focus on and understanding of the PA, the second uses direct exposure to the threat/hazard, and the third is based on the recognition of alarms. The last disjunct in rule #5 uses the predicate KETA to link an alarm to the corresponding situation or emergency type, because that link is needed to conclude the consequent predicate HSES. Rules #6 and #7 ensure that FIRE and EVACUATE are two distinct types of situations, even though an EVACUATE situation may occur because of a fire [8, 62].

Rule # 8 says that if during some initial time interval t0 a FIRE situation is observed, and during some later interval t1 (where t0 \(\prec\) t1) this situation escalates to EVACUATE, then the FIRE situation will no longer exist during t1, although one may witness real fires during the EVACUATE situation.
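
With the HES groundings used later (e.g., HES(P1G1, FIRE, t0)), and with the ordering t0 ≺ t1 expressed by a suitable predicate over time intervals, rule #8 can plausibly be read as

$${\text{HES}}\left( {ag,{\text{FIRE}},t_{0} } \right) \wedge {\text{HES}}\left( {ag,{\text{EVACUATE}},t_{1} } \right) \wedge \left( {t_{0} \prec t_{1} } \right) \Rightarrow \neg {\text{HES}}\left( {ag,{\text{FIRE}},t_{1} } \right)$$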

Case studies: SA during offshore emergency scenarios

This work uses two case studies, developed using the experiment performed in [55], to acquire training and testing data for SA during offshore platform egress scenarios, so that the proposed model (Table 2) can be judged against empirical data. The objective of Smith’s experiment was to assess the effect of VE training on people’s ability to learn and respond during offshore egress scenarios involving fire hazards. The distribution of participant training and performance testing is shown in Fig. 2. The experiment targeted six learning objectives: (1) establish spatial awareness of the environment, (2) routes and mapping, (3) emergency alarm recognition, (4) continually assess the situation and avoid hazards en route, (5) register at the temporary refuge, and (6) general safe practices, such as closing doors when an emergency alarm is in effect due to a fire or smoke hazard. There were three sessions of increasing complexity. Session 1 (S1) involved training, practice, and testing for learning objectives 1, 2, 5, and 6; session 2 (S2) used scenarios involving learning objectives 3, 5, and 6; and session 3 (S3) targeted objectives 3, 4, 5, and 6. The experiment involved 36 participants divided into two groups: Group 1 contained 17 participants and Group 2 contained 19. Group 1 was trained in several sessions, whereas Group 2 participants received only a single training session. The VE used in this experiment was the All-hands Virtual Emergency Response Trainer (AVERT), a research simulator of an offshore petroleum facility used to train participants to improve their response should they face an emergency such as a fire or an explosion. The present work uses only the third and fourth learning objectives because they deal with the SA the participants exhibited during each scenario. The data were obtained by carefully reading the log files and watching the replay videos of session S3 recorded for each participant during the testing phase of the relevant scenarios.

Fig. 2 Training exposure of the participants across sessions S1, S2, and S3. The datasets are obtained from S3 for both groups (Source: adapted from [55])

Situations in experimental scenarios

Smith’s experiment [55] involves emergencies in which, initially, there is a fire in the galley. After some time, the fire escalates so that the primary muster station, the mess hall on deck A of the platform, becomes compromised. An audible fire alarm (the General Platform Alarm, GPA), followed by the relevant PA, is raised right after the initial fire event. The escalation of the fire in the galley to a fire in the mess hall is then announced by a Prepare to Abandon Platform Alarm (PAPA), followed by another PA. Initially, a participant is situated in their cabin (see the floor map in Fig. 3-1) when the GPA activates, followed by a platform announcement. The PA directs the participant to muster at their designated muster station, which is the mess hall on A-deck for a FIRE situation. Upon hearing the GPA, the participant needs to move out of the cabin and choose between the primary route (the solid lines, which go through the main stairwell) and the secondary escape route (the dotted lines, which use the external stairwell) to reach A-deck. The participants had been trained to deal with these situations earlier, using escape-route training videos and instructions in training session S1. While the participant is moving toward the mess hall, after a fixed interval of time t0, a call to abandon the platform is made. This is the PAPA alarm, which indicates to the participants that they should immediately move to the secondary or alternative muster location, the lifeboat station on the starboard side of the platform (see Fig. 3-2). The interval from the activation of the PAPA to the end of the scenario is termed t1. Thus, t0 is the time interval in which the participants receive all cues related to the FIRE emergency, such as smoke in the stairwell, the GPA alarm, and the PA announcement that includes the words “fire in the galley”. Similarly, t1 is the time interval that starts when t0 expires and ends at the end of the scenario. During t1, the participant receives cues related to an EVACUATE situation. The PAs use clear words about what needs to be done in an emergency and what parts of the escape route are expected to be blocked due to fire or smoke. Although the GPA and PAPA are activated at different times, indicating two different situations, the other environmental cues can be observed at any time during their lifetimes. For example, smoke in the main stairwell is considered a cue for a FIRE situation, yet some participants reached this spot in the main stairwell after the PAPA was activated. Situations like these are complex because of the confusion caused by conflicting cues.

Fig. 3 Floor map for decks A and C in the AVERT simulator. A participant starts from the cabin (S) in part (1) and ends either at the mess hall or at the lifeboat station in part (2), using the external stairwell or the main stairwell. The dotted lines show the alternate route, and the solid lines show the primary route

Data set for training and testing the model

Empirical dataset (D1)

The empirical dataset D1 comprises the data collected from the 17 participants in Group 1. For brevity, the data from only two participants are shown in Table 3. Each predicate takes typed variables, so the corresponding ground atoms are shown in the second and third columns of the table. The dataset D1 is split into two parts. Following the methodology in Fig. 1, the model in Table 2 was trained with different training/testing ratios, such as 50/50, 60/40, and 80/20. Eventually, an 80/20 split of D1 was found to produce good results; that is, 80% of the data in D1 was used for training the rules in Table 2, and 20% was used for testing the model.
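
As a minimal sketch of such a split (the participant identifiers below are illustrative labels only, and the real split was applied to the collected records rather than generated this way):

import random

participants = [f"P{i}G1" for i in range(1, 18)]    # 17 participants in Group 1 (labels illustrative)
random.seed(0)                                       # fixed seed so the split is reproducible
random.shuffle(participants)
cut = int(0.8 * len(participants))                   # 80% of the sample for weight learning
train, test = participants[:cut], participants[cut:]
print(len(train), len(test))                         # 13 4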

Table 3 A sample of validation data for two participants, P1G1 and P2G1

Empirical dataset (D2)

The empirical dataset D2 comprises the data collected from all 19 participants in Group 2. Again following the methodology in Fig. 1, different sample sizes were tried for partitioning the dataset D2, and the 80/20 training/testing ratio was used here as well.

Setting up the model

We adopt the closed-world assumption for all predicates except KETA, KETT, and KETPA. These three predicates employ the open-world assumption because they are designed to act as containers for background knowledge in the model. KETA is true when the agent knows which alarm corresponds to which type of emergency situation, i.e., that the GPA alarm sounds for a FIRE type emergency and the PAPA alarm is activated for an EVACUATE type emergency. KETT captures which type of threat indicates which type of emergency. For example, a fire confined to a small area would, at most, require moving to the primary muster station. Three types of threats are considered in this study. The threat of smoke in the stairwell (SMK_STAI) should be recognized as a FIRE type emergency. If an agent sees smoke coming out of the mess hall vent (SMK_VENT), or enters the mess hall and sees smoke there (SMK_MSHA), the situation is of type EVACUATE because the primary muster station is compromised. If KETT is true, the agent knows the relationship between a threat and the type of emergency situation that could originate from it. Similarly, the predicate KETPA is true if the agent knows which words in the PA would indicate a particular emergency type. For example, the sentences “a fire in the galley” or “move to primary muster station” mean that the emergency type is FIRE, whereas the words “primary escape route is blocked” or “a fire has escalated” mean that the situation is EVACUATE. This knowledge was given to the participants of Smith’s experiment as part of the training curriculum. Therefore, during training of the model, the truth values of KETA, KETT, and KETPA are set to true, meaning that agents based on the proposed model possess this background knowledge.

Calculating the model weights

We use the software package Alchemy 2.0 [2] to develop the proposed MLN model. The non-evidence predicates used for both D1 and D2 are R, HES, and HSES. The model is trained separately on datasets D1 and D2 using a discriminative learning method so that weights can be assigned to the rules presented in Table 2. It was observed that some participants did not listen to an alarm even though it was audible. The use of Listens (L) as a predicate (see Table 2) arose from these empirical observations, in which the predicate takes a false value for some participants. If Hears had been used instead of Listens, there would not be any case with a false value for Hears, because all the participants had hearing abilities in the normal range. Similar considerations were applied to the other rules. Table 4 shows the weights. A portion of the ground MN obtained by grounding rules #2–5 is depicted in Fig. 4, which shows how the nodes corresponding to each predicate are related.
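
As background on the discriminative learning step (the equation below summarizes standard MLN practice rather than anything specific to the present model), the weights are chosen to maximize the conditional log-likelihood of the non-evidence predicates y given the evidence x, whose gradient with respect to each weight is

$$\frac{\partial }{{\partial w_{i} }}\log P_{w} \left( {y \mid x} \right) = n_{i} \left( {x,y} \right) - {\mathbb{E}}_{w} \left[ {n_{i} \left( {x,Y} \right) \mid x} \right]$$

where ni(x, y) is the number of true groundings of Fi observed in the training data and the expectation is taken over the query predicates under the current weights. A rule whose observed count stays below its expected count is pushed toward a negative weight, which is exactly the signal used in Fig. 1 to flag rules (or training data) for revision.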

Table 4 Weights assigned to rules using datasets D1 and D2
Fig. 4 A portion of the ground MN obtained by grounding the predicates in rules #2–5

Results and discussion

Querying the proposed MLN-based model of agent SA is the same as querying a knowledge base. We use the MC-SAT algorithm in the Alchemy inference engine for querying. If the model is used in an agent program as part of its situation assessment logic, the evidence would come from the available sensors. Given the evidence predicates, the agent can determine the probability that a query predicate is true under the present conditions. The most important things an agent seeks in an evolving emergency are the recognition of alarms and the determination of the type of emergency it is in at a given time. For this reason, the query predicates are obtained by grounding the following predicates:

(5)

where the predicate R is read as: the agent ag recognizes an alarm al during the time interval t. HES means that the agent ag is in an emergency e of type emgSitType during time t, and the predicate HSES represents an agent ag that has some sense of an emergency. If HSES is true and HES is false, the agent has sensed an emergency but is unable to determine its type. The ground atoms obtained from the predicates listed in Table 2, other than the query predicates in (5), are used as the evidence that must be provided to the inference engine to obtain the results of the queries in (5).
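
Conceptually, each such query asks for the conditional probability of a ground query formula F1 given the evidence F2; in the standard MLN formulation [16] this is

$$P\left( {F_{1} \mid F_{2} ,M_{L,C} } \right) = \frac{{P\left( {F_{1} \wedge F_{2} \mid M_{L,C} } \right)}}{{P\left( {F_{2} \mid M_{L,C} } \right)}} = \frac{{\sum\nolimits_{{x \in {\mathcal{X}}_{{F_{1} }} \cap {\mathcal{X}}_{{F_{2} }} }} {P\left( {X = x \mid M_{L,C} } \right)} }}{{\sum\nolimits_{{x \in {\mathcal{X}}_{{F_{2} }} }} {P\left( {X = x \mid M_{L,C} } \right)} }}$$

where \({\mathcal{X}}_{F_i}\) denotes the set of worlds in which Fi holds; MC-SAT approximates these sums by sampling instead of enumerating all possible worlds.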

Table 5 presents the probabilities estimated against the queries for the cases in the testing datasets. The test datasets were formed by taking 20% of the total samples from D1 and D2 respectively, as reported in “Data set for training and testing the model” section.

Table 5 Query results

With regard to the training and testing datasets for the model, the total duration each participant spends in a training or testing session is divided into two intervals. The first is the interval t0, which starts at the beginning of a session and lasts until the GPA alarm stops. The second interval, t1, follows immediately after t0 ends and lasts until the end of the session. t0 covers the period of the FIRE type emergency, and t1 covers the period of the EVACUATE type emergency. This division of time is important for assessing the importance of the cues relevant to each emergency type. For example, if an agent observes smoke in the central stairwell, this is an important cue for a FIRE type emergency, because in that case the agent should move to the primary muster station, the mess hall. On the other hand, smoke in the central stairwell should not be considered during t1, or when the PAPA alarm sounds, because the PAPA alarm is a call to gather at the secondary, or alternative, muster station, the LIFEBOAT station. Often in such cases, the primary muster station may have been compromised, or the routes that lead to it may have been blocked.

Table 5 presents the results obtained for seven participants: P1G1, P2G1, P3G1, P1G2, P2G2, P3G2, and P4G2. The names of these participants are withheld for privacy. The information obtained by watching the replay videos and examining the log files is divided into two columns: the predicates used as part of the evidence in the inference algorithm are kept under the heading of evidence, and those used to query the model are kept as empirical results. Both columns contain empirical observations from Smith’s experiment. The truth values of the empirical results are used to validate the model output, which is shown in the last column of Table 5.

Simulation results against the participant P1G1

Now consider the case in which the participant P1G1 was tested in AVERT. The evidence predicates suggest that, immediately after hearing the alarm, P1G1 developed the intention to move to the mess hall, the primary muster station, which was correct, but the participant spent more time than needed and so reached the mess hall when t0 had already expired. This also means that P1G1 recognized the GPA alarm, R(P1G1, GPA, t0), and developed awareness of the FIRE situation, HES(P1G1, FIRE, t0), during the initial time interval t0. But, being a slow mover, P1G1 observed the smoke in the stairwell, in the mess hall, and coming through the mess hall ventilation during t1. P1G1 also did not pay attention to the PAPA alarm, which was activated when P1G1 was still in the main stairwell; hence ¬L(P1G1, PAPA, t1). P1G1 took about 20 s more in t1, ignoring the fact that the PAPA alarm implies a re-route towards the lifeboat station through the secondary escape route. So, having missed the PAPA alarm and the relevant PA, P1G1 entered the mess hall and saw thick smoke. Studies [47, 54] suggest that humans give more weight to visual information than to other types of sensory cues, such as auditory information. Observing the smoke drew P1G1’s attention, and he instantly realized the need to move out of the mess hall, which he did by re-routing to the lifeboat station. But this realization of the situation came only when P1G1 saw the smoke; it was not due to the PAPA alarm or the relevant PA. In a real situation, entering an area filled with smoke due to fire or any other toxic element could be lethal. Also, observing a fire or smoke is a natural cue for developing awareness of a fire situation. It is, nevertheless, hard to develop awareness of an evacuation situation by watching a fire or smoke unless the relevant alarms and/or platform announcements are heard and recognized. This is why P1G1, although he mustered at the lifeboat station, is considered poor at responding to the evacuation situation, and why the empirical results for P1G1 contain ¬R(P1G1, PAPA, t1) and ¬HES(P1G1, EVACUATE, t1). Similarly, P1G1 spent a fraction of the interval t1 maintaining the impression of a fire situation, although the fire situation had already escalated to an evacuation situation, which is why the empirical results contain the predicate HES(P1G1, FIRE, t1). The model output consists of the probabilities obtained for the query predicates, as shown in the last column of Table 5.

Ideally, a high output probability is a good fit for a queried predicate when the corresponding empirical result has a truth value of true; similarly, a low output probability is a good fit when the empirical truth value is false. This is very much evident for P1G1. Given the listed evidence for P1G1, the probability that an agent would recognize the GPA is 0.91, and the probability that the same agent would gain immediate fire emergency awareness is 0.92. However, there is little chance (only 16%) that the agent would respond to the situation escalating from FIRE to EVACUATE, because the likelihood of recognizing the PAPA alarm is zero, as the agent does not listen to, or has no focus on, the sounding alarm. If we change the evidence truth value for predicate 1.10 in Table 5 from false to true, the corresponding probability of recognizing the PAPA during t1 increases from 0.0 to 0.48. The zero probability is due to the hard constraint (rule #1) listed in Table 4. Similarly, if P1G1 had noticed the smoke in the stairwell during t0 rather than t1, for example if P1G1 had moved faster, then the chance of a FIRE situation during t1 would have been lowered from 0.74 to 0.46, and the chance of gaining awareness of the EVACUATE situation during t1 would have increased from 16 to 23%. This is because SMK_STAI, i.e., seeing smoke in the stairs, is a positive cue for a fire situation, but when it is observed in the presence of a cue for an evacuation situation, such as a PAPA alarm, the two conflicting cues cause confusion, and the agent needs to decide which cue should be considered. P1G1 preferred SMK_STAI during t1 over the PAPA alarm and so entered the mess hall, although this decision was wrong, as it wasted egress time and exposed the participant to a hazard.

Simulation results against the participant P2G1

The case of participant P2G1 shows a slight deviation between the model output and the empirical results at only one place (see empirical result # 2.3 and corresponding model output probability in Table 5). The model output probability of keeping the impression of a fire situation, though the situation had turned into an evacuation situation, is a bit high (0.29) compared to the empirical result where the truth value of the involved predicate, HES(P2G1, FIRE, t1), was false. The rest of the model output probabilities, estimated for modeling P2G1’s behavior, are reasonable.

Simulation results against the participants P3G1 and P1G2

The only cue participant P3G1 took into consideration during t0 was the smoke coming out of the mess hall ventilation. P3G1 neither recognized the GPA alarm nor heeded the PA for the FIRE emergency, and never developed any intention to move to the mess hall. The model output for recognizing the GPA alarm (0.49) during t0 is reasonable because, at the time the GPA starts sounding, the participant is in the cabin and there are no available cues other than the alarm sound and the relevant PA. The model output probabilities are in good agreement with the empirical results, except for a slightly high value of 0.44 for the probability of awareness of the FIRE emergency during t0, whereas P3G1 remained unaware of the fire emergency and had decided from the beginning of the scenario to muster at the LIFEBOAT station. The results obtained against the evidence for participant P1G2 are all in good agreement with the empirical values.

Simulation results against the participant P2G2

Given the evidence for P2G2, the model recognizes the fire alarm during t0 with a probability of 0.87. P2G2 did not recognize the PAPA during the experiment, and the model output is 0.49 for the predicate R(P2G2, PAPA, t1). The reason for a probability near 0.5 is that, when the interval shifts from t0 to t1, there are only two cues suggesting that the situation has escalated from FIRE to EVACUATE (smoke from the vents and smoke in the mess hall), while the smoke in the stairwell is a cue for moving to the mess hall; the cues conflict. Moreover, as P2G2 moved into the mess hall while the PAPA alarm was still sounding along with the relevant PA, the predicate BST(P2G2, PAPA, t1) takes a false value in the evidence, which reduces the probability of recognizing the PAPA during t1 from 0.94 (if BST(P2G2, PAPA, t1) were true) to 0.49 when BST is false, as in the case of P2G2. Similar reasoning applies to recognizing the FIRE and EVACUATE situations during t1. If we set BST(P2G2, PAPA, t1) to true in the evidence for P2G2, the probabilities of awareness of the FIRE and EVACUATE situations become 0.94 for FIRE at t0 and 0.96 for EVACUATE at t1. This shows the importance of recognizing the alarm before seeing any real threat.

Simulation results against the participants P3G2 and P4G2

Participant P3G2 did not recognize the GPA alarm, and the model probability for the corresponding query predicate is 0.5, for reasons similar to those observed in the case of P3G1. The rest of the results for P3G2, as reported in Table 5, support the empirical results for P3G2. Similar reasoning applies to the results obtained for the query predicates for P4G2.

Conclusions

An MLN-based model of SA for agents in a VE is proposed in this work. The methodology used here involves assessing environmental and cognitive factors, such as alarms, fire/smoke, intention, and focus of attention, for their potential impact on awareness of emergencies. The proposed model has been used to represent two case studies that involve fire and evacuation situations on an offshore petroleum platform. The case studies were carried out in a VE with real people, and the data obtained from them are used to validate the model output. The empirical and simulated results agree in asserting the importance of alarm recognition and focus of attention for awareness of emergency situations involving smoke and fire.

Endsley’s SA model describes how people gain awareness of a situation, but it does not specify how such a model can be used for software agents [32]. The present work shows a potential approach to modeling SA for software agents. Agents based on this model can be used in several application areas. For example, one can exploit such agents so that different situations are treated as different experiences, and hence a repertoire of situations can be built as a basis for decision-making when choosing actions in a given situation. Virtual training environments are good examples of the use of such agents for cohort training, where agents based on the proposed methodology can exhibit different behaviors in different situations for training purposes. Due to the inherent stochasticity of the proposed approach, the model is dynamic, and it has an advantage over other models, such as ontology-based SA models [32, 33, 37] and case-based SA models [44], in that it can recognize a situation even if some of the FOL rules are violated.

This work has the potential to be used in Naturalistic Decision-Making (NDM) environments, where situations are central to decision making [21]. Another application is in intelligent tutoring, where the model can be used to build student models in a VE for training people in different SA tasks. Different kinds of agents can be developed for tutoring different behaviors, even without using training and testing samples, by manually selecting the weights [26]. For example, an agent with poor alarm-recognition capabilities should use a positive real number near zero as the weight for rule #2. Similarly, an agent that acts as an expert should have high weights on the rules, and the evidence database should contain as much of the needed information as possible so that the agent acts as an expert in retrieving cues from the environment.