Pattern Recognition Letters

Volume 32, Issue 3, 1 February 2011, Pages 423-431

Abnormality detection using low-level co-occurring events

https://doi.org/10.1016/j.patrec.2010.10.008

Abstract

We propose in this paper a method for behavior modeling and abnormal event detection that uses low-level features. In conventional object-based approaches, objects are identified, classified, and tracked to locate those with suspicious behavior. We instead proceed directly with event characterization and behavior modeling using low-level features. We first learn statistics about co-occurring events in a spatio-temporal volume in order to build the normal behavior model, called the co-occurrence matrix. The notion of co-occurring events is defined using mutual information between motion label sequences. In the second phase, the co-occurrence matrix is used as a potential function in a Markov random field framework to describe, as the video streams in, the probability of observing new volumes of activity. The co-occurrence matrix is thus used for detecting moving objects whose behavior differs from those observed during the training phase. Interestingly, the Markov random field distribution implicitly accounts for the speed, direction, and average size of the objects without any higher-level intervention. Furthermore, when the spatio-temporal volume is sufficiently large, the co-occurrence distribution contains the average normal path followed by moving objects. Our method has been tested on various indoor and outdoor videos representing different challenges.
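To make the notion of co-occurring events more concrete, the sketch below estimates the empirical mutual information between two binary motion-label sequences observed at two pixel sites of a training video. The function name, the binary-label assumption and the threshold `tau` are illustrative assumptions, not values or code taken from the paper.

```python
import numpy as np

def mutual_information(x, y, eps=1e-12):
    """Empirical mutual information between two binary motion-label sequences.

    x, y: 1-D arrays of 0/1 motion labels observed at two pixel sites
    over the same frames of a training video.
    """
    x = np.asarray(x, dtype=int)
    y = np.asarray(y, dtype=int)
    n = len(x)
    # Joint and marginal label distributions estimated by counting.
    p_xy = np.zeros((2, 2))
    for a in range(2):
        for b in range(2):
            p_xy[a, b] = np.sum((x == a) & (y == b)) / n
    p_x = p_xy.sum(axis=1)
    p_y = p_xy.sum(axis=0)
    # I(X;Y) = sum_{x,y} p(x,y) log( p(x,y) / (p(x) p(y)) )
    mi = 0.0
    for a in range(2):
        for b in range(2):
            if p_xy[a, b] > 0:
                mi += p_xy[a, b] * np.log(p_xy[a, b] / (p_x[a] * p_y[b] + eps))
    return mi

# Two sites could then be declared "co-occurring" when their mutual
# information exceeds a threshold (tau below is a placeholder value).
tau = 0.05
```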

Research highlights

► A method for behavior modeling and abnormal event detection which uses low-level features.
► The notion of co-occurring events is defined using mutual information.
► The co-occurrence matrix is used as a potential function in a Markov random field framework to describe the probability of observing new volumes of activity.
► The Markov random field implicitly accounts for speed, direction and average size of the objects without any higher-level intervention.

Introduction

In this paper, we present a low-level location-based approach for activity analysis and abnormality detection. In several traditional approaches (e.g. Hu et al., 2004), moving objects are first detected, analyzed and then tracked. Subsequently, behavior models are built based on object tracks, and non-conformant ones are deemed abnormal. The main problem with this approach is that, in the case of complex environments, object extraction and tracking are performed directly on cluttered raw video or motion labels. We propose performing activity analysis and abnormal behavior detection first, followed possibly by object extraction and tracking. If the abnormal activity is reliably identified, then object extraction and tracking focus on regions of interest (ROIs) and are thus relatively straightforward. A question arises: how can abnormalities be reliably identified from raw video?

Some approaches have been proposed to perform such low-level abnormality detection (Adam et al., 2008, Jodoin et al., 2008). Nevertheless, we point out that these methods process each pixel independently and thus ignore correlation across space and time. These correlations may be important not only for reducing false alarms and misses, but also for detecting abnormal event sequences, such as a person dropping a bag or a car making an illegal U-turn. In our method, we account for these scenarios through spatio-temporal models. Although this model is simple, it nonetheless produces interesting results.

Section snippets

Previous work

Video analytics can be divided into two broad families of approaches, namely shape/pattern-recognition-based methods and machine-learning-based methods. The shape/pattern-recognition approaches are typically those for which the type of activity or object is known a priori. Examples of such methods include facial recognition systems (Zhao et al., 2003, Hu et al., 2009), restricted-area access detection (Konrad, 2005), car counting (Friedman and Russell, 1997), detection of people carrying

Context

Although many video analytics methods use motion labels only in the early stages of processing (mainly to locate moving objects), we argue that they carry fundamental information about the content of the scene and thus can be used to perform high-level tasks. Motivated by this perspective, some authors have already shown that low-level motion labels can be used to summarize videos (Pritch et al., 2008), recognize human movements (Bobick and Davis, 2001) and detect abnormalities (Jodoin et al., 2008).

Our method

In this section, we present how, for a given site s, a co-occurrence matrix and its associated statistical model can be estimated from a training video sequence. Our statistical model is a Markov random field (MRF) model that accounts for the likelihood of the co-occurrences. Since we account for normal scenarios in which objects follow typical paths, these paths manifest themselves as spatio-temporal dependencies across pixels, as shown in Eq. (1). Our location-based approach for modeling
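To illustrate how the training and detection phases could fit together, the following sketch accumulates co-occurrence counts between a reference site and its spatio-temporal neighbourhood over a training sequence, and then scores a newly observed volume of activity with a Gibbs-style energy. The function names, the normalisation and the exact energy form are assumptions for illustration only and do not reproduce the paper's MRF formulation.

```python
import numpy as np

def train_cooccurrence(motion_labels, center, half_sizes):
    """Accumulate co-occurrence counts between a reference site s and its
    spatio-temporal neighbourhood over a training sequence.

    motion_labels : (T, H, W) array of binary motion labels
    center        : (row, col) of the reference site s, assumed far enough
                    from the image borders for the volume to fit
    half_sizes    : (dt, dh, dw) half-sizes of the spatio-temporal volume
    """
    T, _, _ = motion_labels.shape
    dt, dh, dw = half_sizes
    r, c = center
    counts = np.zeros((2 * dt + 1, 2 * dh + 1, 2 * dw + 1))
    for t in range(dt, T - dt):
        if motion_labels[t, r, c] == 0:
            continue  # accumulate only when the reference site itself is active
        counts += motion_labels[t - dt:t + dt + 1,
                                r - dh:r + dh + 1,
                                c - dw:c + dw + 1]
    return counts

def volume_energy(counts, observed_volume, eps=1e-6):
    """Gibbs-style energy of a newly observed volume of activity: activity at
    sites that rarely co-occurred with s during training contributes a large
    energy, flagging a potential abnormality."""
    probs = counts / (counts.max() + eps)
    return -np.sum(observed_volume * np.log(probs + eps))
```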

Experimental results

We present in this section some results obtained on various indoor and outdoor sequences representing different challenges. For each sequence, a co-occurrence matrix of size ranging between 130 × 70 × 300 and 210 × 210 × 150 has been used. The size of the co-occurrence matrix is chosen so that a typical normal activity is entirely included in the volume. The reader should note that since the matrix size stays fixed for the entire process, it has to be set only once while setting up the system. The
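As a minimal illustration of this one-time setup step, the helper below (hypothetical, not from the paper) picks a single fixed volume size large enough to enclose the spatial and temporal extent of typical normal activities observed during training; the margin factor and example extents are placeholders.

```python
def choose_volume(activity_extents, margin=1.2):
    """activity_extents: list of (duration, height, width) bounding extents of
    typical normal activities; returns one fixed (dt, dh, dw) volume size."""
    max_t = max(e[0] for e in activity_extents)
    max_h = max(e[1] for e in activity_extents)
    max_w = max(e[2] for e in activity_extents)
    return (int(max_t * margin), int(max_h * margin), int(max_w * margin))

# e.g. choose_volume([(250, 60, 110), (280, 65, 120)]) yields one volume size
# that is then kept fixed for the entire process.
```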

Conclusion

We propose in this paper a method to perform behavior modeling and abnormality detection based on low-level characteristics. We use the spatial and temporal dependencies between motion label vectors obtained with simple background subtraction. To do this, we build a Markov random field model parameterized by a co-occurrence matrix. Although simple, this matrix contains the average behavior observed in a training sequence. It also implicitly contains information about direction, speed and size

References (31)

  • T. Chen et al., Computer vision workload analysis: Case study of video surveillance systems, Intel Technology Journal (2005).
  • E. Ermis et al., Activity based matching in distributed camera network, IEEE Transactions on Image Processing (2010).
  • N. Friedman et al., Image segmentation in video sequences: A probabilistic approach, International Conference on Uncertainty in Artificial Intelligence (1997).
  • I. Haritaoglu et al., W4: Real-time surveillance of people and their activities, IEEE Transactions on Pattern Analysis and Machine Intelligence (2000).
  • W. Hu et al., A survey on visual surveillance of object motion and behaviors, IEEE Transactions on Systems, Man, and Cybernetics – Part C: Applications and Reviews (2004).