Remotely reconfigurable hardware–software platform with web service interface for automated video surveillance

https://doi.org/10.1016/j.sysarc.2013.05.007

Abstract

This paper presents a reconfigurable hardware platform for general-purpose video processing. The proposed design is a step towards portable, web-enabled devices which, unlike most existing smart cameras, are to a large degree autonomous and have substantial intelligence embedded. The device is based on programmable logic, microcontrollers, and dedicated communication modules. It is able to acquire an input frame using a built-in camera, process the image in a massively parallel manner, and expose its functionality as a web service compliant with the Service Oriented Architecture (SOA) paradigm. Thanks to the FPGA technology used for main functionality implementation, the system’s hardware can be locally or remotely reconfigured in order to optimize it on a very low level for user-defined tasks. Moreover, the image processing core is implemented using a C-to-HDL compiler to reduce time to market when deploying new functionalities. We demonstrate the capabilities of our device on two example applications: moving object detection and face recognition.

Introduction

In recent years, video surveillance systems have rapidly become integral to our environment. They are now a common element of media infrastructure in public and commercial buildings, on urban streets and highways, as well as on private properties. The main use of video surveillance systems is to enhance physical security in selected areas by facilitating visual detection of events such as trespassing, acts of vandalism, or other incidents of an offensive or criminal character. However, they can also be used as general-purpose systems for environment monitoring, e.g. access control, motion sensing, vehicle or pedestrian traffic density measurement, or automated billing of car park users.

The aim of this work is twofold: first, to present a novel, complete FPGA-based architecture of a smart camera that can be successfully used in the aforementioned surveillance systems for general-purpose environment monitoring, and second, to cast light on various aspects of its implementation. The proposed solution, described in the remainder of this paper, is motivated by the need to create a reconfigurable and autonomous embedded device that, unlike existing designs, would be able to handle relatively complex image processing tasks directly on chip and expose the processing results through a SOA-compliant web service.

The rest of this paper is organized as follows. The remainder of this section discusses the prior art on embedded video surveillance and briefly introduces our solution. In Section 2 the architecture of the proposed smart camera is presented. Section 3 describes the image data acquisition and the hardware and software infrastructure within which the data are processed. In Section 4 details of the network communication subsystem are given. Section 5 discusses the mechanism of remote reconfiguration, which allows the web services deployed on our device to be dynamically replaced to meet changing requirements. Notes on the achieved functionality of the example services and their performance are given in Section 6. Finally, the paper is summarized in Section 7.

Aghajan and Cavallaro in [1] present a comprehensive introduction to multi-camera network systems. That work covers a broad spectrum of general computer vision issues while strongly stressing the collaboration aspects of multiple cameras at various abstraction layers. For instance, at the image processing level this collaboration may refer to building a digital model of the monitored scene. At the system architecture level, on the other hand, it may refer to general data processing paradigms, e.g. distributed or centralized. In this paper, we present the concept of a “smart camera”, understood here as a node in a multi-camera network with distributed data processing intelligence. It is capable of image acquisition and processing, and it exposes the results of this processing to third parties via its communication interface. As the core component of a smart camera, a high-performance microcontroller unit (MCU) or a Field Programmable Gate Array (FPGA) chip can be used. Modern FPGAs allow the developer to implement on a single chip not only a whole microcontroller system, but also one or more hardware accelerators that can boost system performance dramatically. Numerous image processing algorithms have already been deployed on FPGAs (e.g. [2], [3], [4], [5], [6], [7], [8]), many of which can be optimized for various video surveillance applications. An FPGA-based smart camera design has another very important advantage over microcontroller-based implementations: the FPGA can be flexibly reconfigured to provide the required capabilities, so its functionality is never fixed.

The classical architecture of a surveillance system features analog or digital cameras connected to a central unit (a surveillance server) which collects and processes the incoming video streams. The scalability and flexibility of such a system are usually limited, because the amount of data the server must handle increases with the number of connected cameras.

For example, an H.264-compressed video stream at only 360p quality requires a constant bandwidth of ≈90 KB/s to transmit data to a surveillance server. Since many transmitted video frames often do not contain any useful information, they may simply be discarded at the server side. In contrast, meta-information, which describes only the outcome of the processing (e.g. the identification number of a recognized face or the coordinates of a detected object), can usually be packed into no more than 1 KB of raw data, or less than 10 KB when considering the overhead of high-level network communication protocols such as the Simple Object Access Protocol (SOAP [9]). The required bandwidth is therefore not only much smaller, but it is also consumed only once per detected set of objects. In many cases this allows the existing network infrastructure to be reused without costly, application-specific upgrades.

Another issue is that the image sensors almost always lack any intelligence and autonomy. The server therefore becomes a single point of failure, as any breakdown immediately renders the entire surveillance system nonoperational.

The introduction of distributed processing is an interesting alternative to the classical approach. It is realized by performing computations as close as possible to the place where the data are actually gathered and transmitting only the results over the network. This approach, called edge computing, has already been introduced in [10], [11]. Distributed architectures conforming to this style are claimed to have better power efficiency, scalability, and fault tolerance. An in-depth system performance analysis and simulation results for smart camera networks following a similar design pattern are provided in [12].

A compression-based approach to transmitting data in distributed digital video surveillance systems is commonly adopted to reduce the amount of transferred data. For example, Salem et al. in [13] present an FPGA-based camera in which compression is done using a multiresolution analysis technique: successively coarser approximations of the regions of interest in the processed image are computed using the wavelet transform.

In [14] Kandhalu et al. introduce a distributed, real-time video surveillance system in which the image processing tasks are performed mostly locally at each node and include object and motion detection using frame-to-frame adaptive background subtraction. The image is then compressed using the JPEG standard, tagged according to the detected anomalies, and sent to the base station using a specialized protocol. In this solution, the image transmission also relies on a compression algorithm.

An example implementation of a fully distributed surveillance network is presented by Latha and Bhagyaveni [15], where the authors demonstrate a cooperating collection of wireless sensors enhanced with certain object detection capabilities. Each network node first detects an object using ultrasonic, Hall effect, and vibration sensors. The node then performs FPGA-enhanced image processing. Only the compressed image of the extracted silhouette of a potential intruder, without background information, is sent over the network in order to reduce the amount of transmitted data.

In [16] Benet et al. describe an embedded hardware platform for video surveillance purposes which is intended to act as a single node of an audio and video sensor network. Each node is able to perform image acquisition, object tracking and labeling, classification, and compression before sending the processing results over the network.

As shown, existing embedded video surveillance solutions usually transmit not only the compressed image data, but also optional meta-data that encode the results of image content analysis, e.g. the image coordinates of detected objects. In general, however, although they provide a degree of automation, they are unable to work without human supervision. Sending image data over the network is itself a severe drawback, as it requires significant bandwidth compared to an alternative strategy in which only meta-information about the image content is transmitted. Finally, typical video surveillance solutions tend to use specialized communication protocols, which makes them hard to integrate with external information systems.

Currently, one of the most widely accepted ways of integrating multiple network entities within a large system is adopting the web service (WS) standards. In the field of FPGA-based systems this approach is still very uncommon. Important insights into how web services can be implemented in the reconfigurable architecture of FPGAs are given by Cuenca-Asensi et al. [17]. The key features of the presented system are high overall performance and a reconfiguration capability which allows the system to be rapidly updated to meet new requirements. Another FPGA-based web service implementation is described by Chang [18]: a RESTful service designed to control home appliances. An FPGA-based web server architecture aimed directly at high system performance is presented by Yu et al. [19]. There are also WS implementations based on embedded microcontrollers, such as those described by Lioupis and Stefanidakis [20] or Machado et al. [21]. The common problem with all of the above embedded web service implementations is that they offer very limited functionality, typically reduced to simple control tasks, such as the Wake-on-LAN function for computers in a local network [17].

In this paper we propose a novel smart camera solution intended for use in distributed video-based environment monitoring systems. Individual instances of our device, or networks of them, can be integrated with external systems compliant with the SOA paradigm.

The smart camera’s hardware is mainly implemented in an FPGA in order to achieve design flexibility, high computational performance at relatively low clock frequencies, and low power consumption. The architectural decisions made are a compromise between enabling parallelization of computations wherever possible on the one hand and ensuring ease of algorithm implementation and interconnectivity with external systems on the other. Specifically, the device runs under the control of an embedded soft-core microprocessor which executes a web service that can be specified using standard C++ code. This makes our design approachable to developers without an extensive hardware programming background. The server program implements the application logic, offloads execution of the core image processing algorithms to specialized hardware accelerators, and coordinates communication with the service clients. Moreover, it provides a mechanism for the device to be reconfigured on demand, either locally (using the standard JTAG interface) or remotely using its built-in, custom-made reconfiguration subsystem. No operating system is used.

As the vast majority of image data manipulation is done solely in hardware, the proposed smart camera is to a large degree autonomous. Moreover, only the control data and the processing results are exchanged over the network. The former may include, for instance, various object detection confidence thresholds or region of interest (RoI) coordinate specifications, while the latter may include the coordinates of the detected objects or their predicted class labels. Adopting this strategy significantly reduces the amount of transferred data in comparison to the bulk of existing solutions, in which a compressed video stream is sent over the network to a processing server. In addition, the web service interface allows our smart camera to interact with its clients using open, well-documented communication protocols, such as SOAP, which facilitates integration with various enterprise-class systems.
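To illustrate the kind of result payload this strategy implies, the fragment below shows a hypothetical SOAP response carrying one detection result. The element names and namespace are illustrative assumptions, not taken from the actual service definition; the point is that a few hundred bytes of meta-data can replace an entire video stream.

```xml
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
               xmlns:cam="http://example.org/smartcamera"> <!-- illustrative namespace -->
  <soap:Body>
    <cam:GetDetectionsResponse>
      <!-- coordinates and a class label for one detected object -->
      <cam:Detection id="17">
        <cam:Label>face</cam:Label>
        <cam:X>124</cam:X>
        <cam:Y>86</cam:Y>
      </cam:Detection>
    </cam:GetDetectionsResponse>
  </soap:Body>
</soap:Envelope>
```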

Hardware architecture

The proposed smart camera is an embedded system which consists of an integrated image sensor, fast memory modules, network interfaces, and several other digital components connected to a high-end Stratix-II family FPGA from Altera. The Stratix-II FPGA was chosen because it is equipped with a large quantity of general-purpose combinational logic with registers, as well as specialized functional blocks: on-chip memories, clock management circuits, and DSP blocks. The large amount of FPGA resources

Data processing

In the default configuration of our smart camera we use a single complex hardware accelerator module controlled by the Nios-II microprocessor. The Nios-II was chosen because it is under constant development and offers very good configuration flexibility, extensive community support, a large set of ready-to-use peripherals, a widely used and well-documented Avalon interconnection fabric, and,

Network communication

A block diagram of the proposed smart camera’s network communication subsystem is shown in Fig. 6. The lower layers of the OSI protocol stack (up to TCP/IP) are implemented in the Tibbo EM1206 and GA1000 modules. They handle TCP connections and UDP datagrams, and run a specialized program that maintains internal communication with the Nios-II system. The HTTP layer, the SOAP layer, and the server logic are implemented in software running on the Nios-II system.

The Tibbo communication modules

Remote reconfiguration

As already mentioned, our platform supports remote reconfiguration. It allows the developer to remotely change or upgrade the FPGA configuration and the Nios-II firmware. In the former case the system effectively starts offering a completely different service despite no physical changes to the hardware. The reconfiguration feature also serves another purpose: uploading an arbitrary file to the smart camera’s memory card. Both tasks can be done using a dedicated, easy-to-use client application with

Functionality and results

As a feasibility study for the proposed smart camera, we have designed and implemented two sample video processing applications and deployed them as web services. The first application is a simple moving object detector that detects abrupt temporal changes in the scene and records them in a log that can be queried. The second application implements a more advanced algorithm for object recognition and exposes the recognition history to web service clients. Both applications are described

Conclusions

This paper presents the architecture of an FPGA-based hardware platform for implementing image processing cameras with substantial embedded intelligence and a web service interface. The presented solution uses standardized, proven, and open communication protocols in order to simplify integration with enterprise-class SOA systems. The architecture is flexible and generic enough to act, in its substantial part, as a reference design for other classes of intelligent embedded devices, e.g.

Acknowledgment

The research presented in this paper has been partially supported by the European Union within the European Regional Development Fund program No. POIG.01.03.01-00-008/08.

References (28)

  • A. Ruta et al., Learning pairwise image similarities for multi-classification using kernel regression trees, Pattern Recognition (2012)
  • H. Aghajan et al., Multi-Camera Networks: Principles and Applications (2009)
  • M. Arias-Estrada et al., An FPGA co-processor for real-time visual tracking
  • D. Baumann, J. Tinembart, Designing mathematical morphology algorithms on FPGAs: an application to image processing, ...
  • R. Djemal et al., A real-time image processing with a compact FPGA-based architecture, Journal of Computer Science (2005)
  • G. Cai, Y. Harada, An improved real-time moving object detecting system based on SOPC, in: Proceedings of the ...
  • S. Jin et al., Design and implementation of a pipelined datapath for high-speed face detection using FPGA, IEEE Transactions on Industrial Informatics (2012)
  • C. Desmouliers, E. Oruklu, J. Saniie, FPGA-based design of a high-performance and modular video processing platform, ...
  • M. Brogioli, P. Radosavljevic, J.R. Cavallaro, A general hardware/software co-design methodology for embedded signal ...
  • SOAP Version 1.2 Part 1: Messaging Framework (Second ed.), W3C Recommendation, 2007, ...
  • A. Davis, J. Parikh, W.E. Weihl, Edge computing: extending enterprise applications to the edge of the internet, in: ...
  • L. Nachman, J. Huang, J. Shahabdeen, R. Adler, R. Kling, Imote2: serious computation at the edge, in: Proceedings of ...
  • S. Hengstler, H. Aghajan, A. Goldsmith, Application-oriented design of smart camera networks, in: Proceedings of the ...
  • M.A. Salem, K. Klaus, F. Winkler, B. Meffert, Resolution mosaic-based smart camera for video surveillance, in: ...

    Robert Brzoza-Woch studied Electronics and Telecommunication with major in Sensors and Microsystems. He received his M.Sc. degree in 2009 from the University of Science and Technology in Kraków, Poland. Currently, he is a Ph.D. student at the Department of Computer Science, at the same university. He has broad experience in hardware and software design for intelligent wireless sensor networks, remote condition monitoring systems, and pervasive computing. His work is centered around FPGA technology and microcontrollers with particular emphasis on ARM cores and FPGA-based microprocessor systems.

    Andrzej Ruta received his M.Sc. in computer science from the AGH University of Science and Technology, Krakow, Poland, in 2006. In 2009 he obtained the Ph.D. degree in machine vision at the School of Information Systems, Computing & Mathematics, Brunel University, Uxbridge, United Kingdom. Until 2012 he held a position of Assistant Professor at the Department of Computer Science, AGH-UST. He now works in Warsaw’s R&D center of Samsung Electronics Poland as an expert in machine learning and pattern recognition. He is an author of numerous publications in these areas. His current research interests include computer vision, machine learning, pattern recognition, information retrieval, predictive analytics and biosignal processing.

    Krzysztof Zieliński is Full Professor at the Department of Computer Science at the University of Science and Technology. His research focuses on networking, mobile and wireless systems, distributed computing, and service-oriented distributed systems engineering. He was a Project/Task Leader in numerous EU-funded projects, e.g. PRO-ACCESS, 6WINIT, Ambient Networks. He worked as an expert for the Ministry of Science and Education. Currently, he is leading SOA-oriented research performed by the IT-SOA Consortium in Poland. He is an active member of IEEE, ACM and Polish Academy of Science. He worked as a program committee member, chairman and organizer of several international conferences including MobiSys, ICCS, ICWS, IEEE SCC and many others.
