In recent years, video and image analysis tools have been increasingly employed in many real-time applications; these include lane and car recognition for intelligent transportation systems, human object segmentation and tracking for intelligent video surveillance systems, and face detection and image indexing for digital still cameras and camcorders. To implement these analysis tools in real-time applications, new computing architectures such as reconfigurable architectures, application-specific instruction-set processors, stream processing architectures, and dedicated processing elements have been developed to handle more complex real-time content analysis. It is often necessary to integrate specially designed hardware accelerators with other processors to achieve a high processing speed. Furthermore, new algorithms that are suitable for hardware design or implementation on existing architectures play important roles in such systems. The purpose of this special issue is to report on new hardware design ideas to support these video and image analysis tools.

This special issue contains two parts. The first part describes computing platform design, including vision processors and memory sub-system design. The second part describes several design case studies of image and video analysis systems, including machine learning engines and video segmentation engines.

The first part begins with two general-purpose vision processors with a single-instruction-multiple-data (SIMD) architecture that have been developed in industry. Both of these new-generation designs feature enhanced capabilities for higher-level video analysis tasks, realized with different schemes. In “IMAPCAR: A 100 GOPS In-vehicle Vision Processor Based on 128 Ring Connected 4-Way VLIW Processing Elements,” Kyo and Okazaki design an in-vehicle vision processor built around an array of 128 8-bit four-way very-long-instruction-word (VLIW) RISC processing elements (PEs). Compared with their previous design, IMAP-CE, the new design achieves 2.5 times the performance through improved video I/O flexibility and data remapping, the addition of one multiply-accumulate (MAC) unit per PE, and a more reliable memory structure.

In “Xetal-II: A Low-Power Massively-Parallel Processor for Video Scene Analysis,” Abbo, Kleihorst, and Schueler design a 140 GOPS image processor with a massively parallel SIMD (MP-SIMD) architecture comprising 320 PEs arranged as a linear processor array. To support region-based processing, it provides a low-cost look-up table (LUT) together with flag aggregation and flag-based result selection.

In addition to computation engines, the design of memory sub-systems also plays an important role in a video and image analysis system. In “Streaming Data Movement for Real-Time Image Analysis,” López-Lagunas and Chai propose the notion of stream descriptors as a means of defining image stream access patterns and improving memory access efficiency by exploiting the locality between different data streams. Examples are provided with a Reconfigurable Streaming Vector Processor (RSVP™), and the design concept can be widely applied to other computing platforms such as ASICs and reconfigurable hardware.
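To illustrate the idea of describing an access pattern rather than issuing individual loads, the following sketch enumerates the addresses implied by a simple two-level stream descriptor. The field set used here (base, stride, span, skip) is a common parameterization of strided streams and is an assumption for illustration, not the exact descriptor format of the paper:

```python
def stream_addresses(base, stride, span, skip, records):
    """Yield element addresses for `records` records of `span` elements each.

    Consecutive elements within a record are `stride` apart; after each
    record the address pointer additionally advances by `skip`.
    (Hypothetical field semantics, chosen for illustration.)
    """
    addr = base
    for _ in range(records):
        for _ in range(span):
            yield addr
            addr += stride
        addr += skip  # jump over the gap to the next record


# Example: reading a 4x2 sub-block from a row-major image 8 pixels wide.
addrs = list(stream_addresses(base=0, stride=1, span=4, skip=4, records=2))
```

A memory controller given such a descriptor can prefetch the whole pattern, which is where the efficiency gain over element-by-element addressing comes from.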

The second part describes four case studies that target different computing platforms: FPGA, multi-core processor, reconfigurable processor, and hybrid computing platform.

Machine learning algorithms are widely employed in many image and video analysis applications. In “Accelerating Machine-Learning Algorithms on FPGAs Using Pattern-Based Decomposition,” Nagarajan et al. propose a pattern-based decomposition approach in which several computation and communication patterns are defined; designers can then compose hardware from these patterns, increasing productivity through design reuse. As examples, machine learning algorithms such as multi-dimensional probability density function estimation using Gaussian kernels, K-means clustering, and correlation are implemented on an FPGA.
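For readers unfamiliar with one of the example kernels, a minimal software sketch of K-means clustering is shown below. This is a plain NumPy reference version of the standard algorithm, not the authors' FPGA decomposition:

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Naive K-means: returns (centroids, labels) after `iters` iterations."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize centroids with k distinct input points.
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid becomes the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return centroids, labels
```

The two alternating steps (distance computation and mean update) are exactly the kind of regular computation and communication patterns that such a decomposition approach targets.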

Video object segmentation is a key operation for content-based video analysis and object recognition. One popular approach is to compare the input frame with a background model. In “Real-time Adaptive Background Modeling for Multi-Core Embedded Systems,” Apewokin et al. propose an adaptive background model, called the multimodal mean, that executes faster and requires less storage than a mixture of Gaussians. The partitioning of this algorithm on an embedded multi-core computing platform is also discussed.
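The general approach of comparing each frame against an adaptive background model can be sketched with a simple single-mode exponential running average. This is a deliberately simplified stand-in for illustration, not the multimodal mean of the paper (which maintains multiple background modes per pixel):

```python
import numpy as np

def update_background(bg, frame, alpha=0.05):
    """Blend the new frame into the background with learning rate alpha."""
    return (1.0 - alpha) * bg + alpha * frame.astype(float)

def foreground_mask(bg, frame, thresh=30.0):
    """Mark pixels that differ from the background by more than thresh."""
    return np.abs(frame.astype(float) - bg) > thresh


# Usage: process a stream of grayscale frames.
bg = np.zeros((2, 2))                      # initial background estimate
frame = np.full((2, 2), 100.0)             # a bright incoming frame
mask = foreground_mask(bg, frame)          # all pixels flagged as foreground
bg = update_background(bg, frame)          # background slowly adapts
```

Storage-versus-accuracy trade-offs in the per-pixel model (one mean here, a mixture of Gaussians or several modes in published methods) are exactly what drive the embedded design choices discussed in the paper.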

In order to accelerate different video object segmentation algorithms, in “Reconfigurable Morphological Image Processing Accelerator for Video Object Segmentation,” Chien and Chen design a reconfigurable morphological PE array that contains a reconfigurable datapath for morphology operations in each PE and a programmable interconnection unit between PEs. The configuration of this reconfigurable accelerator can be defined by firmware for different applications. A prototype chip is also designed to demonstrate the cost efficiency of this design.
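The morphology operations such an accelerator implements reduce to simple neighborhood tests, which is why they map well to PE arrays. A minimal software sketch of binary erosion and dilation (a reference version of the standard operations, not the paper's hardware datapath):

```python
import numpy as np

def _windows(img, se):
    """Pad the image and yield each (y, x, window) for the structuring element."""
    h, w = img.shape
    sh, sw = se.shape
    padded = np.pad(img, ((sh // 2, sh // 2), (sw // 2, sw // 2)), constant_values=0)
    for y in range(h):
        for x in range(w):
            yield y, x, padded[y:y + sh, x:x + sw]

def erode(img, se):
    """Binary erosion: 1 where the structuring element fits entirely inside."""
    out = np.zeros_like(img)
    for y, x, win in _windows(img, se):
        out[y, x] = int(np.all(win[se == 1] == 1))
    return out

def dilate(img, se):
    """Binary dilation: 1 where the structuring element hits any foreground."""
    out = np.zeros_like(img)
    for y, x, win in _windows(img, se):
        out[y, x] = int(np.any(win[se == 1] == 1))
    return out
```

Composites such as opening (erosion followed by dilation) remove small noise blobs from segmentation masks, which is a typical post-processing step in video object segmentation pipelines.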

This special issue concludes with “System Level Design and Implementation for Region-of-Interest Segmentation,” where Tsai et al. design a complete video object segmentation system on a board comprising an FPGA and a DSP. Several system design tasks are discussed, including computationally efficient algorithm design, task partitioning, and system optimization techniques.

We would like to thank all the contributors to this special issue for their excellent work and the reviewers for their constructive comments. We would also like to thank Professor Sun-Yuan Kung, Editor-in-Chief, for providing us with the opportunity to collect these works in a special issue. Image and video analysis is expected to play an increasingly important role in multimedia applications. With the representative papers in this special issue, including several processors designed in industry and new ideas and concepts proposed in academia, we hope to stimulate further research in this area.