Semantic region of interest and species classification in the deep neural network feature domain
Introduction
Wildlife monitoring with camera traps allows us to collect data at large scales in space and time to study the impact of climate change, land use, and human activity on wildlife population dynamics and biodiversity (Kays et al., 2011). Camera traps are stationary camera-sensor systems attached to trees in the field. Triggered by animal motion via on-board infrared motion sensors, they capture short image sequences of animal appearance and activity, along with other sensor data such as light level, moisture, temperature, and GPS location. Camera traps are now used extensively in wildlife monitoring due to their relatively low cost, rapid deployment, and easy maintenance (He et al., 2016). From camera-trap images, we can extract useful information such as animal species, motion, appearance, and biometric features (Kays et al., 2014). They are widely used in ecological research to track animal movements (Silveira et al., 2003), study habitat use (Bowkett et al., 2008), assess species behaviors and population dynamics (Karanth et al., 2006), and identify new species (Rovero and Menegon, 2008).
In this work, we focus on automatic animal species recognition in highly cluttered camera-trap images using deep learning methods. Fig. 1 shows sample camera-trap images in which the animal often occupies only a small region of a highly cluttered wooded scene. We recognize that detecting and classifying animal species directly on the whole image using image classification or object detection approaches is not efficient. Instead, it is highly desirable to first locate the animal region in the large image and then perform species classification on this smaller image region. In this way, we expect animal classification to be more accurate and robust, since interference from the cluttered background is suppressed.
Deep neural networks (DNNs) have emerged as a powerful method for image representation in various computer vision and machine learning tasks, such as object detection and classification (Oquab et al., 2014; Razavian et al., 2014). They provide a rich hierarchical set of learned visual features, from low-level pixel statistics to high-level semantic features. In this paper, we explore how this DNN visual representation can be used for semantic animal region detection and species classification in challenging natural scenes. Specifically, we first design and train a DNN for animal-background classification, which is used to analyze the input image and generate multi-layer feature maps representing the responses of different image regions to the animal-background classifier. In this DNN feature domain, we perform clustering and graph cut to construct the semantic regions of animals. We then perform animal species classification on these semantic regions. Our experimental results demonstrate that the proposed method significantly outperforms existing classification and detection methods.
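To make the region-detection idea concrete, the following is a minimal sketch, not the authors' implementation: feature vectors taken from an animal-vs-background network's response map are clustered with k-means, and the cluster seeded at the strongest response is taken as the candidate animal region. The feature map here is synthetic (a planted high-response block in noise), and the paper additionally refines the clustering with a graph cut, which this sketch omits.

```python
import numpy as np

def kmeans(X, k=2, iters=20):
    """Minimal k-means with deterministic initialization:
    seed the two centers at the lowest- and highest-response feature vectors."""
    score = X.mean(axis=1)
    centers = np.stack([X[score.argmin()], X[score.argmax()]])
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Synthetic H x W x C "feature map": responses of a hypothetical
# animal-vs-background network, with a strong response planted in one region.
H, W, C = 16, 16, 8
rng = np.random.default_rng(1)
fmap = rng.normal(0.0, 0.1, (H, W, C))
fmap[4:10, 6:12] += 1.0  # the "animal" region

X = fmap.reshape(-1, C)
labels = kmeans(X).reshape(H, W)
mask = labels == 1          # cluster 1 was seeded at the maximum response
ys, xs = np.nonzero(mask)
bbox = (ys.min(), xs.min(), ys.max(), xs.max())
print(bbox)                 # bounding box of the detected semantic region
```

With the strong planted response, the recovered bounding box matches the planted region; on real feature maps the cluster boundary is noisier, which is where the graph-cut refinement comes in.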
The main contributions of this work include: (1) we have developed a new approach for representing camera-trap images with semantic regions and for detecting semantic regions of interest for animals in highly cluttered natural scenes in the learned DNN feature domain; (2) we have proposed a method to identify semantic regions of interest for more accurate image classification; and (3) we have achieved accurate classification of animal species on a challenging dataset, outperforming existing methods.
The rest of the paper is organized as follows. Section 2 reviews related work. Section 3 presents our animal species classification method using semantic regions of interest. Experimental results are presented in Section 4. Further discussions are provided in Section 5. Section 6 concludes the paper.
Section snippets
Related work
A number of computer vision and machine learning methods have been developed for animal object detection and classification. A linear support vector machine (SVM) was used by Yu et al. (2013) to classify 18 animal species on a dataset of over 7000 images. They used sparse-coding spatial pyramid matching (ScSPM) to generate global features, with dense SIFT (scale-invariant feature transform) descriptors (Lowe, 2004) and cell-structured local binary patterns (cLBP) as local features.
Animal species recognition using semantic region of interest
In this section, we present the proposed method for animal species recognition using semantic region detection in the DNN feature domain.
Experimental results
In this section, we evaluate the performance of our algorithm in classifying animal regions into species. First, we used the AlexNet architecture to train the two-class animal-versus-background DCNN model, achieving an accuracy of 99.75% on this task. Second, we applied our semantic segmentation algorithm to find the animal region and then used the animal-versus-background DCNN model to suppress the background regions. We trained another DCNN model with 26 classes to classify the detected animal regions into their species.
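The two-stage pipeline described above can be sketched as follows. This is an illustrative outline rather than the authors' code: `detect_region`, `p_animal`, and `species_net` are hypothetical stand-ins for the semantic region detector and the two trained DCNNs, and the 0.5 suppression threshold is an assumption.

```python
import numpy as np

N_SPECIES = 26

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def classify_image(image, detect_region, p_animal, species_net, threshold=0.5):
    """Two-stage pipeline: locate the candidate animal region, suppress it if
    the animal-vs-background model scores it as background, otherwise run the
    26-class species classifier on the cropped region."""
    y0, x0, y1, x1 = detect_region(image)
    crop = image[y0:y1 + 1, x0:x1 + 1]
    if p_animal(crop) < threshold:        # background-only region: reject
        return None
    probs = softmax(species_net(crop))
    return int(probs.argmax()), float(probs.max())

# Toy stand-ins for the trained models:
img = np.zeros((8, 8))
img[2:6, 2:6] = 1.0
detect = lambda im: (2, 2, 5, 5)                 # fixed "detector"
p_animal = lambda crop: crop.mean()              # animal-confidence stub
net = lambda crop: 5.0 * np.eye(N_SPECIES)[3]    # logits peaked at class 3
result = classify_image(img, detect, p_animal, net)
print(result)
```

Suppressing low-confidence regions before species classification is what keeps cluttered background from reaching the 26-class model.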
Further discussions
During our experiments, we found that many animals share visual features, such as color, markings, horns, and fur, which makes it very difficult to distinguish similar species. Fig. 12 shows some examples of classification errors, with rows (a), (c), and (e) showing the original images of animal species that were misclassified into the species shown in rows (b), (d), and (f). We can see that these classification errors are caused by many factors, such as poor illumination, unexpected animal poses, and heavy occlusions.
Conclusions
In this paper, we have developed an animal species classification method for camera-trap images with highly cluttered scenes. We use a DNN trained for animal-background classification to analyze the input image and construct a semantic region representation using k-means clustering and graph cut in the DNN feature domain. With the semantic animal regions detected, we train a DNN model to perform animal species classification on these regions. Our experimental results on a challenging camera-trap dataset demonstrate that the proposed method outperforms existing classification and detection methods.
Acknowledgements
This work was supported in part by NSF grant CyberSEES-1539389.
References (35)
- et al., Animal classification system: a block based approach, Procedia Comput. Sci. (2015)
- et al., Classification of wild animals based on SVM and local descriptors, AASRI Procedia (2014)
- et al., Camera trap, line transect census and track surveys: a comparative evaluation, Biol. Conserv. (2003)
- et al., The use of camera-trap data to model habitat use by antelope species in the Udzungwa mountain forests, Tanzania, Afr. J. Ecol. (2008)
- et al., An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision, IEEE Trans. Pattern Anal. Mach. Intell. (2004)
- et al., Fast approximate energy minimization via graph cuts, IEEE Trans. Pattern Anal. Mach. Intell. (2001)
- et al., Deep convolutional neural network based species recognition for wild animal monitoring
- et al., Data augmentation for deep neural network acoustic modeling
- et al., Scalable object detection using deep neural networks
- et al., An Introduction to Classification and Clustering, in Cluster Analysis (2011)
- Fine-grained bird species recognition via hierarchical subset learning
- Fast R-CNN
- Animal identification in low quality camera-trap images using very deep convolutional neural networks and confidence thresholds
- Towards automatic wild animal monitoring: identification of animal species in camera-trap images using very deep convolutional neural networks, Ecol. Inform.
- Deep residual learning for image recognition
- Visual informatics tools for supporting large-scale collaborative wildlife monitoring with citizen scientists, IEEE Circ. Syst. Mag.
- Assessing tiger population dynamics using photographic capture–recapture sampling, Ecology
Cited by (5)
- Performance Enhancement of Animal Species Classification Using Deep Learning, Communications in Computer and Information Science (2022)
- Next-Generation Camera Trapping: Systematic Review of Historic Trends Suggests Keys to Expanded Research Applications in Ecology and Conservation, Frontiers in Ecology and Evolution (2021)
- Animal Breed Classification and Prediction Using Convolutional Neural Network: Primates as a Case Study, 2021 4th International Conference on Electrical, Computer and Communication Technologies, ICECCT (2021)