Multi-level learning features for automatic classification of field crop pests☆
Introduction
Manual categorization and identification of pest species by expert entomologists currently face great challenges due to the vast number of pest species in the world. This is partly because pest identification is time-consuming and requires expert knowledge of field crops: a thorough understanding of pest species demands familiarity with the terminology of insect taxonomy and with morphological characteristics. It is therefore difficult to discriminate pest categories at the species level, which leads to increased crop losses or the misuse/overuse of pesticides. With the development of computer vision and pattern recognition techniques, automated pest classification has attracted a great deal of attention in recent years and has been widely applied in fields such as agricultural engineering (Zhao et al., 2012), entomological science (Weeks et al., 1999), and environmental science (Larios et al., 2008). It remains a challenging task, however, since pest species exhibit large variations in appearance. Conventional pest classification methods (Weeks et al., 1999, Russell et al., 2005, Arbuckle et al., 2001, Wen et al., 2009) built on shallow learning (e.g., Support Vector Machines, PCA, Boosting, and Logistic Regression) usually worked well only on well-conditioned pest images, i.e., those with uniform illumination, consistent scales and positions of the pests, and similar poses or rotations. Here, we focus on the automatic classification of field crop pests whose images were collected under actual field conditions, which requires the identification algorithms to be highly robust to variations in pest appearance, such as background clutter, illumination changes, and scale and pose changes. Examples of these challenges can be found in supplementary file 1.
Many pest appearance modeling methods have been proposed to address these challenges, including concatenated local appearance features (Larios et al., 2008, Wen and Guyer, 2012, Wang et al., 2012), scale-invariant feature modeling (Solis-Sánchez et al., 2011), shape features with quality-threshold ARTMAP modeling (Yaakob and Jain, 2012) and, most recently, sparse representation modeling (Xie et al., 2015). Yalcin (2015) attempted to discriminate and classify insects in pheromone traps under challenging illumination and environmental conditions, using features extracted with Hu moments (Hu), Elliptic Fourier Descriptors (EFD), Radial Distance Functions (RDF), and Local Binary Patterns (LBP). Most approaches to pest appearance modeling rely on either raw image patches or hand-designed image features (e.g., SIFT features, Lowe, 1999). Since raw pixels and image patches are sensitive to noise and background clutter in natural images, they can hardly cope with the challenge of appearance variations. To model complex real-world pest appearance for automatic classification, robust and distinctive feature descriptors are required to capture the relevant information in pest appearances. Hand-crafted features, such as SIFT and HOG, have driven remarkable progress in many vision tasks, such as object recognition and image matching; they are considered a milestone in computer vision because they have stood the test of time. Although hand-designed features are effective at capturing low-level image structure, it is difficult to hand-design appropriate representations of mid- and high-level features, such as object parts, which are essential for representing images. Moreover, hand-designed features are also criticized for weaknesses such as large computational burdens and an inability to properly accommodate appearance variations among pest species.
So far, many feature descriptors have been developed for specific data sets and tasks, and they lack the generalization ability to cope with appearance variations across applications. In addition, hand-crafted features typically require domain knowledge of the relevant images for each application scenario.
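For a concrete point of comparison, a hand-crafted descriptor such as HOG can be computed in a few lines. The sketch below uses scikit-image's `hog` on a synthetic image; the image and all parameter values are illustrative assumptions, not the settings used in this paper:

```python
import numpy as np
from skimage.feature import hog

# A synthetic 64x64 grayscale image stands in for a real pest photograph.
rng = np.random.default_rng(0)
image = rng.random((64, 64))

# HOG summarizes local gradient orientations over a grid of cells; the
# result is a fixed-length, low-level descriptor of the whole image.
descriptor = hog(
    image,
    orientations=9,
    pixels_per_cell=(8, 8),
    cells_per_block=(2, 2),
    block_norm="L2-Hys",
)

# 64/8 = 8 cells per side -> 7x7 overlapping blocks of 2x2 cells,
# 9 orientation bins each: 7 * 7 * 2 * 2 * 9 = 1764 dimensions.
print(descriptor.shape)  # (1764,)
```

Note that every parameter here (cell size, block size, bin count) must be chosen by hand, which is exactly the kind of domain-specific tuning the learned features in this paper aim to avoid.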
Recently, some researchers have focused on multi-level learning features extracted from large amounts of unlabeled images through unsupervised machine learning. The results have shown that unsupervised feature learning models outperform hand-crafted feature representations in many artificial intelligence domains, such as visual recognition (Bo et al., 2013, Le, 2013), natural language processing (Mikolov et al., 2013), and many more. Typical unsupervised feature learning or deep learning approaches can be divided into four categories: sparse coding (Wright et al., 2010), convolutional neural networks (Schmidhuber, 2015), restricted Boltzmann machines (Hinton, 2010), and autoencoders (Zhou et al., 2012). This work aims to design a robust feature learning model that confronts the aforementioned challenges through sparse coding. Motivated in particular by the successes of works such as Coates et al. (2011), Coates and Ng (2011), and Bo et al. (2013), this paper adopts deeply learned features within a multi-level classification framework for the automatic classification of field crop pests.
The outline of our model is illustrated in Fig. 1. First, a dictionary is trained on a large number of unlabeled image patches using unsupervised feature learning. Second, low-level features (namely, sparse codes) are computed from labeled pest image patches with the learned dictionary. Third, the low-level features are spatially alignment-pooled to form patch-level features using a multi-level operating strategy. Finally, a multi-level classification framework is constructed by learning from the multiple patch-level features of the labeled samples for pest categorization and recognition.
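The first three steps above can be sketched with scikit-learn's `MiniBatchDictionaryLearning` standing in for the paper's dictionary learner; the patch and dictionary sizes, random data, and the simple max-pooling are illustrative assumptions, not the paper's exact settings:

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)

# Step 1: "unlabeled" 8x8 patches, flattened to 64-dim vectors
# (in the paper these are sampled from natural images).
patches = rng.random((500, 64))
patches -= patches.mean(axis=1, keepdims=True)  # remove per-patch mean

# Learn an overcomplete dictionary of 128 atoms with a sparsity penalty.
dico = MiniBatchDictionaryLearning(
    n_components=128, alpha=1.0, batch_size=50, random_state=0
)
dico.fit(patches)

# Step 2: sparse-code the patches of a labeled image against the
# learned dictionary (each row of `codes` has few nonzero entries).
codes = dico.transform(patches[:100])  # shape (100, 128)

# Step 3: pool the codes over the image to obtain one patch-level
# feature vector; a plain max-pool stands in for alignment pooling.
feature = np.abs(codes).max(axis=0)
print(feature.shape)  # (128,)
```

The pooled vector then serves as the input to the classification stage (step 4), one such vector per image region and pyramid level.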
This work is closely related to that of Xie et al. (2015). Although both use unsupervised feature learning to obtain a learned dictionary, there are major differences. On the one hand, instead of using raw features (e.g., colors, shapes, and textures) as image descriptors, our deep features are learned from a large set of small image patches randomly extracted from natural images; moreover, this work applies multiple levels of pest image representation to identify pest species. On the other hand, Xie et al. (2015) used sparse-coding histograms of pest species as their features, ignoring the spatial structural information in pest images.
The main contributions of this paper are as follows:
- a highly discriminative and robust pest object representation with multi-level learning features,
- a multi-level classification framework with alignment-pooled features using a multi-level operating strategy, and
- a large, high-quality pest dataset of 40 categories labeled by agricultural experts.
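To make the second contribution concrete, a spatial-pyramid variant of alignment pooling followed by a linear classifier could look like the sketch below; the pyramid levels, toy data, and `LinearSVC` are illustrative assumptions, and the paper's actual pooling and fusion scheme differs in detail:

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

def pyramid_pool(codes, grid, levels=(1, 2)):
    """Max-pool sparse codes over a spatial pyramid and concatenate levels.

    codes: (n_patches, n_atoms) sparse codes for one image
    grid:  (n_patches, 2) normalized (x, y) patch centers in [0, 1)
    """
    pooled = []
    for level in levels:
        for i in range(level):
            for j in range(level):
                # Select the patches falling in spatial cell (i, j).
                in_cell = (
                    (grid[:, 0] * level >= i) & (grid[:, 0] * level < i + 1)
                    & (grid[:, 1] * level >= j) & (grid[:, 1] * level < j + 1)
                )
                cell = codes[in_cell]
                pooled.append(cell.max(axis=0) if len(cell)
                              else np.zeros(codes.shape[1]))
    return np.concatenate(pooled)

# Toy data: 60 "images", each with 100 patches of 32-dim codes, 3 classes.
X = np.stack([
    pyramid_pool(rng.random((100, 32)), rng.random((100, 2)))
    for _ in range(60)
])
y = rng.integers(0, 3, size=60)

# A 1x1 + 2x2 pyramid yields (1 + 4) * 32 = 160 dimensions per image.
clf = LinearSVC(dual=False).fit(X, y)
print(X.shape)  # (60, 160)
```

Pooling per cell preserves coarse spatial layout, which is precisely the structural information that plain sparse-coding histograms discard.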
Dataset collection
We collected approximately 4500 pest images covering most of the species found in several common field crops, including corn, soybean, wheat, and canola. Most of these pest images were captured under real conditions in several experimental fields of the Anhui Academy of Agricultural Sciences in China. Our pest dataset (D0) contains 40 different pest species. The details of the dataset are listed in Table 1, and some typical images are shown in Fig. 5. All images were captured by the use of
Results
Experiments were conducted to evaluate the performance of the proposed method and to analyze the impact of its parameters. Four data sets, including ours (D0), were used for comparison. The first (D1) is the Butterfly database (Xiao et al., 2012), the second (D2) is a pest data set from Xie et al. (2015), and the last (D3) is a large data set from Wang et al. (2012). D1 included 1440 butterfly samples. D2 contained 24 insect species and approximately
Discussions
In this section, the effects of parameters such as the dictionary learning method, dictionary size, patch size, and number of patches on classification performance are analyzed for insect images.
Conclusions
This paper proposed an unsupervised method using multi-level learning features for the automatic classification of field crop pests. A new multi-level fusion classification framework was designed that is naturally aligned with the multi-level deep feature learning model. Experimental results showed that the method with multi-level learning features outperforms methods with widely used hand-crafted features, such as SIFT and HOG, on the automatic classification of field crop pests.
Acknowledgment
This work was supported by the National Natural Science Foundation of China (Nos. 31401293, 61672035, 61773360, and 61300058).
References (28)
- Deep learning in neural networks: an overview. Neural Netw. (2015)
- Scale invariant feature approach for insect monitoring. Comput. Electron. Agric. (2011)
- A new automatic identification system of insect images at the order level. Knowl.-Based Syst. (2012)
- Species identification of wasps using principal component associative memories. Image Vis. Comput. (1999)
- Local feature-based identification and classification for orchard insects. Biosyst. Eng. (2009)
- Image-based orchard insect automated identification and classification method. Comput. Electron. Agric. (2012)
- Combined blur, translation, scale and rotation invariant image recognition by Radon and pseudo-Fourier–Mellin transforms. Pattern Recogn. (2012)
- Automatic classification for field crop insects via multiple-task sparse representation and multiple-kernel learning. Comput. Electron. Agric. (2015)
- A novel algorithm for damage recognition on pest-infested oilseed rape leaves. Comput. Electron. Agric. (2012)
- K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation. IEEE Trans. Signal Process. (2006)
☆ Availability: http://www2.ahu.edu.cn/pchen/web/DLFautoinsects.htm.