Original papers
Multi-level learning features for automatic classification of field crop pests

https://doi.org/10.1016/j.compag.2018.07.014

Highlights

  • A highly discriminative and robust pest object representation with multi-level learning features.

  • A multi-level classification framework with alignment-pooled features using a multi-level operating strategy.

  • A large pest dataset of 40 categories with high quality that were labeled by agricultural experts.

Abstract

The classification of pest species in field crops, such as corn, soybean, wheat, and canola, remains challenging because of the tiny appearance differences among pest species. Moreover, the appearance of a pest species in different poses, scales, or rotations makes classification even more difficult. Currently, most classification methods rely on hand-crafted features, such as the scale-invariant feature transform (SIFT) and the histogram of oriented gradients (HOG). In this work, the features of pest images are learned from a large number of unlabeled image patches using unsupervised feature learning methods, while the features of the image patches are obtained by alignment-pooling of low-level features (sparse codes), which are encoded against a predefined dictionary. To address the misalignment of patch-level features, filters at multiple scales are coupled with several pooling granularities. The filtered patch-level features are then embedded into a multi-level classification framework. Experimental results on 40 common pest species in field crops show that our classification model with multi-level learning features outperforms state-of-the-art pest classification methods. Furthermore, several dictionary learning models are evaluated within the proposed classification framework, and the impact of dictionary size and patch size is also discussed.

Introduction

Currently, the manual categorization and identification of pest species by expert entomologists faces great challenges due to the vast number of pest species in the world. This is partly because pest identification is time-consuming and requires expert knowledge of field crops: a thorough understanding of pest species demands familiarity with the terminology of insect taxonomy and with morphological characteristics. It is therefore difficult to discriminate pest categories at the species level, which leads to increased crop losses or to the misuse and overuse of pesticides. With the development of computer vision and pattern recognition techniques, automated pest classification has attracted a great deal of attention in recent years and has been widely applied in fields such as agricultural engineering (Zhao et al., 2012), entomological science (Weeks et al., 1999), and environmental science (Larios et al., 2008). It remains a challenging task, however, since pest species exhibit large appearance variations. Conventional pest classification methods (Weeks et al., 1999, Russell et al., 2005, Arbuckle et al., 2001, Wen et al., 2009) built on shallow learning (e.g., support vector machines, PCA, boosting, and logistic regression) usually worked well only on well-conditioned pest images: uniform illumination, consistent scales and positions of the pests, and similar poses or rotations. Here, we focus on the automatic classification of field crop pests whose images were collected under actual field conditions, which requires identification algorithms that are highly robust to variations in pest appearance, such as background clutter, illumination changes, and scale and pose changes. Examples of these challenges can be found in supplementary file 1.

Many pest appearance modeling methods have been proposed to address these challenges, including concatenated features for local appearance modeling (Larios et al., 2008, Wen and Guyer, 2012, Wang et al., 2012), scale-invariant feature modeling (Solis-Sánchez et al., 2011), shape features using quality-threshold ARTMAP modeling (Yaakob and Jain, 2012) and, most recently, sparse representation modeling (Xie et al., 2015). Yalcin (2015) attempted to discriminate and classify insects in pheromone traps under challenging illumination and environmental conditions, with features extracted using Hu moments (Hu), Elliptic Fourier Descriptors (EFD), Radial Distance Functions (RDF), and Local Binary Patterns (LBP). Most approaches to pest appearance modeling rely on either raw image patches or hand-designed image features (e.g., SIFT features, Lowe, 1999). Since raw pixels and image patches are sensitive to noise and background clutter in natural images, they struggle with the challenge of appearance variation. Modeling complex real-world pest appearance for automatic classification requires robust and distinctive feature descriptors that capture the relevant information about pest appearance. Hand-crafted features such as SIFT and HOG have driven substantial progress in many vision tasks, such as object recognition and image matching, and are considered milestones in computer vision, having stood the test of time. Although hand-designed features effectively capture low-level image structure, it is difficult to hand-design appropriate representations of mid- and high-level features, such as object parts, which are essential for representing images. Moreover, hand-designed features are also criticized for their computational burden and their inability to properly accommodate appearance variations among pest species. Many feature descriptors have been developed for specific data sets and tasks, and thus lack the generalization ability to cope with appearance variations across applications. In addition, hand-crafted features typically require domain knowledge of the target images for each application scenario.
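As a concrete point of reference for the hand-crafted baselines discussed above, the sketch below computes a simplified HOG-style descriptor in plain NumPy. It keeps only the per-cell orientation histograms; real HOG (Dalal and Triggs) adds block normalization and interpolation, so this is illustrative rather than the exact descriptor compared against in the paper.

```python
import numpy as np

def hog_like(img, cell=8, bins=9):
    """Simplified HOG: per-cell, magnitude-weighted histograms of
    unsigned gradient orientations. Illustrative only."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0   # unsigned orientation
    h, w = img.shape
    feats = []
    for i in range(0, h - cell + 1, cell):
        for j in range(0, w - cell + 1, cell):
            a = ang[i:i + cell, j:j + cell].ravel()
            m = mag[i:i + cell, j:j + cell].ravel()
            hist, _ = np.histogram(a, bins=bins, range=(0, 180), weights=m)
            feats.append(hist)
    return np.concatenate(feats)

rng = np.random.default_rng(0)
img = rng.random((32, 32))    # stand-in for a grayscale pest image
desc = hog_like(img)
print(desc.shape)             # 4*4 cells * 9 bins -> (144,)
```

Descriptors like this are fixed by construction, which is exactly the limitation the paper's learned features are meant to overcome.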

Recently, some researchers have focused on multi-level learning features extracted from large collections of unlabeled images via unsupervised machine learning. The results have shown that unsupervised feature learning models outperform hand-crafted feature representations in many artificial intelligence domains, such as visual recognition (Bo et al., 2013, Le, 2013) and natural language processing (Mikolov et al., 2013). Typical unsupervised feature learning or deep learning approaches fall into four categories: sparse coding (Wright et al., 2010), convolutional neural networks (Schmidhuber, 2015), restricted Boltzmann machines (Hinton, 2010), and autoencoders (Zhou et al., 2012). This work aims to design a robust feature learning model that confronts the aforementioned challenges through sparse coding. Motivated in particular by the successes of Coates et al., 2011, Coates and Ng, 2011, and Bo et al., 2013, this paper embeds the learned features in a multi-level classification framework for the automatic classification of field crop pests.
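To make the sparse-coding step concrete, the sketch below encodes flattened image patches against a dictionary using the cheap soft-threshold activation studied by Coates and Ng (2011) as a stand-in for L1 sparse coding. The dictionary here is random for illustration; in the paper it would be learned from unlabeled patches.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical dictionary: 64 atoms for flattened 6x6 patches (36-dim),
# normalized to unit length as is conventional in sparse coding.
D = rng.standard_normal((64, 36))
D /= np.linalg.norm(D, axis=1, keepdims=True)

def encode(patches, D, alpha=0.5):
    """Soft-threshold encoding f(x) = max(0, D x - alpha): a fast
    approximation to L1 sparse coding over a fixed dictionary."""
    z = patches @ D.T
    return np.maximum(0.0, z - alpha)

patches = rng.standard_normal((100, 36))   # 100 flattened patches
codes = encode(patches, D)
print(codes.shape)                         # one 64-dim code per patch
print((codes == 0).mean())                 # most activations are zero
```

The thresholding zeroes out most responses, giving the sparse, non-negative codes that the later pooling stage operates on.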

The outline of our model is illustrated in Fig. 1. First, a dictionary is trained from a large number of unlabeled image patches using unsupervised feature learning methods. Second, the low-level features (namely, sparse codes) are computed from many labeled pest image patches using the learned dictionary. Third, the low-level features are spatially alignment-pooled into patch-level features using a multi-level operating strategy. Finally, a multi-level classification framework is constructed by learning the multiple patch-level features of the labeled samples for pest categorization and recognition.
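The third step, pooling at several granularities, can be sketched as spatial-pyramid max-pooling over a grid of patch codes. The grid sizes and max-pooling operator below are illustrative assumptions, not the paper's exact alignment-pooling configuration.

```python
import numpy as np

def pyramid_pool(code_map, levels=(1, 2, 4)):
    """Max-pool a spatial grid of sparse codes at several granularities
    and concatenate the results into one multi-level feature vector.

    code_map: (H, W, K) array of K-dim codes on an H x W patch grid.
    """
    H, W, K = code_map.shape
    feats = []
    for g in levels:
        hs, ws = H // g, W // g
        for i in range(g):
            for j in range(g):
                cell = code_map[i*hs:(i+1)*hs, j*ws:(j+1)*ws]
                feats.append(cell.reshape(-1, K).max(axis=0))
    return np.concatenate(feats)

rng = np.random.default_rng(2)
code_map = rng.random((8, 8, 64))   # codes for an 8x8 grid of patches
feat = pyramid_pool(code_map)
print(feat.shape)                   # (1 + 4 + 16) * 64 -> (1344,)
```

Pooling coarse-to-fine in this way retains the spatial structure of the pest image, in contrast to a single global histogram of codes.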

This work is closely related to that of Xie et al. (2015). Although both use unsupervised feature learning to obtain a dictionary, there are major differences. First, instead of using raw features (e.g., colors, shapes, and textures) as image descriptors, our features are learned from a large set of small image patches randomly extracted from natural images, and this work applies multiple levels of pest image representation to identify pest species. Second, Xie et al. (2015) used sparse-coding histograms of pest species as their features, ignoring the spatial structural information in pest images.

The main contributions of this paper are as follows:

  • a highly discriminative and robust pest object representation with multi-level learning features,

  • a multi-level classification framework with alignment-pooled features using a multi-level operating strategy, and

  • a large pest dataset of 40 categories with high quality that were labeled by agricultural experts.

Section snippets

Dataset collection

We collected approximately 4500 pest images covering most of the species found in several common field crops, including corn, soybean, wheat, and canola. Most of these pest images were captured under real conditions in several experimental fields of the Anhui Academy of Agricultural Sciences in China. Our pest dataset (D0) contains 40 different pest species. The details of the dataset are listed in Table 1, and some typical images are shown in Fig. 5. All images were captured by the use of

Results

Experiments were conducted to evaluate the performance of the proposed method and to analyze the impact of its parameters. Four data sets, including ours (D0), were used for comparison. The first (D1) is the Butterfly database (Xiao et al., 2012), the second (D2) is a pest data set from Xie et al. (2015), and the last (D3) is a large data set from Wang et al. (2012). D1 included 1440 butterfly samples. D2 contained 24 insect species and approximately

Discussions

In this section, the effects of parameters such as the dictionary learning method, dictionary size, patch size, and the number of patches on classification performance are analyzed for insect images.
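As a sketch of how such a parameter study might be scripted, the toy k-means dictionary learner below (a simple stand-in for the dictionary-learning methods compared in the paper) is run at two dictionary sizes on synthetic patch data; only the resulting dictionary shapes are inspected.

```python
import numpy as np

def kmeans_dictionary(patches, k, iters=10, seed=0):
    """Tiny k-means dictionary learner: each centroid becomes one atom.
    Illustrative stand-in for the paper's dictionary-learning methods."""
    rng = np.random.default_rng(seed)
    D = patches[rng.choice(len(patches), k, replace=False)].copy()
    for _ in range(iters):
        # assign each patch to its nearest atom, then re-estimate atoms
        d2 = ((patches[:, None, :] - D[None]) ** 2).sum(-1)
        lab = d2.argmin(1)
        for c in range(k):
            m = lab == c
            if m.any():
                D[c] = patches[m].mean(0)
    return D

rng = np.random.default_rng(3)
patches = rng.standard_normal((500, 36))   # synthetic flattened patches
dims = {k: kmeans_dictionary(patches, k).shape for k in (32, 64)}
print(dims)   # larger dictionaries -> higher-dimensional codes downstream
```

Dictionary size directly sets the dimensionality of the sparse codes, which is why it trades off descriptiveness against computation in the experiments.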

Conclusions

This paper proposed an unsupervised method with the use of multi-level learning features for the automatic classification of field crop pests. A new multiple-level fusion classification framework was designed that was naturally aligned with the multi-level deep feature learning model. Experimental results showed that the method with multi-level learning features can work better than that with widely used hand-crafted features, such as the SIFT and HOG, on the automatic classification of field

Acknowledgment

This work was supported by the National Natural Science Foundation of China (Nos. 31401293, 61672035, 61773360, and 61300058).

References (28)

  • Arbuckle, T., Schroder, S., Steinhage, V., Wittmann, D., 2001. Biodiversity informatics in action identification and...
  • Bo, L., Ren, X., Fox, D., 2013. Multipath sparse coding using hierarchical matching pursuit. In: Proceedings of the...
  • Coates, A., Ng, A., 2011. The importance of encoding versus training with sparse coding and vector quantization. In:...
  • Coates, A., Ng, A.Y., Lee, H., 2011. An analysis of single-layer networks in unsupervised feature learning. In:...
