Pattern Recognition

Volume 47, Issue 7, July 2014, Pages 2325-2337

Visual learning and classification of human epithelial type 2 cell images through spontaneous activity patterns

https://doi.org/10.1016/j.patcog.2013.10.013

Highlights

  • We propose a novel HEp-2 cell classification system inspired by biological vision.

  • We apply patterns similar to the spontaneous neural activity of newborn animals.

  • A generative model of spontaneous activity patterns is proposed.

  • Mixing spontaneous activity patterns with cell images leads to a more robust model.

Abstract

Identifying the presence of anti-nuclear antibody (ANA) in human epithelial type 2 (HEp-2) cells via the indirect immunofluorescence (IIF) protocol is commonly used to diagnose various connective tissue diseases in clinical pathology tests. As it is a labour- and time-intensive diagnostic process, several computer aided diagnostic (CAD) systems have been proposed. However, existing CAD systems suffer from shortcomings caused by their selection of features, which is commonly based on expert experience. Such features may not work well when a CAD system is retasked to another dataset. To address this, in our previous work, we proposed a novel approach that learns a set of filters from HEp-2 cell images. It is inspired by the receptive fields in the mammalian visual system, since receptive fields can be thought of as a set of filters for similar shapes. We obtain robust filters for HEp-2 cell classification by employing the independent component analysis (ICA) framework. However, this approach may be held back by one particular problem: ICA learning requires a sufficiently large volume of training data, which is not always available. In this paper, we demonstrate a biologically inspired solution to this issue via the use of spontaneous activity patterns (SAP). Spontaneous activity patterns, which are related to the spontaneous neural activity initiated by chemical release in the brain, have been found to be typical stimuli for the visual cell development of newborn animals. In the classification system for HEp-2 cells, we propose to model SAP as a set of small image patches containing randomly positioned Gaussian spots. The SAP image patches are generated and mixed with the training images in order to learn filters via the ICA framework. The obtained filters are used to extract a set of responses from a HEp-2 cell image. We then take regions from this set of responses, stack them into “cubic regions”, and perform classification based on the correlation information of the features. We show that adding SAP leads to better classification performance on HEp-2 cell images than using only the existing patterns for training ICA filters. The improvement in classification is particularly significant when there are not enough specimen images available in the training set, as SAP adds more variation to the existing data, which makes the learned ICA model more robust. We show that the proposed approach consistently outperforms three recently proposed CAD systems on two publicly available datasets: ICPR HEp-2 contest and SNPHEp-2.

Introduction

Computer aided diagnostic (CAD) systems have been rapidly developed in recent years for various clinical pathology tests on anti-nuclear antibody (ANA) via indirect immunofluorescence (IIF) [1], [2], [3], [4], [5], [6], [7], [8]. These systems help to improve the efficiency of previously labour intensive diagnostic processes, as well as providing more objective judgements on determining HEp-2 cell patterns.

Most of the existing CAD systems follow the trend of using heuristic features based on experts' experience, namely “hand-picked” features [8]. Human experts are involved in selecting the features that best represent the images. Local binary pattern (LBP) is one of the more popular choices to describe the cell texture [1], [7]. Methods proposed in [2], [3], [5], [8] combine various features such as standard deviation, entropy, spectral measurements (e.g., Fourier and wavelet transforms), descriptors for shape and structure, and morphological features. However, it has been shown that these features are not robust enough to operate under different laboratory environments [8]. In realistic cases, they may be excellent in one laboratory using specific assays and microscopes, but may not be the best choice in other laboratories. A more promising approach is to use a probabilistic codebook which computes the local statistics of an image [8]. Despite notable improvements, the codebook approach still needs to use specific pre-selected local features (e.g., Discrete Cosine Transform (DCT) or SIFT descriptors). Therefore, it only works when one knows the optimal set of features to use.

To address the problem caused by the use of heuristic features, we investigate one recent popular approach that applies unsupervised feature learning on natural images [9]. Such an approach is inspired by Olshausen's discoveries regarding the properties of simple cells in the primary visual cortex (or V1 in short) [10]. Olshausen's work shows that localised, oriented and band-pass filters are similar to the receptive fields of simple cells. These filters can be trained by maximising the sparseness of the response when sources are separated as independent components, namely the independent component analysis (ICA) framework [11]. The filters obtained using the ICA framework exhibit sparse activities similar to the simple cells in V1, and these responses can be used as image features [12], [13]. These features are arguably non-heuristic as they are derived from the statistics in the data and have been adopted to improve computer vision systems. For example, Hou et al. use the ICA filters for saliency detection [12]. Here, they derive a saliency map according to the entropy change of the filtered pixel intensity. Another example is the work from Kanan et al. which employs ICA filters in the object classification domain [13]. They obtain a better classification performance than other classification systems by using the saliency map generated by ICA filters.
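As a small illustration of what "sparseness of the response" means here, the sketch below scores a list of filter responses by excess kurtosis, a classical sparseness measure in the ICA literature. It is an assumption made for illustration: this excerpt does not state which sparseness objective the paper optimises, and the function name `excess_kurtosis` and the toy response lists are hypothetical.

```python
def excess_kurtosis(responses):
    """Excess kurtosis of a list of filter responses (0 for a Gaussian)."""
    n = len(responses)
    mean = sum(responses) / n
    var = sum((r - mean) ** 2 for r in responses) / n
    if var == 0.0:
        raise ValueError("responses are constant; kurtosis is undefined")
    fourth = sum((r - mean) ** 4 for r in responses) / n
    return fourth / var ** 2 - 3.0


# Toy data (hypothetical): a sparse response is silent most of the time
# and strongly active only occasionally.
sparse = [0.0] * 98 + [5.0, -5.0]
# A dense response is active at the same magnitude all the time.
dense = [(-1.0) ** i for i in range(100)]

print(excess_kurtosis(sparse) > excess_kurtosis(dense))
```

A filter whose responses score higher on such a measure fires rarely but strongly, which is the behaviour attributed to simple cells in V1 above.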

In our previous work [14], we show that it is possible to achieve high accuracy for classifying HEp-2 cell images by using the ICA filters. We learn filters via the ICA framework from unlabelled cell images, then obtain the responses by filtering the input image. The image descriptor is computed based on these responses. To this end, we extract multiple cubic regions from the responses and stack them into the feature collection matrix which represents the image. We use the SVM classifier in conjunction with the histogram correlation kernel to do the classification. In this paper, we call this method the “Cell Patterns” (CP) approach. Despite the performance improvement exhibited by the CP approach, it is still limited by the fact that the ICA framework requires sufficient training data in order to construct representative filters [15].
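The descriptor pipeline outlined above can be sketched roughly as follows: filter the image with each learned filter, cut the same window out of every response map to form a "cubic region", and compare two such regions by correlation. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the square-filter restriction, the window placement, and the use of plain Pearson correlation as the similarity are all illustrative choices, and the SVM step is omitted.

```python
import math


def filter_response(image, kernel):
    """Valid-mode cross-correlation of a 2-D image with one square 2-D filter."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(k) for j in range(k))
            for x in range(w - k + 1)
        ]
        for y in range(h - k + 1)
    ]


def cubic_region(responses, top, left, size):
    """Flatten a size x size window taken at the same place in every response map."""
    return [
        resp[y][x]
        for resp in responses
        for y in range(top, top + size)
        for x in range(left, left + size)
    ]


def pearson(a, b):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    if sa == 0.0 or sb == 0.0:
        return 0.0
    return cov / (sa * sb)


# Toy example: a 4x4 image and two hypothetical 2x2 filters.
image = [[y * 4 + x for x in range(4)] for y in range(4)]
filters = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
responses = [filter_response(image, f) for f in filters]
region = cubic_region(responses, top=0, left=0, size=2)
print(len(region), pearson(region, region))
```

In a full system, many such regions per image would be stacked into the feature collection matrix, and a correlation-based similarity would be supplied to the SVM as a kernel.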

In this work, we conduct further evaluations on the properties of the receptive fields learned from HEp-2 cells in the ICA framework. In particular, we aim to generalise the properties of HEp-2 cell patterns through a mathematical model, and to improve the receptive field models by using these properties to enrich HEp-2 cell image patterns. To this end, we introduce a novel learning model based on “spontaneous activity patterns” (SAP). Spontaneous activity patterns, proposed by Albert et al. [16], are in general a set of random patterns produced by a generative model (e.g., percolation patterns [17] and randomly positioned Gaussian spots [18]). From the point of view of neurology, the SAP models are similar to the stimuli involved in the development of structured receptive fields. These receptive fields exist in the visual systems of many newborn animals before they first open their eyes [19]. To our knowledge, this is the first work to propose an approach exploiting SAP to improve receptive field models for a pattern recognition problem.

Contributions: Our contributions are as follows:

  • 1. We extend the SAP pattern [18] to a generative model of randomly positioned Gaussian spots with different sizes.

  • 2. We propose to use generated SAP image patches in addition to the training patches for obtaining the filters via the ICA framework.

  • 3. We demonstrate that applying the proposed mixture model improves the performance of our previous work.

  • 4. The proposed framework demonstrates robust classification results in different datasets taken from two laboratories.
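The first two contributions can be illustrated with a short sketch: generate patches of randomly positioned Gaussian spots of varying size, then pool them with patches taken from cell images before filter learning. Everything specific here is an assumption made for illustration, since this excerpt does not give the paper's settings: the patch size, the number of spots, the uniform sampling of centres and widths, the normalisation to [0, 1], and the names `make_sap_patch` and `mix_patches` are all hypothetical.

```python
import math
import random


def make_sap_patch(size=16, n_spots=3, sigma_range=(1.0, 3.0), rng=None):
    """Return a size x size patch (list of rows) of randomly placed Gaussian spots."""
    rng = rng or random.Random()
    patch = [[0.0] * size for _ in range(size)]
    for _ in range(n_spots):
        # Spot centre and width are drawn uniformly; both ranges are assumptions.
        cx = rng.uniform(0, size - 1)
        cy = rng.uniform(0, size - 1)
        sigma = rng.uniform(*sigma_range)
        for y in range(size):
            for x in range(size):
                d2 = (x - cx) ** 2 + (y - cy) ** 2
                patch[y][x] += math.exp(-d2 / (2.0 * sigma ** 2))
    # Rescale so the brightest pixel is 1.0 (an assumed normalisation).
    peak = max(max(row) for row in patch)
    return [[v / peak for v in row] for row in patch]


def mix_patches(cell_patches, n_sap, size=16, seed=0):
    """Append n_sap generated SAP patches to the cell patches and shuffle the pool."""
    rng = random.Random(seed)
    pool = list(cell_patches)
    pool += [make_sap_patch(size, rng=rng) for _ in range(n_sap)]
    rng.shuffle(pool)
    return pool


# Toy usage: three blank stand-in "cell" patches mixed with five SAP patches.
cell_patches = [[[0.0] * 8 for _ in range(8)] for _ in range(3)]
training_pool = mix_patches(cell_patches, n_sap=5, size=8)
print(len(training_pool))
```

The mixed pool would then be passed to whatever ICA implementation is used for filter learning. The ratio of SAP patches to cell patches is a free parameter in this sketch.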

We continue the paper in the following four sections. Firstly, we introduce the computer vision models adopted in this work and the related concepts of biological vision in neurology in Section 2. Secondly, we describe the details of the proposed CAD system for HEp-2 cells classification in Section 3. Thirdly, we explain our experiments on two publicly available datasets: ICPRContest and SNPHEp-2, then compare the classification result of the proposed model to three recently developed CAD systems in Section 4. Finally, we draw conclusions from this work in Section 5.

Section snippets

Preliminaries

In the following sections, we present some preliminaries of image representation, independent component analysis and the related theories in biological visual neurons. In Section 2.1, we discuss the statistical representation of an image based on a collection of linear bases, followed by the introduction of visual learning model using independent component analysis (in Section 2.2). In Section 2.3, we explain the properties of receptive fields in the mammalian visual system, and the development

Proposed framework for HEp-2 cell classification

The proposed approach contains four steps:

  • 1. The SAP model generates SAP image patches.

  • 2. The mixture of SAP patches and patches extracted from HEp-2 cell images is fed to the ICA model in order to obtain the set of filters (i.e., filter bank).

  • 3. The ICA filter bank learned from the mixture is used to process each cell image. We scan images to obtain “cubic-regions” from the filter responses and apply a pyramid grid to collect average filter responses in the grids.

  • 4. The descriptor of each cell image,

Experiments

We introduce the two HEp-2 cell datasets used in our experiments. Then, the proposed system is evaluated with various settings and compared with three recently proposed systems. Our implementation uses the OpenCV C++ library and the LIBSVM tools.

Main findings

In this paper, we proposed a novel HEp-2 cell image classification approach which learns a set of filters inspired by the receptive fields of the mammalian visual system. Mammalian receptive fields have characteristics similar to those of filters obtained by the ICA model. Nevertheless, the ICA model requires a sufficiently large amount of training data, which may not always be available. We adapted the concept of spontaneous neural stimuli to address this issue. Technically, we proposed a

Conflict of interest

None declared.

Acknowledgements

This research was partly funded by Sullivan Nicolaides Pathology, Australia and the Australian Research Council (ARC) Linkage Projects Grant LP130100230.

References (55)

  • P. Elbischger, S. Geerts, K. Sander, G. Ziervogel-Lukas, P. Sinah, Algorithmic framework for HEp-2 fluorescence pattern...
  • P. Strandmark, J. Ulén, F. Kahl, Hep-2 staining pattern classification, in: International Conference on Pattern...
  • T. Hsieh, Y. Huang, C. Chung, Y. Huang, Hep-2 cell classification in indirect immunofluorescence images, in:...
  • P. Soda et al., Aggregation of classifiers for staining pattern recognition in antinuclear autoantibodies analysis, IEEE Trans. Inf. Technol. Biomed. (2009)
  • I. Theodorakopoulos, D. Kastaniotis, G. Economou, S. Fotopoulos, Hep-2 cells classification via fusion of morphological...
  • A. Wiliem, Y. Wong, C. Sanderson, P. Hobson, S. Chen, B. Lovell, Classification of human epithelial type 2 cell...
  • A. Hyvärinen et al. (2009)
  • B. Olshausen, Emergence of simple-cell receptive field properties by learning a sparse code for natural images, Nature (1996)
  • X. Hou et al., Dynamic visual attention: searching for coding length increments, Adv. Neural Inf. Process. Syst. (2008)
  • C. Kanan, G. Cottrell, Robust classification of objects, faces, and flowers using natural image statistics, in: IEEE...
  • Y. Yang, A. Wiliem, A. Alavi, P. Hobson, Classification of human epithelial type 2 cell images using independent...
  • M.V. Albert et al., Innate visual learning through spontaneous activity patterns, PLoS Comput. Biol. (2008)
  • D. Stauffer et al., Introduction to Percolation Theory (1994)
  • J.J. Hunt et al., Sparse coding on the spot: spontaneous retinal waves suffice for orientation selectivity, Neural Comput. (2012)
  • T. Ohshiro et al., Development of cortical orientation selectivity in the absence of visual experience with contour, J. Neurophysiol. (2011)
  • X.S. Zhou, T.S. Huang, Image retrieval: feature primitives, feature representation, and relevance feedback, in: IEEE...
  • S. Li, X. Hou, H. Zhang, Q. Cheng, Learning spatially localized, parts-based representation, in: Proceedings of the...

    Yan Yang is a PhD research student at the University of Queensland and the NICTA Queensland research lab. She received a bachelor's degree in aircraft manufacturing engineering from Nanjing University of Aeronautics and Astronautics in 2002. Yan has worked as a graduate researcher at Queensland Research Laboratory (QRL), NICTA (2008 to present), and as a visiting scholar at the National Institute of Informatics (NII), Tokyo, Japan (2012). She has previously worked on surveillance video analysis, person re-identification, and robust classification on image/video databases. Her research interests are in the areas of machine learning, pattern recognition, and computer vision.

    Arnold Wiliem is a research fellow at the University of Queensland. He received his PhD in 2010 from Queensland University of Technology and his bachelor of Computer Science in 2006 from University of Indonesia. He previously worked on automated medical image analysis at NICTA for two years before joining the University of Queensland to pursue his research in the digital pathology domain. His current research interests are in the areas of machine learning, pattern recognition, and computer vision. He is a member of IEEE.

    Azadeh Alavi is a PhD research student at the University of Queensland and the NICTA Queensland research lab. She obtained her Bachelor of Applied Mathematics degree in 2002 and worked in industry for about 2 years before commencing her Master of IT-advanced program (Research Stream) at Griffith University. Her master's thesis was on automating the classification of dopaminergic neuron images of rodent brain. She was employed by Griffith University as a member of a research team working on automated wave height and period detection and prediction, as well as automated people counting in beach areas and shoreline detection and prediction. She also worked as a researcher in business for a Brisbane-based company to automate part of their professional work. Her interests are in the areas of machine learning, pattern recognition, cell image classification, image processing, mathematics, and statistics.

    Brian C. Lovell was born in Brisbane, Australia in 1960. He received the BEng in electrical engineering in 1982, the BSc in computer science in 1983, and the PhD in signal processing in 1991: all from the University of Queensland (UQ). He was Research Leader in National ICT Australia (2006–2012) and Research Director of the Security and Surveillance Research group in the School of ITEE, UQ. He was President of the Australian Pattern Recognition Society 1995–2005, Senior Member of the IEEE, Fellow of the World Innovation Forum, Fellow of the IEAust, and voting member for Australia on the governing board of the International Association for Pattern Recognition since 1998. He was Technical Co-chair of ICPR2006 in Hong Kong (Computer Vision and Image Analysis), and Program Co-chair of ICPR2008 in Tampa, Florida. He serves on the Editorial Board of Pattern Recognition Letters and reviews for many of the major journals in the fields of Computer Vision and Pattern Recognition. In March 2005, he was awarded Number 1 author at UQ with almost 35,000 copies of his papers downloaded from the UQ library archive. His research interests are currently focused on intelligent surveillance techniques, optimal image segmentation, real-time video analysis, and face recognition.

    Peter Hobson graduated from Queensland Institute of Technology with a Bachelor of Applied Science in Medical Technology in 1983. Peter has worked at Sullivan Nicolaides Pathology (SNP) his whole career in various disciplines including Histopathology, Haematology and is currently the laboratory manager of Immunology, Infectious Serology and the Bone Marrow Transplant departments. He has collaborated with biotech company PanBio in assay development, the University of Queensland Environmental Toxicology (EnTox) and clinical trials with Esoterix laboratories and the Queensland Institute of Medical Research. He is responsible for numerous innovative IT projects within SNP. Currently he is collaborating with the University of Queensland to develop innovative digital imaging applications for microscopy.
