Elsevier

Image and Vision Computing

Volume 113, September 2021, 104249

Introducing the structural bases of typicality effects in deep learning

https://doi.org/10.1016/j.imavis.2021.104249

Abstract

In this paper, we hypothesize that the effects of the degree of typicality in natural semantic categories can be generated from the structure of artificial categories learned with deep learning models. Motivated by the human approach to representing natural semantic categories and building on the foundations of Prototype Theory, we propose a novel Computational Prototype Model (CPM) to represent the internal structure of semantic categories. Unlike other prototype learning approaches, our mathematical framework is a first approach to providing deep neural networks with the ability to model abstract semantic concepts such as the central semantic meaning of a category, the typicality degree of an object's image, and the family resemblance relationship. We propose several methodologies based on the concept of typicality to evaluate our CPM-model in image semantic processing tasks such as image classification, global semantic description, and transfer learning. Our experiments on different image datasets, such as ImageNet and COCO, showed that our approach might be an admissible proposition in the effort to endow machines with greater power of abstraction for the semantic representation of objects' categories.

Introduction

Memory is one of the most amazing faculties of the human being and is construed as the brain's ability to encode, store, and retrieve information [1], [2], [3], [4]. For decades, understanding and simulating the basis of human learning, cognitive processing, perception, and vision has motivated the machine intelligence field. In recent years, the Computer Vision and Image Processing fields have developed pattern recognition methods with impressive performance on some specific image interpretation tasks. However, these methods still lack other capabilities when compared with human proficiency.

Convolutional Neural Networks (CNNs) have outperformed the traditional methods [5], [6] used for image feature representation, and CNN-methods are now the leading approaches in semantic image processing tasks such as object recognition [7], semantic segmentation [8], and object description [9]. Although state-of-the-art CNN-methods have achieved remarkable results, many challenges remain in attaining the discriminative power and the abstraction of human memory (e.g., semantic memory [2], [3]) to represent the semantics of visually acquired information. How can we emulate the behavior of human memory in representing learned knowledge of objects' features? How can we extract and encode such features to encapsulate the meaning of a specific object? How can we infer or ascribe semantics to objects? How can we represent an image's meaning and its associated phenomena? The quest to answer some of these questions still occupies many researchers.

Object typicality effects are among the semantic phenomena that are difficult to capture, and they remain challenging for image computing. The typicality concept refers to the degree to which the objects under study are considered good examples of their category [10], [11]. For example, the pigeon is a typical member of the bird category since it has several representative features: it can fly, has feathers and a beak, lays eggs, and builds a nest. On the other hand, the penguin is an atypical member since it satisfies only some of these features. A glance is enough for human beings to perform this type of semantic ranking within the category. In contrast, machines still lack the ability to capture this semantic phenomenon among objects that belong to the same category.
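The pigeon and penguin contrast can be made concrete with a toy feature count. The sketch below is only an illustration under stated assumptions: the feature names, the feature sets assigned to each animal, and the functions `typicality_score` and `rank_by_typicality` are hypothetical, and the score (the fraction of the category's representative features that an item has) is a simple stand-in, not the typicality measure proposed in this paper.

```python
# Hypothetical representative features of the "bird" category,
# taken from the pigeon example in the text.
BIRD_FEATURES = {"flies", "has_feathers", "has_beak", "lays_eggs", "builds_nest"}

# Hypothetical feature sets for two category members (illustrative only).
MEMBERS = {
    "pigeon": {"flies", "has_feathers", "has_beak", "lays_eggs", "builds_nest"},
    "penguin": {"has_feathers", "has_beak", "lays_eggs", "swims"},
}


def typicality_score(item_features, category_features):
    """Fraction of the category's representative features that the item has."""
    return len(item_features & category_features) / len(category_features)


def rank_by_typicality(members, category_features):
    """Return member names sorted from most to least typical."""
    return sorted(
        members,
        key=lambda name: typicality_score(members[name], category_features),
        reverse=True,
    )


if __name__ == "__main__":
    for name in rank_by_typicality(MEMBERS, BIRD_FEATURES):
        print(name, typicality_score(MEMBERS[name], BIRD_FEATURES))
```

Under these assumed feature sets, both animals remain members of the category, but the pigeon ranks above the penguin, which is the within-category ordering the paragraph describes.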

The argument that category membership is a matter of degree came from cognitive psychology with the seminal studies of Rosch and colleagues [10], [11], [12], [13], [14], who referred to membership degree as typicality. In [11], Rosch introduced the concept of the semantic prototype and presented an in-depth analysis of the internal semantic structure of the category. Rosch [11] holds that the representation of a category's semantic meaning is related to the category prototype, particularly for categories denoting natural objects. The Prototype Theory [10], [11], [12], [13], [14], [15] proposes that human beings think of categories in terms of abstractions (prototypes), represented by typical category members. This Theory also indicates that the successful execution of object classification and description tasks in the human brain is inherently related to the learned category prototype.

Prototype learning is a representative approach among pattern recognition methods. It has been used in image processing tasks such as Face Recognition [16], [17], Image Segmentation [18], [19], Static Hand Gesture Recognition [17], Few-Shot Learning [20], [21], [22], [23], Clustering [24], Robust Image Classification [25], [26], [27], [28], [29], [30], and CNN Interpretation [31], [32]. Even though these works proposed a wide variety of methods for prototype learning, most of them use prototype learning to improve the performance or robustness of a specific task. As far as we know, little attention has been paid to using the prototype to capture other semantic properties of the object image, such as its typicality, an omission that is all the more notable given that the prototype is itself based on the notion of typicality [11], [12], [14], [15].
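For readers unfamiliar with prototype learning, a generic nearest-prototype classifier gives the flavor of the idea. The following is a minimal sketch of that common baseline (one prototype per class, computed as the mean feature vector, with classification by smallest Euclidean distance); it is not the CPM proposed in this paper, and the function names and toy data are assumptions made for illustration.

```python
import math


def class_prototypes(samples, labels):
    """Compute one prototype per class as the mean of its feature vectors."""
    sums, counts = {}, {}
    for vec, label in zip(samples, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def nearest_prototype(vec, prototypes):
    """Return the label whose prototype is closest in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(vec, prototypes[label]))


if __name__ == "__main__":
    # Toy 2-D features for two hypothetical classes "a" and "b".
    samples = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
    labels = ["a", "a", "b", "b"]
    protos = class_prototypes(samples, labels)
    print(protos)
    print(nearest_prototype([1.0, 1.0], protos))
```

A classifier of this kind only returns the winning label; the distance to the prototype, which could say something about how typical the sample is, is discarded.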

Introducing typicality into image processing can increase the generalization power of pattern recognition models [14], [26], [27]. Rosch's experiments [14] showed that when humans learn a category by looking at its most typical samples, they can better recognize new members. Also, some authors [26], [27] showed that CNN-models could not generalize to atypical images that are substantially different from the training images. When a typicality measure is involved in the learning process, image classification performance can be improved [26], [27]. Moreover, incorporating typicality learning into the category learning process would allow machines not only to categorize images but also to know their degree of membership (is it a typical, atypical, or borderline image?), a type of semantic representation of object image categories that is currently achievable only by human beings.

This paper relies on cognitive semantic studies related to the Prototype Theory to propose a new perspective for modeling the central semantic meaning of object categories: the prototype. Unlike other prototype learning approaches [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32], we use our prototype representation to capture the typicality and category membership degree of object images. Our proposal adopts the typicality concept from cognitive psychology, assuming that it makes it possible to obtain a more natural and interpretable representation of the semantics of an object's image. Specifically, we propose a mathematical framework that endeavors to represent the semantic definition of object categories and, consequently, to capture the phenomenon of object typicality. To evaluate our proposal in real-world tasks, we also propose a procedure to introduce our prototype's semantic representation and our typicality measure into the global semantic description of object images. Furthermore, we propose a CNN-layer architecture to evaluate our proposal in classification and transfer learning tasks. Fig. 1 shows the intuition and the basic conceptual steps for applying our prototype framework to classification and description models.

Section snippets

Prototype theory

The Prototype Theory [10], [11], [12], [13], [14], [15] analyzes the internal structure of semantic categories and proposes categorization based on the prototype. This Theory postulates that semantic categories are not homogeneous structures. According to experimental evidence [10], [11], [12], [13], semantic categories should be considered heterogeneous structures whose members, and their respective features, do not all have the same relevance within the category.

Rosch and colleagues [10],

Computational prototype model

Rosch's experiments [11], [13] showed that the heterogeneous internal structure of natural semantic categories relates to the concepts of prototype (category core-meaning), typicality, and family resemblance. A family resemblance relationship [11], [13] consists of a set of items of the form AB, BC, CD, DE; i.e., each item has one or more attributes in common with one or more other items, but no attribute needs to be common to all items [11]. The abstract nature of these semantic concepts has made
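The AB, BC, CD, DE pattern quoted above from [11] can be checked mechanically. The sketch below is a minimal illustration, assuming each item is represented as a set of single-letter attributes; the function names `shares_with_another` and `common_to_all` are hypothetical and are not part of the CPM.

```python
# Items from the text, each written as its set of attributes.
ITEMS = [set("AB"), set("BC"), set("CD"), set("DE")]


def shares_with_another(items):
    """True if every item has at least one attribute in common with some other item."""
    return all(
        any(item & other for j, other in enumerate(items) if j != i)
        for i, item in enumerate(items)
    )


def common_to_all(items):
    """Attributes present in every item (may be empty in a family resemblance)."""
    return set.intersection(*items)


if __name__ == "__main__":
    print(shares_with_another(ITEMS))  # every item overlaps with a neighbor
    print(common_to_all(ITEMS))        # no single attribute is shared by all
```

For these four items the first check holds while the shared-attribute set is empty, which is exactly the situation in which a category cannot be defined by a single necessary feature.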

CPM model semantics in classification and description of object images

Rosch's experiments [10], [11], [12], [13], [14] indicated that category prototypes are cognitive reference points in constructing concepts. We apply the Prototype Theory as a theoretical foundation to represent the semantics of the visual information contained in the basic components of a scene: objects. The observations on the Prototype Theory raise the following question: How can the category's prototype and object's image typicality be included as a reference point in the object global semantic

Datasets

We conducted our experiments on five image datasets. The off-line prototype computation process and the CPM-model representation were conducted using the MNIST [43], CIFAR10, CIFAR100 [44], and ImageNet [41] datasets. PS-Layer classification performance was evaluated using MNIST [43], CIFAR10, and CIFAR100 [44]. We evaluated the prototype-based transfer learning performance and our GSDP-descriptor performance using ImageNet [41] and COCO [42] as real-image datasets.

Backbone networks

We evaluated our CPM-model

Conclusion

In this paper, we introduced a Computational Prototype Model (CPM) based on Prototype Theory foundations. Our approach provides another point of view for representing the internal semantic structure of object categories, and it draws on results from experimental psychology to model the typicality degree of an object's image, a semantic property of images that current prototype learning approaches had not yet analyzed.

We presented a straightforward Prototypical Similarity

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This research was supported by funding from the Brazilian agencies CAPES, CNPq, and FAPEMIG.

References (59)

  • K. Simonyan et al., Very Deep Convolutional Networks for Large-Scale Image Recognition, arXiv preprint
  • J. Long et al., Fully convolutional networks for semantic segmentation
  • L. Li et al., End-to-end learning local multi-view descriptors for 3D point clouds
  • E. Rosch, Cognitive representations of semantic categories, J. Exp. Psychol. Gen. (1975)
  • E. Rosch et al., Structural bases of typicality effects, J. Exp. Psychol. Hum. Percept. Perform. (1976)
  • E. Rosch, Principles of categorization
  • D. Geeraerts, Theories of Lexical Semantics (2010)
  • M. Ma et al., Prototype based feature learning for face image set classification
  • O.K. Oyedotun et al., Prototype-incorporated emotional neural network, IEEE Trans. Neural Networks Learn. Syst. (2018)
  • B. Ojeda-Magaña et al., Pattern recognition in numerical data sets and color images through the typicality based on the GKPFCM clustering algorithm, Math. Probl. Eng. (2013)
  • N. Dong et al., Few-shot semantic segmentation with prototype learning
  • S. Jetley et al., Prototypical priors: From improving classification to zero-shot learning
  • J. Snell et al., Prototypical networks for few-shot learning
  • S. Fort, Gaussian Prototypical Networks for Few-Shot Learning on Omniglot
  • K. Allen et al., Infinite mixture prototypes for few-shot learning
  • H. Zhao et al., Clustering data and vague concepts using prototype theory interpreted label semantics
  • P. Wohlhart et al., Optimizing 1-nearest prototype classifiers
  • B. Saleh et al., Object-centric anomaly detection by attribute-based reasoning
  • B. Saleh et al., Incorporating prototype theory in convolutional neural networks
