Introducing the structural bases of typicality effects in deep learning
Introduction
Memory is one of the most remarkable human faculties and is construed as the brain's ability to encode, store, and retrieve information [1], [2], [3], [4]. For decades, understanding and simulating the basis of human learning, cognitive processing, perception, and vision has motivated the field of machine intelligence. In recent years, the Computer Vision and Image Processing fields have developed pattern recognition methods that perform impressively on specific image interpretation tasks. However, these methods still lack other capabilities that humans possess.
Convolutional Neural Networks (CNNs) have outperformed the traditional methods [5], [6] used for image feature representation, and CNN-based methods are now the leading approaches in semantic image processing tasks such as object recognition [7], semantic segmentation [8], and object description [9]. Although state-of-the-art CNN-based methods have achieved remarkable results, they are still far from attaining the discriminative power and abstraction of human memory (e.g., semantic memory [2], [3]) in representing the semantics of visually acquired information. How can we emulate the behavior of human memory when representing learned knowledge of objects' features? How can we extract and encode such features to encapsulate the meaning of a specific object? How can we infer or ascribe semantics to objects? How can we represent an image's meanings and its phenomena? Answering some of these questions still occupies the research agenda of many investigators.
Object typicality effects are among the semantic phenomena that are difficult to capture and remain challenging for image computing. Typicality refers to the degree to which an object is considered a good example of its category [10], [11]. For example, the pigeon is a typical member of the bird category because it has several representative features: it can fly, has feathers and a beak, lays eggs, and builds a nest. The penguin, by contrast, is an atypical member because it has only some of these features. A glance is enough for a human being to perform this kind of semantic ranking within a category. Machines, in contrast, still lack the ability to capture this semantic phenomenon among objects that belong to the same category.
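As a toy illustration of typicality as graded membership, the sketch below scores each bird by the fraction of the category's representative features it has. This is not the typicality measure proposed in this paper; the function name `typicality`, the feature labels, and the feature sets assigned to each bird are hypothetical and chosen only to mirror the pigeon and penguin example.

```python
# Representative features of the "bird" category, mirroring the example in the text.
BIRD_FEATURES = frozenset(
    {"flies", "has_feathers", "has_beak", "lays_eggs", "builds_nest"}
)


def typicality(member_features, category_features=BIRD_FEATURES):
    """Return the fraction of the category's representative features a member has."""
    shared = category_features & set(member_features)
    return len(shared) / len(category_features)


# Hypothetical feature assignments, for illustration only.
members = {
    "pigeon": {"flies", "has_feathers", "has_beak", "lays_eggs", "builds_nest"},
    "penguin": {"has_feathers", "has_beak", "lays_eggs", "swims"},
}

# Rank the members from most to least typical.
ranking = sorted(members, key=lambda name: typicality(members[name]), reverse=True)
for name in ranking:
    print(f"{name}: {typicality(members[name]):.2f}")
```

Under these assumed feature sets the pigeon ranks above the penguin, which is the within-category ordering that humans produce at a glance. A binary class label alone carries no such ordering.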
The argument that category membership is a matter of degree came from cognitive psychology with the seminal studies of Rosch and colleagues [10], [11], [12], [13], [14], who referred to membership degree as typicality. In [11], Rosch introduced the concept of the semantic prototype and presented an in-depth analysis of the internal semantic structure of categories. Rosch [11] holds that the representation of a category's semantic meaning is related to the category prototype, particularly for categories denoting natural objects. The Prototype Theory [10], [11], [12], [13], [14], [15] proposes that human beings think of categories in terms of abstractions (prototypes), represented by typical category members. This Theory also indicates that the successful execution of object classification and description tasks in the human brain is inherently related to the learned category prototype.
Prototype learning is a representative approach among pattern recognition methods. It has been used in image processing tasks such as Face Recognition [16], [17], Image Segmentation [18], [19], Static Hand Gesture [17], Few-Shot Learning [20], [21], [22], [23], Clustering [24], Robust Image Classification [25], [26], [27], [28], [29], [30], and CNN Interpretation [31], [32]. Although these works propose a wide variety of prototype learning methods, most focus on using prototypes to improve the performance or robustness of a specific task. As far as we know, little attention has been paid to using the prototype to capture other semantic properties of an object's image, such as its typicality, even though the prototype itself is based on the notion of typicality [11], [12], [14], [15].
Introducing typicality into image processing can increase the generalization power of pattern recognition models [14], [26], [27]. Rosch's experiments [14] showed that humans who learn a category from its most typical samples recognize new members better. In addition, some authors [26], [27] showed that CNN models fail to generalize to atypical images that differ substantially from the training images, and that involving a typicality measure in the learning process can improve image classification. Moreover, including typicality in category learning would allow machines not only to categorize images but also to know their degree of membership (is an image typical, atypical, or borderline?), a kind of semantic representation of object image categories that so far only human beings achieve.
This paper relies on cognitive semantic studies related to the Prototype Theory to propose a new perspective for modeling the central semantic meaning of object categories: the prototype. Unlike other prototype learning approaches [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32], we use our prototype representation to capture the typicality and category membership degree of object images. Our proposal adopts the typicality concept from cognitive psychology, on the assumption that it yields a more natural and interpretable representation of the semantics of an object's image. Specifically, we propose a mathematical framework that aims to represent the semantic definition of object categories and, consequently, to capture the phenomenon of object typicality. To evaluate our proposal in real-world tasks, we also propose a procedure for introducing our prototype's semantic representation and our typicality measure into the global semantic description of object images. Furthermore, we propose a CNN-layer architecture to evaluate our proposal in classification and transfer learning tasks. Fig. 1 shows the intuition and the basic conceptual steps for applying our prototype framework to classification and description models.
Section snippets
Prototype theory
The Prototype Theory [10], [11], [12], [13], [14], [15] analyzes the internal structure of semantic categories and proposes categorization based on the prototype. This Theory postulates that semantic categories are not homogeneous structures. According to experimental evidence [10], [11], [12], [13], semantic categories should be considered heterogeneous structures whose members, and the features of those members, do not all have the same relevance within the category.
Rosch and colleagues [10],
Computational prototype model
Rosch's experiments [11], [13] showed that the heterogeneous internal structure of natural semantic categories relates to the concepts of prototype (the category's core meaning), typicality, and family resemblance. A family resemblance relationship [11], [13] consists of a set of items of the form AB, BC, CD, DE; i.e., each item has one or more attributes in common with one or more other items, but no attribute needs to be common to all items [11]. The abstract nature of these semantic concepts has made
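The AB, BC, CD, DE structure of a family resemblance relationship can be checked with a short sketch. It treats each letter as one attribute and verifies two properties: every item shares at least one attribute with some other item, and no attribute is common to all items. The helper names `shares_with_another` and `has_family_resemblance` are hypothetical and are not part of the model described in this paper.

```python
# Items from the text; each letter stands for one attribute.
items = ["AB", "BC", "CD", "DE"]
attribute_sets = [set(item) for item in items]


def shares_with_another(index, sets):
    """True if the item at `index` has an attribute in common with some other item."""
    return any(sets[index] & other for j, other in enumerate(sets) if j != index)


def has_family_resemblance(sets):
    """True if every item overlaps with at least one other item."""
    return all(shares_with_another(i, sets) for i in range(len(sets)))


# Attributes shared by every item; an empty set means no attribute is universal.
common_to_all = set.intersection(*attribute_sets)

print(has_family_resemblance(attribute_sets))
print(common_to_all)
```

For these four items the overlap check succeeds while the set of universally shared attributes is empty: the category holds together through chained overlaps, not through a single defining attribute.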
CPM model semantics in classification and description of object images
Rosch's experiments [10], [11], [12], [13], [14] indicated that category prototypes are cognitive reference points in constructing concepts. We apply the Prototype Theory as a theoretical foundation to represent the semantics of the visual information lying in the basic components of a scene: objects. The observations on the Prototype Theory raise the following question: How can the category's prototype and object's image typicality be included as a reference point in the object global semantic
Datasets
We conducted our experiments on five image datasets. The off-line prototype computation process and the CPM-model representation were conducted using the MNIST [43], CIFAR10, CIFAR100 [44], and ImageNet [41] datasets. PS-Layer classification performance was evaluated using MNIST [43], CIFAR10, and CIFAR100 [44]. We evaluated the prototype-based transfer learning performance and the performance of our GSDP-descriptor using ImageNet [41] and COCO [42] as real-image datasets.
Backbone networks
We evaluated our CPM-model
Conclusion
In this paper, we introduced a Computational Prototype Model (CPM) based on the foundations of the Prototype Theory. Our approach provides another point of view for representing the internal semantic structure of object categories, and it draws on results from experimental psychology to model the typicality degree of an object's image, a semantic property of images that current prototype learning approaches had not yet analyzed.
We presented a straightforward Prototypical Similarity
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment
This research was supported by funding from the Brazilian agencies CAPES, CNPq, and FAPEMIG.
References (59)
- et al., Human memory: a proposed system and its control processes, Psychol. Learn. Motiv. (1968)
- et al., Speeded-up robust features (SURF), Comp. Vision Image Underst. (CVIU) (2008)
- On the internal structure of perceptual and semantic categories
- et al., Family resemblances: studies in the internal structure of categories, Cogn. Psychol. (1975)
- et al., Towards explainable deep neural networks, Neural Netw. (2020)
- et al., Evaluation of prototype learning algorithms for nearest-neighbor classifier in application to handwritten character recognition, Pattern Recogn. (2001)
- Coding and representation: searching for a home in the brain
- et al., Semantic memory
- et al., Association of Pseudomonas putida formaldehyde dehydrogenase with superparamagnetic nanoparticles: an effective way of improving the enzyme stability, performance and recycling, New J. Chem. (2015)
- Distinctive image features from scale-invariant keypoints, Int. J. Comp. Vision (2004)
- Very Deep Convolutional Networks for Large-Scale Image Recognition, arXiv preprint
- Fully convolutional networks for semantic segmentation
- End-to-end learning local multi-view descriptors for 3d point clouds
- Cognitive representations of semantic categories, J. Exp. Psychol. Gen.
- Structural bases of typicality effects, J. Exp. Psychol. Hum. Percept. Perform.
- Principles of categorization
- Theories of Lexical Semantics
- Prototype based feature learning for face image set classification
- Prototype-incorporated emotional neural network, IEEE Trans. Neural Networks Learn. Syst.
- Pattern recognition in numerical data sets and color images through the typicality based on the gkpfcm clustering algorithm, Math. Probl. Eng.
- Few-shot semantic segmentation with prototype learning
- Prototypical priors: From improving classification to zero-shot learning
- Prototypical networks for few-shot learning
- Gaussian Prototypical Networks for Few-Shot Learning on Omniglot
- Infinite mixture prototypes for few-shot learning
- Clustering data and vague concepts using prototype theory interpreted label semantics
- Optimizing 1-nearest prototype classifiers
- Object-centric anomaly detection by attribute-based reasoning
- Incorporating prototype theory in convolutional neural networks