Domains and context: First steps towards managing diversity in knowledge
Introduction
Semantics is core in many knowledge management applications, such as natural language data and metadata understanding [20], [22], [23], [24], natural language driven image generation [54], abstract reasoning [55], [56], converting classifications into formal ontologies [7], [27], [28], automatic classification [25], [26], ontology matching [17], [18], [19] and semantic search [29]. However, despite the progress made, one of the main barriers towards the success of these applications is the lack of background knowledge. In fact, as underlined by several studies (see for instance [8], [9], [10], [11], [51]) without high quality and contextually relevant background knowledge it is impossible to achieve accurate enough results.
Dealing with this problem has turned out to be very difficult. On the one hand, in order to provide all the possible meanings of words and how they are related to each other, the background knowledge should be very large and virtually unbounded. On the other hand, it should be context sensitive and able to capture the diversity of the world. The world is extremely diverse, and diversity is visibly manifested in language, data and knowledge. The same real world object can be referred to with many different words in different communities and in different languages. For instance, it is widely known that in some Nordic circumpolar groups of people the notion of snow is denoted with hundreds of different words in the local language carrying very fine grained distinctions [1]. This phenomenon is often a function of the role and importance of the real world object in the life of a community. Conversely, the same word may denote different notions in different domains; for instance, bug denotes an insect in entomology and a failure or defect in a computer program in computer science. Space, time, individual goals, needs, competences, beliefs, culture, opinions and personal experience also play an important role in characterizing the meaning of a word. Diversity is an intrinsic property of the world and as such it cannot be avoided. At the same time, diversity is a local maximum since it aims at minimizing the effort and maximizing the gain [35].
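The domain dependence of word meaning can be illustrated with a minimal sketch. The lexicon below is a hypothetical toy structure built only from the bug example above; the names `LEXICON` and `sense_of` are assumptions made for illustration and are not part of the knowledge base described in this paper.

```python
# Hypothetical toy lexicon: each word maps to one sense per domain.
# The entries mirror the "bug" example in the text; nothing else is assumed.
LEXICON = {
    "bug": {
        "entomology": "insect",
        "computer science": "failure or defect in a computer program",
    },
}


def sense_of(word, domain):
    """Return the sense of `word` in `domain`, or None if it is not recorded."""
    return LEXICON.get(word.lower(), {}).get(domain)


if __name__ == "__main__":
    print(sense_of("bug", "entomology"))
    print(sense_of("bug", "computer science"))
```

Keying senses by domain keeps each domain's vocabulary separate, so a new domain can be added without touching existing entries.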
Our approach is to take into account this diversity and exploit it to make explicit the local semantics, i.e. the meaning of words in a certain context, such that information becomes unambiguous to humans as well as to machines. Towards this goal a preliminary step is the creation of a diversity-aware knowledge base. This requires appropriate methodologies for its representation, construction and maintenance. With this purpose, we propose and adapt the faceted approach, a well-established methodology used in library science for the organization of knowledge in libraries [21]. In this paper, we describe the fundamental notions of domain and its components, called facets, which allow capturing diversity and, at the same time, allow for an incremental growth of the knowledge base.
The rest of the paper is organized as follows. In Section 2, we explain the main steps of our approach by taking semantic matching as an example. Semantic matching has been chosen because of its intrinsic importance, witnessed by the large amount of research and publications in this area, and also because it was the main motivation which originally led us to the problem of managing diversity. In Section 3, we provide the definitions of domain and facet, present the corresponding data model and describe their fundamental properties. In Section 4, we provide our definition of context and explain how to build and use it at run-time by selecting from the background knowledge the language and knowledge of the domains which are relevant to the problem. In Section 5, we briefly describe the diversity-aware knowledge base that we have been developing. Section 6 focuses on the related work in terms of the notion of context (Section 6.1), methodologies for the construction and maintenance of domain knowledge (Section 6.2), and existing knowledge bases and the approaches followed for their construction (Section 6.3). Section 7 concludes the paper by summarizing the work done, listing the open problems and outlining future work.
Section snippets
Diversity-aware semantic matching
Consider the example in Fig. 1. It represents two very simple classifications that, for instance, might have been created by two different persons. Round nodes represent categories while rectangles exemplify annotated documents. Solid arrows between nodes represent sub-category relations while dashed arrows denote the fact that a document is categorized into a certain category. Corresponding labels are also given attached to nodes. Initially, we do not know the circumstance in which they were
Domains and facets
The methodology we propose for the construction of domain knowledge is mainly inspired by the faceted approach, a well-established technique introduced by the Indian librarian Ranganathan [21] at the beginning of the last century and used with profit in library science for building classificatory structures from atomic concepts which are analyzed into macro-categories and combined by the application of what in jargon is called the system syntax [47]. The methodology is centered on the
Building the context
Following [35], we define a context as a 4-tuple ⟨id, Lc, Kc, IA⟩, where:
- id is an identifier for the context
- Lc is the local (formal) language
- Kc is the local knowledge
- IA is a set of implicit assumptions.
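As a rough illustration, the four components of a context listed above can be represented as a small record, with the context built at run-time by selecting the language and knowledge of the relevant domains from the background knowledge. This is only a sketch under stated assumptions: the `BACKGROUND` dictionary, the triple encoding of knowledge, and the `build_context` helper are hypothetical and do not reproduce the paper's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    id: str                 # identifier for the context
    language: frozenset     # Lc: the local (formal) language, here a set of terms
    knowledge: frozenset    # Kc: the local knowledge, here a set of triples
    assumptions: frozenset  # IA: implicit assumptions, here the selected domains


# Hypothetical background knowledge, indexed by domain (illustrative only).
BACKGROUND = {
    "entomology": {
        "language": {"bug", "insect"},
        "knowledge": {("bug", "is-a", "insect")},
    },
    "computer science": {
        "language": {"bug", "defect", "program"},
        "knowledge": {("bug", "is-a", "defect")},
    },
}


def build_context(ctx_id, relevant_domains):
    """Collect the language and knowledge of the relevant domains only."""
    language, knowledge = set(), set()
    for domain in relevant_domains:
        entry = BACKGROUND[domain]
        language |= entry["language"]
        knowledge |= entry["knowledge"]
    return Context(
        ctx_id,
        frozenset(language),
        frozenset(knowledge),
        frozenset(relevant_domains),
    )


if __name__ == "__main__":
    ctx = build_context("c1", ["entomology"])
    print(sorted(ctx.language))
    print(sorted(ctx.knowledge))
```

Recording the selected domains as the implicit assumptions keeps the choice that shaped the local language and knowledge explicit, so the same background knowledge can yield different contexts for different problems.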
In the case of semantic matching, implicit assumptions consist of a selection of the domains which are relevant to understanding the meaning of the words in a certain framework. Our baseline algorithm for domain recognition consists of parsing node labels and documents in classifications, linking
Creating a diversity-aware knowledge base
We have been developing a framework and a diversity-aware knowledge base currently covering an initial set of domains necessary for the kinds of scenarios we need to serve, but – in the spirit of the proposed approach – extensible according to the local scope, purpose, language and personal experience.
The expressive power of the representation language of our background knowledge is that of propositional DL with only conjunctions, no negations and no disjunctions. The expressive power we
The notion of context
Based on two different approaches, the first formal theories on context were proposed by McCarthy [13] and Giunchiglia [3].
According to McCarthy, contexts are a way to partition knowledge into a limited set of locally true axioms with common assumptions. This set of axioms should be at the right level of abstraction thus excluding irrelevant details in order to simplify local reasoning as much as possible. This is known as the generality principle [12]. In this setting, it is always possible to
Conclusions and future work
In this paper, observing that the lack of background knowledge is one of the main obstacles to the success of semantics, we have stressed the need for a very large, virtually unbounded knowledge base able to capture the diversity of the world as well as to reduce the complexity of reasoning at run-time.
We have proposed the faceted approach, a well-established methodology centered on the fundamental notions of domain and facet and practiced with success in library science for
Acknowledgements
The research leading to these results has received funding from the European Community’s Seventh Framework Programme (FP7/2007–2013) under Grant agreement No. 231126 LivingKnowledge: LivingKnowledge – Facts, Opinions and Bias in Time. We want to express our gratitude to all the people working with us in the KnowDive group at University of Trento (http://disi.unitn.it/~knowdive/) for their contribution in the creation and population of the knowledge base and in particular to Ilya Zaihrayeu and
References (59)
- et al., Local model semantics or contextual reasoning = locality + compatibility, Artificial Intelligence (2001)
- Arctic Climate Impact Assessment, Cambridge University Press, 2005, p....
- B. Dutta, F. Giunchiglia, V. Maltese, A facet-based methodology for geo-spatial modelling, GEOS,...
- Contextual reasoning, Epistemologica – Special Issue on I Linguaggi e le Macchine (1993)
- L. Prusak, Knowledge in Organizations, in: M. Polanyi (Ed.), The tacit dimension, 1997 (chapter...
- F. Giunchiglia, B. Dutta, V. Maltese, Faceted lightweight ontologies, in: Conceptual Modeling: Foundations and...
- et al., Encoding classifications into lightweight ontologies, Journal of Data Semantics (2006)
- et al., Discovering missing background knowledge in ontology matching, European Conference on Artificial Intelligence ECAI (2006)
- B. Lauser, G. Johannsen, C. Caracciolo, J. Keizer, W.R. van Hage, P. Mayr, Comparing human and automatic thesaurus...
- P. Shvaiko, J. Euzenat, Ten Challenges for Ontology Matching, 7th Int. Conference on Ontologies, Databases, and...
- Generality in artificial intelligence, Communications of ACM
- Theories and uses of context in knowledge representation and reasoning, Journal of Pragmatics
- Ontology Matching
- Save up to 99% of your time in mapping validation, 9th International Conference on Ontologies
- Lightweight parsing of classifications into lightweight ontologies, ECDL
- Attempto controlled english meets the challenges of knowledge representation, reasoning, interoperability and user interfaces, FLAIRS Conference