Domains and context: First steps towards managing diversity in knowledge

https://doi.org/10.1016/j.websem.2011.11.007

Abstract

Despite the progress made, one of the main barriers to the use of semantics is the lack of background knowledge. Dealing with this problem has turned out to be very difficult because, on the one hand, the background knowledge should be very large and virtually unbounded and, on the other hand, it should be context sensitive and able to capture the diversity of the world, for instance in terms of language and knowledge. Our proposed solution addresses the problem in three steps: (1) create an extensible diversity-aware knowledge base providing a continuously growing quantity of properly organized knowledge; (2) given the problem, build at run-time the proper context within which to perform the reasoning; (3) solve the problem. Our work is based on two key ideas. The first is that of using domains, i.e. a general semantic-aware methodology and technique for structuring the background knowledge. The second is that of building the context of reasoning by a suitable combination of domains. Our goal in this paper is to introduce the overall approach, show how it can be applied to an important use case, i.e. the matching of classifications, and describe our first steps towards the construction of a large-scale diversity-aware knowledge base.

Introduction

Semantics is central to many knowledge management applications, such as natural language data and metadata understanding [20], [22], [23], [24], natural language driven image generation [54], abstract reasoning [55], [56], converting classifications into formal ontologies [7], [27], [28], automatic classification [25], [26], ontology matching [17], [18], [19] and semantic search [29]. However, despite the progress made, one of the main barriers to the success of these applications is the lack of background knowledge. In fact, as underlined by several studies (see for instance [8], [9], [10], [11], [51]), without high-quality and contextually relevant background knowledge it is impossible to achieve sufficiently accurate results.

Dealing with this problem has turned out to be a very difficult task. On the one hand, in order to provide all the possible meanings of words and how they are related to each other, the background knowledge should be very large and virtually unbounded. On the other hand, the background knowledge should be context sensitive and able to capture the diversity of the world. The world is extremely diverse, and diversity is visibly manifested in language, data and knowledge. The same real-world object can be referred to with many different words in different communities and in different languages. For instance, it is widely known that in some Nordic circumpolar groups of people the notion of snow is denoted with hundreds of different words in the local language, carrying very fine-grained distinctions [1]. This phenomenon is often a function of the role and importance of the real-world object in the life of a community. Conversely, the same word may denote different notions in different domains; for instance, bug as insect in entomology and bug as a failure or defect in a computer program in computer science. Space, time, individual goals, needs, competences, beliefs, culture, opinions and personal experience also play an important role in characterizing the meaning of a word. Diversity is an intrinsic property of the world and as such cannot be avoided. At the same time, diversity is a local maximum since it aims at minimizing the effort and maximizing the gain [35].
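The dependence of word meaning on the domain can be illustrated with a minimal sketch. The lexicon below is a hypothetical toy structure, not the knowledge base described in this paper; it only encodes the bug example given above, and the names `SENSES` and `senses_of` are introduced purely for illustration.

```python
# Hypothetical toy lexicon: word -> {domain -> sense gloss}.
# The entries mirror the "bug" example in the text; nothing else is assumed.
SENSES = {
    "bug": {
        "entomology": "insect",
        "computer science": "failure or defect in a computer program",
    },
}


def senses_of(word, domain=None):
    """Return the candidate senses of `word`.

    Without a domain, every known sense is returned (the word is ambiguous).
    With a domain, only the sense used in that domain is returned, if any.
    """
    by_domain = SENSES.get(word.lower(), {})
    if domain is None:
        return sorted(by_domain.values())
    sense = by_domain.get(domain)
    return [sense] if sense is not None else []


if __name__ == "__main__":
    print(senses_of("bug"))                # ambiguous: two senses
    print(senses_of("bug", "entomology"))  # disambiguated by the domain
```

Fixing the domain reduces the candidate senses from two to one, which is the effect that making the local semantics explicit is meant to achieve.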

Our approach is to take into account this diversity and exploit it to make explicit the local semantics, i.e. the meaning of words in a certain context, such that information becomes unambiguous to humans as well as to machines. Towards this goal a preliminary step is the creation of a diversity-aware knowledge base. This requires appropriate methodologies for its representation, construction and maintenance. With this purpose, we propose and adapt the faceted approach, a well-established methodology used in library science for the organization of knowledge in libraries [21]. In this paper, we describe the fundamental notions of domain and its components, called facets, which allow capturing diversity and, at the same time, allow for an incremental growth of the knowledge base.

The rest of the paper is organized as follows. In Section 2, we explain the main steps of our approach by taking semantic matching as an example. Semantic matching has been chosen because of its intrinsic importance, witnessed by the large amount of research and publications in this area, and also because it was the main motivation which originally led us to the problem of managing diversity. In Section 3, we provide the definitions of domain and facet, present the corresponding data model and describe their fundamental properties. In Section 4, we provide our definition of context and explain how to build and use it at run-time by selecting from the background knowledge the language and knowledge of the domains which are relevant to the problem. In Section 5, we briefly describe the diversity-aware knowledge base that we have been developing. Section 6 focuses on the related work in terms of the notion of context (Section 6.1), methodologies for the construction and maintenance of domain knowledge (Section 6.2), and existing knowledge bases and the approaches followed for their construction (Section 6.3). Section 7 concludes the paper by summarizing the work done, listing the open problems and outlining the future work.

Diversity-aware semantic matching

Consider the example in Fig. 1. It shows two very simple classifications that might, for instance, have been created by two different people. Round nodes represent categories, while rectangles represent annotated documents. Solid arrows between nodes represent sub-category relations, while dashed arrows denote that a document is classified under a certain category. Each node carries its corresponding label. Initially, we do not know the circumstance in which they were

Domains and facets

The methodology we propose for the construction of domain knowledge is mainly inspired by the faceted approach, a well-established technique introduced by the Indian librarian Ranganathan [21] at the beginning of the last century. It has been used profitably in library science for building classificatory structures from atomic concepts, which are analyzed into macro-categories and combined by applying what is called, in the field's jargon, the system syntax [47]. The methodology is centered on the

Building the context

Following [35], we define a context as a 4-tuple $ctx = \langle id, L_c, K_c, IA \rangle$, where:

  • $id$ is an identifier for the context

  • $L_c$ is the local (formal) language

  • $K_c$ is the local knowledge

  • $IA$ is a set of implicit assumptions.
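The 4-tuple above can be sketched as a small data structure. This is a minimal illustration under stated assumptions, not the data model of the paper: the `Domain` and `Context` classes, the `build_context` function and its `min_overlap` parameter are all hypothetical. Implicit assumptions are modelled simply as the names of the selected domains, the local language as the union of their terms, and the local knowledge as the union of their (narrower, broader) term pairs. The selection rule, which keeps a domain when it shares enough terms with the input labels, is a placeholder and not the authors' algorithm.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Tuple

Axiom = Tuple[str, str]  # hypothetical encoding: (narrower term, broader term)


@dataclass(frozen=True)
class Domain:
    """A hypothetical slice of the background knowledge."""
    name: str
    language: FrozenSet[str]
    knowledge: FrozenSet[Axiom]


@dataclass(frozen=True)
class Context:
    """ctx = <id, Lc, Kc, IA> as defined in the text."""
    id: str
    language: FrozenSet[str]      # Lc: local (formal) language
    knowledge: FrozenSet[Axiom]   # Kc: local knowledge
    assumptions: FrozenSet[str]   # IA: here, names of the selected domains


def build_context(ctx_id: str, labels: Iterable[str],
                  background: Iterable[Domain], min_overlap: int = 1) -> Context:
    """Combine the domains that share at least `min_overlap` terms with the labels.

    The overlap test is an illustrative placeholder for domain recognition.
    """
    words = {w.lower() for label in labels for w in label.split()}
    relevant = [d for d in background if len(words & d.language) >= min_overlap]
    language = frozenset().union(*(d.language for d in relevant))
    knowledge = frozenset().union(*(d.knowledge for d in relevant))
    return Context(ctx_id, language, knowledge,
                   frozenset(d.name for d in relevant))


# Toy background knowledge, invented for this example only.
BACKGROUND = [
    Domain("entomology",
           frozenset({"bug", "insect", "beetle"}),
           frozenset({("bug", "insect"), ("beetle", "insect")})),
    Domain("computer science",
           frozenset({"bug", "program", "software"}),
           frozenset({("bug", "defect")})),
    Domain("geography",
           frozenset({"river", "lake"}),
           frozenset({("lake", "body of water")})),
]

if __name__ == "__main__":
    ctx = build_context("ctx-1", ["Software", "Bug reports"], BACKGROUND,
                        min_overlap=2)
    print(sorted(ctx.assumptions))
    print(sorted(ctx.language))
```

With `min_overlap=2`, only the computer science domain is selected for the labels "Software" and "Bug reports", so the resulting context contains only that domain's language and knowledge. Reasoning then runs over this local slice rather than over the whole background knowledge.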

In the case of semantic matching, implicit assumptions consist of a selection of the domains which are relevant to understanding the meaning of the words in a certain framework. Our baseline algorithm for domain recognition consists of parsing node labels and documents in classifications, linking

Creating a diversity-aware knowledge base

We have been developing a framework and a diversity-aware knowledge base currently covering an initial set of domains necessary for the kinds of scenarios we need to serve, but – in the spirit of the proposed approach – extensible according to the local scope, purpose, language and personal experience.

The expressive power of the representation language of our background knowledge is that of propositional DL with only conjunctions, no negations and no disjunctions. The expressive power we

The notion of context

Based on two different approaches, the first formal theories on context were proposed by McCarthy [13] and Giunchiglia [3].

According to McCarthy, contexts are a way to partition knowledge into a limited set of locally true axioms with common assumptions. This set of axioms should be at the right level of abstraction thus excluding irrelevant details in order to simplify local reasoning as much as possible. This is known as the generality principle [12]. In this setting, it is always possible to

Conclusions and future work

In this paper, observing that the lack of background knowledge is one of the main obstacles to the success of semantics, we have stressed the need for a very large, virtually unbounded knowledge base able to capture the diversity of the world as well as to reduce the complexity of reasoning at run-time.

We have proposed the faceted approach, a well-established methodology centered on the fundamental notions of domain and facet and practiced with success in library science for

Acknowledgements

The research leading to these results has received funding from the European Community’s Seventh Framework Programme (FP7/2007–2013) under Grant agreement No. 231126 LivingKnowledge: LivingKnowledge – Facts, Opinions and Bias in Time. We want to express our gratitude to all the people working with us in the KnowDive group at University of Trento (http://disi.unitn.it/~knowdive/) for their contribution in the creation and population of the knowledge base and in particular to Ilya Zaihrayeu and

References (59)

  • C. Ghidini et al., Local model semantics or contextual reasoning = locality + compatibility, Artificial Intelligence, 2001
  • Arctic Climate Impact Assessment, Cambridge University Press, 2005, p....
  • B. Dutta, F. Giunchiglia, V. Maltese, A facet-based methodology for geo-spatial modelling, GEOS,...
  • F. Giunchiglia, Contextual reasoning, Epistemologica – Special Issue on I Linguaggi e le Macchine, 1993
  • L. Prusak, Knowledge in Organizations, in: M. Polanyi (Ed.), The tacit dimension, 1997 (chapter...
  • F. Giunchiglia, B. Dutta, V. Maltese, Faceted lightweight ontologies, in: Conceptual Modeling: Foundations and...
  • F. Giunchiglia et al., Encoding classifications into lightweight ontologies, Journal of Data Semantics, 2006
  • F. Giunchiglia et al., Discovering missing background knowledge in ontology matching, European Conference on Artificial Intelligence ECAI, 2006
  • B. Lauser, G. Johannsen, C. Caracciolo, J. Keizer, W.R. van Hage, P. Mayr, Comparing human and automatic thesaurus...
  • P. Shvaiko, J. Euzenat. Ten Challenges for Ontology Matching, 7th Int. Conference on Ontologies, Databases, and...
  • Z. Aleksovski, W. ten Kate, F. van Harmelen, Using multiple ontologies as background knowledge in ontology matching,...
  • J. McCarthy, Generality in artificial intelligence, Communications of the ACM, 1987
  • J. McCarthy, Notes on formalizing context, in: Bajcsy, R. (Ed.), Thirteenth International Joint Conference on...
  • R. Guha, D. Lenat, Context dependence of representations in cyc, Colloque ICO,...
  • P. Bouquet et al., Theories and uses of context in knowledge representation and reasoning, Journal of Pragmatics, 2003
  • P. Shvaiko et al., Ontology Matching, 2007
  • F. Giunchiglia, M. Yatskevich, E. Giunchiglia, Efficient semantic matching, European Semantic Web Conference ESWC,...
  • F. Giunchiglia, M. Yatskevich, P. Shvaiko, Semantic Matching: algorithms and implementation, Journal on Data Semantics...
  • V. Maltese et al., Save up to 99% of your time in mapping validation, 9th International Conference on Ontologies, 2010
  • A. Autayeu et al., Lightweight parsing of classifications into lightweight ontologies, ECDL, 2010
  • S.R. Ranganathan, Prolegomena to library classification, Asia Publishing House,...
  • N.E. Fuchs et al., Attempto Controlled English meets the challenges of knowledge representation, reasoning, interoperability and user interfaces, FLAIRS Conference, 2006
  • R. Schwitter, M. Tilbrook, Lets talk in description logic via controlled natural language, LENLS,...
  • I. Zaihrayeu, L. Sun, F. Giunchiglia, W. Pan, Q. Ju, M. Chi, X. Huang, From web directories to ontologies: natural...
  • F. Giunchiglia, I. Zaihrayeu, U. Kharkevich, Formalizing the get-specific document classification algorithm, 11th...
  • F. Giunchiglia, M. Marchese, I. Zaihrayeu. Encoding classifications into lightweight ontologies. European Semantic Web...
  • P. Bouquet, L. Serafini, S. Zanobini, Semantic coordination: a new approach and an application, 2nd International...
  • B. Magnini, L. Serafini, M. Speranza, Making explicit the semantics hidden in schema models, Workshop on Human Language...
  • F. Giunchiglia, U. Kharkevich, I. Zaihrayeu, Concept search, European Semantic Web Conference ESWC,...