Authors:
André Pomp 1; Lucian Poth 2; Vadim Kraus 1 and Tobias Meisen 3
Affiliations:
1 Institute of Information Management in Mechanical Engineering, RWTH Aachen University, Aachen, Germany; 2 Computer Science, RWTH Aachen University, Aachen, Germany; 3 Chair of Technologies and Management of Digital Transformation, University of Wuppertal, Wuppertal, Germany
Keyword(s):
Semantic Model, Knowledge Graph, Ontologies, Semantic Similarity, Machine Learning.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Artificial Intelligence and Decision Support Systems; Biomedical Engineering; Cloud Computing; Coupling and Integrating Heterogeneous Data Sources; Data Engineering; Databases and Information Systems Integration; Enterprise Information Systems; Health Information Systems; Industrial Applications of Artificial Intelligence; Information Systems Analysis and Specification; Knowledge Engineering and Ontology Development; Knowledge Management; Knowledge-Based Systems; Ontologies and the Semantic Web; Ontology Engineering; Semantic Web Technologies; Services Science; Society, e-Business and e-Government; Software Agents and Internet Computing; Symbolic Systems; Web Information Systems and Technologies
Abstract:
Due to the digitalization of many processes in companies and the increasing networking of devices, the number of data sources and corresponding data sets is growing steadily. To make these data sets accessible, searchable and understandable, recent approaches focus on the creation of semantic models by domain experts, which enable the annotation of the available data attributes with meaningful semantic concepts from knowledge graphs. To simplify the annotation process, recommendation engines based on the data attribute labels can support the domain expert. However, as soon as the labels are incomprehensible, cryptic or ambiguous, the domain expert no longer receives any support. In this paper, we propose a semantic concept recommendation for data attributes based on the data values rather than on the label. To this end, we extend knowledge graphs with data instances in order to learn different dedicated data representations. Using different approaches, such as machine learning, rules or statistical methods, enables us to recommend semantic concepts based on the content of data points rather than on the labels. Our evaluation with publicly available data sets shows that the accuracy improves when using our flexible and dedicated classification approach. Further, we present shortcomings and extension points that we derived from the analysis of our evaluation.
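The idea of recommending concepts from data values rather than labels can be illustrated with a minimal sketch. It is not the authors' implementation: the concept names (`EmailAddress`, `PostalCode`, `BodyTemperatureCelsius`), the helper functions and the example values are all hypothetical, and the sketch only assumes that each concept may carry its own dedicated scorer (here a rule and a simple statistical profile).

```python
import re
import statistics
from typing import Callable, Dict, List, Tuple

# A scorer maps the raw string values of one data attribute to a confidence in [0, 1].
Scorer = Callable[[List[str]], float]


def regex_rule(pattern: str) -> Scorer:
    """Rule-based scorer: fraction of values that fully match the pattern."""
    compiled = re.compile(pattern)

    def score(values: List[str]) -> float:
        if not values:
            return 0.0
        return sum(1 for v in values if compiled.fullmatch(v)) / len(values)

    return score


def numeric_profile(examples: List[float], tolerance: float = 3.0) -> Scorer:
    """Statistical scorer: fraction of values within `tolerance` standard
    deviations of the mean of previously seen example instances."""
    mean = statistics.mean(examples)
    std = statistics.pstdev(examples) or 1.0  # avoid a zero-width band

    def score(values: List[str]) -> float:
        if not values:
            return 0.0
        hits = 0
        for v in values:
            try:
                number = float(v)
            except ValueError:
                continue  # non-numeric values never count as hits
            if abs(number - mean) <= tolerance * std:
                hits += 1
        return hits / len(values)

    return score


def recommend(values: List[str], scorers: Dict[str, Scorer],
              top_k: int = 3) -> List[Tuple[str, float]]:
    """Rank concepts by how well their dedicated scorer fits the values."""
    ranked = sorted(((concept, scorer(values)) for concept, scorer in scorers.items()),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]


# Hypothetical concepts, each with its own dedicated scorer.
SCORERS: Dict[str, Scorer] = {
    "EmailAddress": regex_rule(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "PostalCode": regex_rule(r"\d{5}"),
    "BodyTemperatureCelsius": numeric_profile([36.4, 36.8, 37.0, 37.2, 36.6]),
}

if __name__ == "__main__":
    # The attribute label (e.g. "col_17") is never consulted, only the values.
    print(recommend(["52062", "42119", "10115"], SCORERS))
    print(recommend(["36.5", "37.1", "36.9"], SCORERS))
```

Keeping one scorer per concept mirrors the abstract's point that different concepts may be best recognized by different methods; a learned classifier could be plugged in behind the same `Scorer` signature.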