Defining a Knowledge Graph Development Process Through a Systematic Review

Authors:
Gytė Tamašauskaitė, University of Amsterdam, Amsterdam, Netherlands (ORCID: 0000-0001-9033-4976)
Paul Groth, University of Amsterdam, Amsterdam, Netherlands (ORCID: 0000-0003-0183-6910)
ACM Transactions on Software Engineering and Methodology, Volume 32, Issue 1, Article No. 27, pp. 1–40. https://doi.org/10.1145/3522586

Published: 13 February 2023

Abstract

Knowledge graphs are widely used in industry and studied within the academic community. However, the models applied in the development of knowledge graphs vary. Analysing and providing a synthesis of the commonly used approaches to knowledge graph development would provide researchers and practitioners a better understanding of the overall process and methods involved. Hence, this article aims at defining the overall process of knowledge graph development and its key constituent steps. For this purpose, a systematic review and a conceptual analysis of the literature was conducted. The resulting process was compared to case studies to evaluate its applicability. The proposed process suggests a unified approach and provides guidance for both researchers and practitioners when constructing and managing knowledge graphs.

1 INTRODUCTION

Knowledge graphs, i.e., graph-structured knowledge bases [57], are widely employed to represent structured knowledge and perform a variety of AI-driven tasks in the context of diverse, dynamic, and large-scale data [32, 87]. Given this increasing adoption, there is a need for guidance on knowledge graph development that would assist researchers, developers, and engineers in the process of creating and maintaining knowledge graphs [9]. While there are descriptions of methods for knowledge graph development [37, 80] that outline the necessary steps to take in order to develop a knowledge graph, these methods vary per article and there is a lack of a global view of the development of these software artifacts.

While generally applicable development processes exist in such areas as software development [3], ontology construction [26], and knowledge engineering [64], it is unclear to what extent these existing theories can be directly applied to knowledge graph development, due to the complex combination of data and software used for their construction. Indeed, from a software engineering perspective, knowledge graphs provide a fascinating area for study given their inherent combination of software, data, and often human components.

Thus, considering the growth of knowledge graphs and the lack of a global process view of their development, this article focuses on formulating key process steps when managing the construction and maintenance of knowledge graphs. Specifically, this article contributes a synthesis of common steps in knowledge graph development described in the academic literature. The aim is to provide guidance for both academia and industry in planning and managing the process of knowledge graph development. Moreover, we hope this analysis can provide a better understanding of how other development lifecycles can be applied to knowledge graphs.

This article is structured as follows: Section 2 covers related work in the area of knowledge graphs. Then, the methodology behind the systematic review is presented in Section 3. This is followed by the results of the review, in Section 4, which describe the proposed knowledge graph development process, its steps, and how they interrelate. The process is assessed by mapping the proposed steps to the case studies in Section 5. Finally, Section 6 discusses the strengths and limitations of the research. Section 7 outlines the main findings and future work.

2 RELATED WORK

This section presents knowledge graphs, trends in their development and development practices more broadly.

2.1 Knowledge Graphs

The term “knowledge graph” was first used in 1972; however, it became widely adopted after 2012, following the announcement of the Google Knowledge Graph [1, 29]. This event also led to the growth of the development and use of knowledge graphs in industry [27, 32, 58].

The term “knowledge graph” can be defined as “a graph of data intended to accumulate and convey knowledge of the real world, whose nodes represent entities of interest and whose edges represent relations between these entities” [32]. Thus, knowledge graphs are structured to represent facts that cover entities, relations, and semantic descriptions [37]. Formally, a knowledge graph can be defined as a directed graph \(G = (V, E)\) [2], where V is the set of vertices (nodes) that represent real-world entities and E is the set of edges (links) that represent the relations between those entities. Commonly, entities and their relations are presented as triples (subject, predicate, and object) [2] and in graph form (see Figure 1).

Fig. 1. Entities and relations in a knowledge graph [37].
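To make the formal definition concrete, the following minimal Python sketch builds the vertex set V and the set of labelled directed edges E from a list of triples. The example facts, the predicate names, and the function name `build_graph` are illustrative assumptions and do not come from the reviewed articles.

```python
from collections import defaultdict

# Hypothetical example facts as (subject, predicate, object) triples.
triples = [
    ("Amsterdam", "capital_of", "Netherlands"),
    ("Amsterdam", "located_in", "Europe"),
    ("Netherlands", "located_in", "Europe"),
]


def build_graph(triples):
    """Return (V, E, adjacency) for a list of (s, p, o) triples."""
    vertices = set()
    edges = set()
    adjacency = defaultdict(list)
    for s, p, o in triples:
        vertices.update((s, o))       # subject and object are both nodes
        edges.add((s, p, o))          # a directed edge s -> o labelled p
        adjacency[s].append((p, o))   # outgoing edges per node
    return vertices, edges, dict(adjacency)


V, E, adj = build_graph(triples)
print(sorted(V))
print(adj["Amsterdam"])
```

Here each triple is read as one labelled edge from the subject node to the object node, which is one common way to relate the triple view to the graph view.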

Knowledge graphs are used for multiple tasks, including search and querying (e.g., Google, Bing), serving as a semantic database (e.g., Wikidata), and big data analytics (e.g., Walmart) [87]. In practice, the literature distinguishes between two types of knowledge graphs: generic knowledge graphs and domain-specific knowledge graphs [2]. The first type provides access to multiple domains, commonly with encyclopedic content, e.g., Wikidata [71], YAGO [68], and DBpedia [7, 44]. The second type is focused on a narrower domain, often for a specific problem or industry [2]. In this article, both types are included in the analysis to ensure a broad overview of the field.

2.2 Trends in Knowledge Graph Development

Knowledge graph development is commonly categorised into two types: top-down or bottom-up [2, 23, 45, 92]. In the top-down approach, the ontology (or data schema) is defined first and knowledge is then extracted based on that ontology [45]. In the bottom-up approach, knowledge is first extracted from data and the ontology of the knowledge graph is then defined based on that data [45].

Current research presents multiple instances of how knowledge graphs can be developed [37, 80, 87, 88]. However, it commonly focuses on state-of-the-art techniques (e.g., machine-learning and other advanced algorithms) that can be used in the development of knowledge graphs rather than the overall process of knowledge graph development.

For example, the techniques discussed in one study [87] include data extraction from various sources, harvesting relations between entities, building rules and inference, as well as storage and management of the knowledge graph. In another study [80], the techniques are grouped differently: knowledge integration, entity discovery and typing, entity canonicalisation, construction of attributes and relationships, open schema construction, and knowledge base curation. Yet another study [88] focuses on the techniques of structured knowledge extraction, classification and non-classification relationship extraction, and graph optimisation. Thus, both the approaches to knowledge graph development and the vocabularies used to describe it differ across studies.

Therefore, this article focuses on reviewing different knowledge graph development processes presented in the literature. It contributes to the field by providing a summary of how knowledge graphs are being constructed as well as providing a synthesized description of the process.

2.3 Applicability of Existing Development Processes

Similar processes of development are described in other areas of computer science, for example, in software engineering or ontology construction.

In software engineering, there are several development life cycles, e.g., waterfall, V-model, incremental, iterative, and spiral [3]. While, in general, these life cycles could be applied when developing a knowledge graph, it is not known to what extent they cover the specific requirements of knowledge graph development.

In ontology construction, there are also several approaches, such as the Cyc method, Uschold and King’s method, Grüninger and Fox’s methodology, the KACTUS approach, METHONTOLOGY, and others [26]. Ontologies and knowledge graphs have similarities; however, ontologies primarily focus on capturing the knowledge models (i.e., data models), while knowledge graphs primarily focus on capturing the large amounts of data itself [63]. Additionally, ontology construction is commonly seen as one of the steps in knowledge graph development [23, 92]. Thus, it is not apparent whether ontology construction methodologies are fully suitable for knowledge graph development.

While it is useful to understand these existing approaches, it is also beneficial to take into account the specificity of knowledge graph development. Understanding how knowledge graphs are developed allows for better insights into how these existing approaches can be applied.

3 METHODOLOGY

To understand the overall process of knowledge graph development, we conducted a systematic review of the literature in which the key process steps were identified, described, and integrated. To evaluate the applicability of the process, we compared it to real-world case studies. The methodology was designed based on the principles for systematic reviews in software engineering [43] and the main phases of the conceptual framework analysis [35].

The following sections present the details of how the data collection and data analysis were conducted as well as the evaluation approach.

3.1 Data Collection

As a basis for this article, relevant and recent research articles were collected and analysed. The overall flow of selecting articles for the systematic review is presented in Figure 2 as a PRISMA workflow [60]. The data collection and screening were performed by a single author, with checks of the protocol conducted by the other author.

Fig. 2. The PRISMA workflow of selecting articles for the systematic review.

Data sources. Articles were collected from eight well-established online data sources for academic research (ACM Digital Library, IEEE Xplore, ScienceDirect, arXiv, SpringerLink, Zeta Alpha-AI Research Navigator, Semantic Scholar, and Google Scholar) between March and April 2021. The majority of these sources are specifically recommended for software engineering reviews [43].

Inclusion and exclusion criteria. For the search, two keywords were used: knowledge graph development and knowledge graph construction. Only articles from 2012 onward were considered, as the growth of the topic started in 2012 with the announcement of the Google Knowledge Graph [32]. Only articles in English were reviewed. For each source, the 50 most relevant articles (as determined by the data source’s ranking) were screened. This threshold was set to prioritise the review, given the large number of identified articles and the decreasing relevancy of search results [59].

First, the title and abstract were screened to determine whether the article covers knowledge graph development. Then, the content of the article was skimmed to assess whether it covers explicit process steps. If the article met these criteria, it was added to the reference management system for further analysis.

Considering that the articles were chosen from credible sources and that the articles focused on knowledge graphs as a result, rather than reflecting on their development process, the evaluation of the experimental results of the articles was not performed.

Search outcome. Overall, 57 articles published between 2016 and 2021¹ were selected for the analysis (the full list is in Appendix A). Given the focused time period, this selection is intended to cover the totality of relevant articles. The distribution of the year of publication is presented in Table 1.

Table 1. Summary Details of the Selected Articles

Category	Count of articles
Year of publication
2016	1
2017	5
2018	10
2019	11
2020	24
2021	6
Type of article
Domain-specific	47
Methodological	10
Type of KG development
Bottom-up	41
Top-down	16

The majority of the articles covered the development of domain-specific knowledge graphs (Table 1). These articles focus on presenting knowledge graphs built for a specific purpose and the techniques used for their development. The other type of article was categorised as methodological, presenting a more theoretical overview of knowledge graphs and development methods.

Furthermore, the majority of articles covered the bottom-up knowledge graph development approach (Table 1). Although the majority of articles do not indicate the type of development approach used, the distinction was made by determining whether or not ontology development was done as the first step of knowledge graph development.

After having selected the articles, the required data was extracted from the articles and analysed in multiple iterations. As a first iteration, the type of the article, the type of the knowledge graph development process, and the process itself were written down. Then, an extensive list of the process’ steps was compiled.

The processes were of different granularity, some including the algorithms and techniques used in the knowledge graph construction as steps, while others only indicated the main phases. The process steps were written out at three different levels, specifying the more generic steps and what they consist of (see Figure 3). Level I steps provide a more generic description of the step, Level II steps break Level I steps down into smaller stages, and Level III steps are specific and focus on describing the algorithms and techniques used.

Fig. 3. Example of recorded process steps from [87].

3.2 Data Analysis

The overall data analysis workflow is presented in Figure 4, which describes how the steps of knowledge graph development were extracted and processed.

Initially, a total of 620 steps of all levels were recorded, of which 519 were unique. However, some steps were synonymous with each other; thus, the list was manually amended by changing similar tasks to the same expressions, e.g., relationship extraction was changed to relation extraction, and data, data input, and similar were changed to data source. After adjusting the synonyms, there were 414 unique values in the final list, of which 182 were of Level I, 196 of Level II, and 60 of Level III. Level III steps were specific, indicating the algorithms and techniques used, and were thus not considered in further analysis. The full list of process steps is available in a dataset repository.²

The frequency of each step was counted to determine the most common steps in the knowledge graph development. This was used as guidance in formulating the general process steps. Additionally, the process figures were extracted from each article, which allowed analysis of how the process is presented visually (Appendix C and dataset repository).
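The synonym adjustment and frequency count can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure actually used in the review (which was manual); the synonym map below contains just the two examples named in the text, and the recorded steps are made up.

```python
from collections import Counter

# Hypothetical synonym map; the entries mirror the examples given in the text.
SYNONYMS = {
    "relationship extraction": "relation extraction",
    "data": "data source",
    "data input": "data source",
}


def normalise(step):
    """Lower-case a recorded step and replace it by its canonical expression."""
    key = step.strip().lower()
    return SYNONYMS.get(key, key)


def step_frequencies(recorded_steps):
    """Count how often each canonical step occurs."""
    return Counter(normalise(step) for step in recorded_steps)


# Made-up recorded steps from several (imaginary) articles.
recorded = [
    "Relationship extraction",
    "relation extraction",
    "data input",
    "Data",
    "entity extraction",
]
freq = step_frequencies(recorded)
print(freq.most_common())
```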

Using information about frequent steps and the visually presented processes, the first process draft of the knowledge graph development was prepared. Then, having these steps, each article was reviewed again in order to record the relevant data per each indicated process step.

Finally, using the described steps of the knowledge graph development process and the visual representations of the processes, the final proposed process was developed and is described in more detail in the following section.

3.3 Evaluation through Case Studies

In order to evaluate the applicability and generalisability of the proposed knowledge graph development process, a comparison to case studies was carried out. The proposed process was compared and mapped to real-life knowledge graphs and to how they are constructed and maintained. The evaluation covers two types of knowledge graphs: generic open knowledge graphs and domain-specific knowledge graphs. As a result, this evaluation provided insights into the extent to which the proposed process is suitable for and relevant to real-life examples, as well as possible areas for future work with respect to development lifecycles.

4 RESULTS

The knowledge graph development process based on the review and analysis of the selected articles is presented in Figure 5. The process consists of six main steps: (i) Identify data, (ii) Construct the knowledge graph ontology, (iii) Extract knowledge, (iv) Process knowledge, (v) Construct the knowledge graph, and (vi) Maintain the knowledge graph. The process incorporates both top-down and bottom-up approaches. Each step and its sub-steps are described in the following sections.

Fig. 5. The proposed knowledge graph development process.

4.1 Identify Data

The objective of this step is to identify a domain of interest, a data source, and a way of data acquisition. As mentioned before, knowledge graphs can either be generic or domain-specific [2, 32]. Usually, generic knowledge graphs cover multiple domains and are publicly available, while domain-specific knowledge graphs target a specific domain or problem and are commonly used in organisations for their operations. Defining the domain of the knowledge graph makes it easier to identify data sources and to determine how data can be extracted later [87]. The domain can be as broad or narrow as needed, e.g., education [4, 6, 13, 16, 18, 67, 69, 90], healthcare [34, 47, 52, 78], social media [48, 66], and so on.

Having chosen the domain, it is important to identify the data sources, as they influence the overall knowledge graph development process as well as the choice of knowledge extraction techniques. In general, data can be structured, semi-structured, or unstructured and can be extracted from multiple sources. Structured data has an explicit structure, e.g., data in tables or relational databases [82]. Semi-structured data has a certain structure, but it is not strict, e.g., XML data [82]. Unstructured data does not have a predefined structure, e.g., text [82]. For instance, data can be acquired from an online encyclopedia such as Wikipedia (e.g., [28, 83]), a structured database (e.g., [16]), semi-structured documents (e.g., [90]), unstructured text (e.g., [21]), or a mix of several data sources (e.g., [86]).
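The practical difference between the three kinds of data can be illustrated with a small Python sketch that reads the same made-up fact from a table, an XML fragment, and a sentence. All sample data and field names are assumptions for illustration; only standard-library parsers are used.

```python
import csv
import io
import xml.etree.ElementTree as ET

# The same hypothetical fact in three forms.
structured = "course,teacher\nDatabases,Alice\n"
semi_structured = (
    "<courses><course name='Databases'><teacher>Alice</teacher></course></courses>"
)
unstructured = "Alice teaches the Databases course."

# Structured: columns are explicit, so each row maps directly to fields.
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: the nesting must be navigated, but tags still label values.
root = ET.fromstring(semi_structured)
elements = [(c.get("name"), c.findtext("teacher")) for c in root.iter("course")]

# Unstructured: only tokens are available; entities and relations
# still have to be recognised by a later extraction step.
tokens = unstructured.rstrip(".").split()

print(rows, elements, tokens)
```

The first two forms yield labelled values almost directly, whereas the text form only yields tokens, which is why unstructured sources need the more involved extraction techniques discussed in Section 4.3.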

Finally, the data acquisition methods are chosen based on the type of data and data source. Web resources can be acquired using web crawlers (e.g., [14]), databases can be harvested using data mining techniques (e.g., [85]), and files can be downloaded or accessed directly (e.g., [70]). A suitable method should be chosen considering what data are needed for constructing a knowledge graph.

As the result of this step, the data required for knowledge graph development is acquired and prepared for the knowledge extraction.

4.2 Construct the Knowledge Graph Ontology

The objective of this step is to construct the knowledge graph ontology that provides a top-level structure for the knowledge graph. This step is needed when the top-down approach is used. The top-down approach is usually used either when (i) there is already a clear domain ontology (e.g., medical classification in a healthcare domain [52]) that can be used as a basis for the knowledge graph ontology, or (ii) there is structured data that provides a framework for the ontology to be constructed (e.g., a course syllabus structure in an education domain [6]). Constructing the knowledge graph ontology provides predefined types of entities and relations between them. As a basis for ontology construction, common ontologies such as FOAF [11], Geonames [81], or others relevant for the domain, as well as common ontology languages such as RDF(S) [73], OWL [72], and XML [74], can be reused.

Ontologies can be constructed manually or automatically. Domain experts can manually develop the ontology, but it is labour intensive. Additionally, it may be complicated to find relevant experts if the domain is narrow [45]. The automatic approach is driven by data and is described in Step 4.2 (see Section 4.4.2).

4.3 Extract Knowledge

Having acquired the data, the next step is to extract knowledge from it. The objective of this step is to extract entities, relations between them and attributes. There are a number of methods to apply for knowledge extraction, and for different types of data, different techniques are needed. Knowledge extraction from semi-structured and unstructured data requires more effort and more complex techniques, while for structured data, entities and relationships are identified more easily.

4.3.1 Extract Entities.

Entity extraction is aimed at discovering and detecting entities in a wide range of data. The objective of this step is both to discover multiple entities for a given type and to identify more informative types for a certain entity [80]. One of the most frequently applied methods is named-entity recognition (NER), which focuses on the discovery of entities and their classification into predefined categories or types [14, 34, 36, 38, 40, 47, 48, 51, 53, 69, 84, 86, 87, 92]. Other methods include dictionary-based or pattern-based discovery, sequence labelling, word and entity embeddings, and so on [80].

The quality of extracted entities highly affects the efficiency and quality of the subsequent knowledge extraction tasks (relation and attribute extraction). Thus, it is a crucial step in knowledge graph development [92].
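As an illustration of the simplest of these techniques, dictionary-based discovery, the sketch below looks up surface forms from a small gazetteer in a text and returns each hit with its predefined type. The gazetteer entries, the sentence, and the function name `extract_entities` are hypothetical; a real system would more likely rely on a trained NER model.

```python
import re

# Hypothetical gazetteer mapping surface forms to entity types.
GAZETTEER = {
    "University of Amsterdam": "Organisation",
    "Amsterdam": "Location",
    "Alice": "Person",
}


def extract_entities(text, gazetteer=GAZETTEER):
    """Return (surface form, type, start offset) for every gazetteer hit.

    Longer names are listed first in the pattern so that
    'University of Amsterdam' is preferred over the nested 'Amsterdam'.
    """
    names = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    return [
        (m.group(1), gazetteer[m.group(1)], m.start())
        for m in pattern.finditer(text)
    ]


text = "Alice studies at the University of Amsterdam in Amsterdam."
entities = extract_entities(text)
for surface, entity_type, offset in entities:
    print(offset, surface, entity_type)
```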

4.3.2 Extract Relations.

Once extracted, the entities are isolated and not linked together; therefore, it is necessary to extract relations among the entities as well [38]. This step also depends on the type of data. For structured data, relations are explicit and easily identifiable. For semi-structured data, pattern-based and rule-based approaches can be used as well as other machine learning techniques [80, 87]. In the case of unstructured, textual data, relation extraction requires interpreting semantic information, where natural language processing (NLP) methods are commonly used [14, 38, 84], such as semantic role labeling [21, 54, 66, 87] or neural information extraction [80, 87]. Other examples of relation extraction methods include Open Information Extraction (OIE), bootstrapping and distant supervision for automatic labelling, methods based on frame semantics, such as FrameNet [32], kernel methods, and word embeddings [87]. If an ontology is available (as defined in Step 2, Section 4.2), then the relations between extracted entities can be assigned based on the ones defined in the ontology [78].

As a result of this step, the extracted entities and relations can be combined into the triples that are used in the knowledge graph.
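A pattern-based approach to relation extraction can be sketched as follows. Each regular expression is paired with one predicate, and every match yields a (subject, predicate, object) triple. The patterns, predicate names, and sample text are assumptions chosen for illustration; they are far simpler than the NLP methods cited above.

```python
import re

# Hypothetical patterns: each regular expression maps to one predicate name.
PATTERNS = [
    (re.compile(r"(?P<s>[A-Z]\w+) teaches (?P<o>[A-Z]\w+)"), "teaches"),
    (re.compile(r"(?P<s>[A-Z]\w+) is located in (?P<o>[A-Z]\w+)"), "located_in"),
]


def extract_triples(text):
    """Return (subject, predicate, object) triples for every pattern match."""
    triples = []
    for pattern, predicate in PATTERNS:
        for m in pattern.finditer(text):
            triples.append((m.group("s"), predicate, m.group("o")))
    return triples


text = "Alice teaches Databases. Amsterdam is located in Europe."
extracted = extract_triples(text)
print(extracted)
```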

4.3.3 Extract Attributes.

Attribute extraction refers to acquiring and aggregating the information about a specific entity [48, 51]. In some cases, attribute extraction is seen as the discovery of special types of relations [48] between entities. Nevertheless, the main objective of this step is to describe the entity more clearly [92].

For attribute extraction, methods similar to those used for relation extraction can be applied, e.g., semantic role labeling [54] or machine learning techniques [80]. In some cases, the type of attribute can be predefined before extracting or gathering the data, e.g., attributes for road signs are colour, shape, and so on [42].

4.4 Process Knowledge

The next step in the process is the processing of knowledge. The objective of this step is to ensure that the knowledge extracted is of high quality. The unprocessed extracted entities, relations, and attributes may be ambiguous, redundant, or incomplete. Furthermore, knowledge from different sources has to be aligned. Therefore, it is necessary to integrate the knowledge, map it to an ontology, and complete missing values before constructing the knowledge graph.

4.4.1 Integrate Knowledge.

Knowledge integration, also known as knowledge fusion, refers to integrating knowledge from different sources and cleaning it to eliminate redundancy, contradiction, and ambiguity [45, 48, 62].

First of all, all knowledge should be cleaned by removing unnecessary signs, stop words, and other noise, if there is any [19]. This improves the overall quality of knowledge and prepares the data for entity resolution.

In order to remove duplicates and eliminate ambiguity, it is necessary to perform entity resolution [40, 92], which is also referred to as entity alignment [51, 84, 92], entity canonicalisation [80], and entity matching [38, 92]. The objective of this task is to evaluate whether different entities refer to the same real-world object and, if so, link them in the knowledge graph. Furthermore, all entities should be linked to unique identifiers (such as URIs or IRIs) that allow the definition of custom namespaces [55].

Entity resolution involves two tasks: blocking, which clusters similar entities into blocks, and similarity, which evaluates whether there are duplicates within a block [40]. A variety of methods can be applied to each task, including traditional blocking, sorted neighbourhoods, and canopies for blocking, and machine learning methods, such as feature vector computation and others, for similarity [40].
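The two tasks can be illustrated with a minimal Python sketch. Blocking here uses a deliberately crude key (the first letter of the name), and similarity uses the string ratio from the standard library's `difflib.SequenceMatcher` rather than a learned model. The key function, the threshold of 0.8, and the sample names are all assumptions for illustration.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def block_key(name):
    """Cheap blocking key: the first letter of the name, lower-cased."""
    return name.strip().lower()[:1]


def similarity(a, b):
    """String similarity in [0, 1] based on matching character blocks."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def resolve(names, threshold=0.8):
    """Return pairs of names judged to refer to the same entity."""
    blocks = defaultdict(list)
    for name in names:
        blocks[block_key(name)].append(name)
    matches = []
    # Only names inside the same block are compared, which avoids
    # comparing every name with every other name.
    for members in blocks.values():
        for a, b in combinations(members, 2):
            if similarity(a, b) >= threshold:
                matches.append((a, b))
    return matches


names = [
    "Univ. of Amsterdam",
    "University of Amsterdam",
    "Utrecht University",
    "Amsterdam",
]
duplicates = resolve(names)
print(duplicates)
```

A first-letter key keeps the example short but would miss duplicates that start with different letters; the blocking methods named above address exactly this trade-off between the number of comparisons and recall.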

Relations can also be semantically similar, but syntactically different; thus, it is also necessary to merge similar relations and only keep the main ones (e.g., exploit, use, and adopt are similar [19]).

4.4.2 Construct Ontology or Map to it.

If the ontology was not constructed in Step 2 (Section 4.2), then it is recommended to develop it after having integrated knowledge. The ontology in Step 2 defines the structure of the knowledge graph before extracting knowledge, whereas, in this step, the structure of the knowledge graph is defined based on the extracted knowledge.
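In the bottom-up case, a first draft of the structure can be derived by counting which combinations of subject type, predicate, and object type actually occur in the extracted knowledge. The sketch below shows this idea under the assumption that entity types are already known from the extraction step; the triples, types, and the function name `induce_schema` are hypothetical.

```python
from collections import Counter


def induce_schema(triples, entity_types):
    """Count (subject type, predicate, object type) signatures in the data."""
    return Counter(
        (entity_types[s], p, entity_types[o]) for s, p, o in triples
    )


# Hypothetical extracted knowledge with already assigned entity types.
triples = [
    ("Alice", "teaches", "Databases"),
    ("Bob", "teaches", "Logic"),
    ("Databases", "part_of", "ComputerScience"),
]
entity_types = {
    "Alice": "Person",
    "Bob": "Person",
    "Databases": "Course",
    "Logic": "Course",
    "ComputerScience": "Programme",
}

schema = induce_schema(triples, entity_types)
for signature, count in schema.most_common():
    print(signature, count)
```

Frequent signatures are candidates for relations in the ontology, while rare ones may point to extraction errors that deserve a manual check.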

The ontology of a knowledge graph allows creating a model of how the knowledge graph is represented in a structured way [23] and describes relations between concepts within a domain [33]. It also helps to evaluate the quality of the extracted data and how complete the knowledge is. While constructing the ontology, it is possible to analyse the knowledge graph and identify whether the use of domain knowledge is redundant [70] or predict incomplete ontological triples [36]. Moreover, the construction of the ontology should follow good practices of ontology development [10].

If the ontology was developed in Step 2 and additional knowledge was extracted in Step 3 (Section 4.3), then at this step mapping between the ontology and the extracted knowledge should be done. Thereby, the types of entities and relations should be aligned to the ones defined in the ontology [78]. Additionally, the previously developed ontology can be enriched based on the extracted knowledge [28]. Thus, the ontology of the knowledge graph should be continuously reviewed and updated.
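Such a mapping can be sketched as two lookups: extracted relation labels are first aligned to the predicates of the ontology, and the types of the subject and object are then checked against the signature the ontology defines for that predicate. The ontology, the alignment table, and the sample triples below are illustrative assumptions.

```python
# Hypothetical ontology: predicate -> (subject type, object type).
ONTOLOGY = {
    "teaches": ("Person", "Course"),
    "part_of": ("Course", "Programme"),
}

# Hypothetical alignment from extracted relation labels to ontology predicates.
RELATION_ALIGNMENT = {
    "lectures": "teaches",
    "teaches": "teaches",
    "belongs to": "part_of",
}


def map_to_ontology(triples, entity_types):
    """Split triples into those that fit the ontology and those that do not."""
    mapped, rejected = [], []
    for s, p, o in triples:
        predicate = RELATION_ALIGNMENT.get(p)
        signature = ONTOLOGY.get(predicate)
        if signature == (entity_types.get(s), entity_types.get(o)):
            mapped.append((s, predicate, o))
        else:
            # Unknown relation label or type mismatch: keep for review,
            # since it may also indicate that the ontology needs enriching.
            rejected.append((s, p, o))
    return mapped, rejected


triples = [
    ("Alice", "lectures", "Databases"),
    ("Databases", "belongs to", "Alice"),
    ("Alice", "supervises", "Bob"),
]
entity_types = {"Alice": "Person", "Databases": "Course", "Bob": "Person"}
mapped, rejected = map_to_ontology(triples, entity_types)
print(mapped)
print(rejected)
```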

4.4.3 Complete Knowledge.

The objective of this step is to complete and enrich the knowledge in the knowledge graph as well as to improve its overall quality. This includes performing reasoning and inference, validating the triples, and optimising the knowledge graph.

Knowledge reasoning and inference refers to developing and enriching the knowledge graph by establishing new relations among entities based on existing relations and discovering new knowledge from existing knowledge [48, 77]. In general, this can be done by logical inference that is based on the existing rules between relations, or through the use of machine learning (e.g., statistical relational learning or building embedding-based link predictors and node classifiers) [61, 87]. The latter notion also comes under the heading of knowledge graph refinement [61].
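Logical inference can be illustrated with a single rule applied until no new facts appear (forward chaining). The sketch below assumes that one predicate is transitive, so that (a p b) and (b p c) imply (a p c); the predicate name and the facts are made up, and real systems typically use a rule engine or reasoner instead.

```python
def infer_transitive(triples, predicate):
    """Forward-chain one rule: (a p b) and (b p c) imply (a p c).

    Returns only the newly inferred triples.
    """
    known = set(triples)
    changed = True
    while changed:
        changed = False
        pairs = [(s, o) for s, p, o in known if p == predicate]
        for a, b in pairs:
            for c, d in pairs:
                if b == c and a != d and (a, predicate, d) not in known:
                    known.add((a, predicate, d))
                    changed = True
    return known - set(triples)


facts = [
    ("Amsterdam", "located_in", "Netherlands"),
    ("Netherlands", "located_in", "Europe"),
    ("Amsterdam", "capital_of", "Netherlands"),
]
inferred = infer_transitive(facts, "located_in")
print(inferred)
```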

The validation of triples allows ensuring that only valid and relevant knowledge is included in the knowledge graph. This can be done by setting integrity and other constraints [23] or setting necessary features for a triple to be considered valid [21]. In addition, a labelling process can be applied to tag triples as valid or not valid [19].
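As a simple illustration of constraint-based validation, the sketch below rejects incomplete triples and enforces one hypothetical integrity constraint: predicates declared as functional may have at most one object per subject. The constraint set and the sample triples are assumptions, not constraints taken from the cited works.

```python
from collections import defaultdict

# Hypothetical constraint: these predicates allow one object per subject.
FUNCTIONAL = {"born_in"}


def validate(triples):
    """Split triples into valid and invalid ones, in input order."""
    seen = defaultdict(set)
    valid, invalid = [], []
    for s, p, o in triples:
        if not (s and p and o):
            invalid.append((s, p, o))  # incomplete triple
        elif p in FUNCTIONAL and seen[(s, p)] and o not in seen[(s, p)]:
            invalid.append((s, p, o))  # conflicts with an earlier value
        else:
            seen[(s, p)].add(o)
            valid.append((s, p, o))
    return valid, invalid


candidates = [
    ("Alice", "born_in", "Utrecht"),
    ("Alice", "born_in", "Leiden"),
    ("Alice", "knows", "Bob"),
    ("Bob", "knows", ""),
]
valid, invalid = validate(candidates)
print(valid)
print(invalid)
```

In this sketch the first value seen for a functional predicate wins; deciding which of two conflicting values is correct is a separate problem that the sketch does not address.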

Finally, knowledge graph optimisation can be performed by removing nodes that are not relevant to the domain [88]. This should be based on consistent and logical rules that allow identifying and eliminating conflicts and gaps in the knowledge graph [80].

4.5 Construct the Knowledge Graph

The objective of this step is to ensure that the knowledge graph is accessible and available for use. This includes storing the knowledge graph in a suitable database, displaying and visualising it for exploration, as well as enabling its use.

4.5.1 Store Knowledge Graph.

Knowledge graphs can be stored in various ways due to a wide variety of data models, graph algorithms, and applications [87]. Storage options include relational databases, key/value stores, triple stores, map/reduce storage [87], and graph databases [32].

Relational databases can be used for storage, even though they may not be the most suitable for large graph management [87]. This type of storage can be implemented on top of an existing relational database in the organisation’s infrastructure [77, 87].

Key/value stores are NoSQL database systems that allow improving scalability of knowledge graphs and more flexibility with regard to data types [87].

Triple stores are databases that store knowledge as triples (subject - predicate - object). The majority of triple stores focus on storing knowledge graphs as RDF triples that provide a unified framework for representing information online [87].
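The core idea of a triple store can be sketched as a set of triples with pattern-based lookup, where any position left unspecified acts as a wildcard. The class below is a toy in-memory illustration with hypothetical names; it is not the API of any real triple store, and production systems maintain indexes rather than scanning all triples.

```python
class TripleStore:
    """Toy in-memory store for (subject, predicate, object) triples."""

    def __init__(self):
        self._triples = set()

    def add(self, s, p, o):
        self._triples.add((s, p, o))

    def match(self, s=None, p=None, o=None):
        """Return matching triples, sorted; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )


store = TripleStore()
store.add("Amsterdam", "capital_of", "Netherlands")
store.add("Amsterdam", "located_in", "Europe")
store.add("Netherlands", "located_in", "Europe")

print(store.match(s="Amsterdam"))
print(store.match(p="located_in", o="Europe"))
```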

Map/reduce storage is used for processing large knowledge graphs, as it distributes the nodes across different machines so that each machine only has to perform a relatively small amount of computation [87].

Graph databases allow for the storage of nodes, edges, and properties of graphs. These databases provide a variety of functionalities for querying and graph mining; however, the update of knowledge can be slow [92]. As an example of a graph database, Neo4j is widely used in knowledge graph development [14, 15, 22, 33, 38, 49, 51, 53, 56, 70, 86, 88, 90]. It has built-in functionality for, among other things, graph analysis and querying [22, 53].

4.5.2 Display Knowledge Graph.

Knowledge graphs are useful because they can be not only analysed in the database but also inspected visually. For this, it is necessary to create a knowledge graph visualisation in order to enable analysis, navigation, and discovery of related knowledge [40, 69, 75, 78, 92]. An example of knowledge graph visualisation is presented in Figure 6(a).

Fig. 6. Examples of knowledge graph visualisations.

Some knowledge graph databases have built-in tools for graph visualisation, for example, Neo4j [56]. Another option is to develop the visualisation using front-end tech stacks, for example, using suitable JavaScript libraries [33, 51, 54, 67, 69]. When developing knowledge graph visualisations, it is important to ensure interactivity and follow best practices of information visualisation.

Nevertheless, the display of the visualisation depends on the application of the knowledge graph. For example, Google presents the nodes of Google Knowledge Graph as infoboxes in the search results (Figure 6(b)). Thus, the knowledge graphs can be displayed in multiple ways, and the most suitable one should be chosen considering the intended use of the graph.

4.5.3 Enable Use.

Knowledge graphs can have multiple applications, such as web search [87], question answering [78], recommendation generation [46], chatbot functionalities [46], decision support systems [47], text understanding [80, 87], and so on. The application depends on the purpose of the knowledge graph and the domain. Regardless of the chosen application, it is then necessary to implement tools that enable effective knowledge graph use. The implementation is highly dependent on the required functionality. Furthermore, it is important to consider the end users, what kind of skills they have, and how they are going to use the knowledge graph.

Querying is one of the key functions of knowledge graphs. It allows users to explore and discover knowledge. Query functions may already be built into the graph database [90]. For example, Neo4j supports the Cypher graph query language that allows data queries [70]. RDF triple stores, in turn, support SPARQL, which is widely used as the standard query language of knowledge graphs [23, 92]. Querying functionality can also be developed based on specific needs, for example, using knowledge graph matching, distributional semantic matching, or other techniques [78].
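The core idea behind such query languages, matching a pattern with variables against the stored graph, can be sketched in a few lines. The example below matches a single triple pattern against an in-memory list of triples. It is a simplified stand-in for what Cypher or SPARQL engines do, not an implementation of either language, and the `match` function and sample triples are hypothetical.

```python
def match(triples, pattern):
    """Return variable bindings for one (s, p, o) pattern; '?x' marks a variable."""
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable binds to the value but must stay consistent if repeated.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            # Reached only when every position of the pattern matched.
            results.append(binding)
    return results


TRIPLES = [
    ("Neo4j", "supports", "Cypher"),
    ("TripleStore", "supports", "SPARQL"),
    ("SPARQL", "isA", "QueryLanguage"),
]

# Roughly comparable to: SELECT ?store WHERE { ?store supports SPARQL }
stores = match(TRIPLES, ("?store", "supports", "SPARQL"))
```

Real query engines combine many such patterns through joins, filters, and index lookups, which this sketch leaves out.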

4.6 Maintain the Knowledge Graph

As knowledge is constantly changing and evolving, knowledge graphs are never complete. Thus, it is necessary to constantly monitor the knowledge graph, its usage and data sources relevant for the domain, and update the knowledge graphs as needed.

4.6.1 Evaluate the Knowledge Graph.

Besides the evaluation of completeness and quality, which are addressed in Step 4 (Section 4.4), knowledge graphs can be tested through their application by gathering user feedback [91]. By analysing feedback, it is possible to identify gaps in the knowledge graph and set the development directions. This feedback may help identify new data available or provide suggestions on how to improve the application of the knowledge graph, e.g., make it faster or add new functionalities. For this, Step 5 (Section 4.5) has to be repeated by evaluating possible improvement in storage, display, and/or use of the knowledge graph.

4.6.2 Update the Knowledge Graph.

In general, updating the knowledge graph may be needed when (i) there is new data in the data source already used, or (ii) there is a new data source relevant for the knowledge domain [23].

To identify new data in the data source, version and update management is needed both in the data source and in the knowledge graph. Comparing the version and the latest date of update allows easy identification of newly available data. However, this is not always possible, as not all data have version management. In particular, it may be more difficult with unstructured, free text data. Thus, other update mechanisms should be introduced, such as periodical extraction of new knowledge and mapping with current entities and relations [82]. Once new data is identified, the process is repeated from Step 3 (Section 4.3).

The discovery of new relevant data sources is a more complex task. It requires manual research to identify and access new data sources; it can also include legal agreements for data use [23]. Once the new data source is identified, the process is repeated from Step 1 (Section 4.1).
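The two update triggers and their restart points can be sketched as a small decision function. This is a minimal sketch assuming that each data source exposes a version label and a last-update date, which, as noted above, is not always the case. The `SourceState` type, the `restart_step` function, and the example values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class SourceState:
    name: str
    version: str
    last_updated: date


def restart_step(known: Optional[SourceState], observed: SourceState) -> Optional[int]:
    """Return the process step to resume from, or None if the graph is up to date."""
    if known is None:
        return 1  # new data source: repeat from Step 1 (identify data)
    if observed.version != known.version or observed.last_updated > known.last_updated:
        return 3  # new data in a known source: repeat from Step 3 (extract knowledge)
    return None


# Hypothetical metadata: what the graph last ingested vs. what the source now offers.
ingested = SourceState("example-dump", "2021-09", date(2021, 9, 1))
latest = SourceState("example-dump", "2021-10", date(2021, 10, 1))
```

For sources without version metadata, the comparison would be replaced by another mechanism, such as the periodic re-extraction mentioned above.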

5 CASE STUDIES

In order to evaluate the applicability of the proposed knowledge graph development process, the process is compared to the development of two knowledge graphs—DBpedia as a generic open knowledge graph and the User Experience Practices Knowledge Graph as a domain-specific knowledge graph.

5.1 Comparison to DBpedia

DBpedia is a crowd-sourced open knowledge graph project aimed at extracting structured content from the Wikimedia projects [44]. Data are accessible as Linked Data and through standard Web browsers or automated crawlers.

DBpedia’s development process includes the following steps [20]:

(1)	Definition of mappings and ontology editing;
(2)	Execution of the knowledge extraction process over Wikipedia dumps;
(3)	Parsing and validation of the data against strict rules;
(4)	Release of (intermediate) data artifacts;
(5)	ID management and knowledge fusion from all language editions;
(6)	Deployment of the resulting knowledge graph.

The steps of the proposed knowledge graph development process can be mapped to the DBpedia process as follows:

(1)	Identify data. This step is omitted in DBpedia’s development process as the data source is already identified and clearly defined. As mentioned, DBpedia uses data from various Wikimedia projects. This covers a wide variety of domains, thus, making DBpedia a generic knowledge graph.
(2)	Construct knowledge graph ontology. This step corresponds to the step “Definition of mappings and ontology editing”. DBpedia’s ontology was first developed based on infoboxes within Wikipedia and is continuously updated [44]. Currently, the ontology has over 700 classes and 3,000 properties [20].
(3)	Extract knowledge. This step corresponds to the step “Execution of the knowledge extraction process over Wikipedia dumps”. DBpedia extracts data from Wiki pages through the continuous knowledge extraction process (that is defined by DBpedia Information Extraction Framework (DIEF)) and live extraction, including entities, relations, and attributes extraction. The continuous extraction is performed every month. DBpedia extraction is available through mapping-based (rule-based), generic (automatic), text, and Wikidata extraction [20]. Extract entities. A key method of entity extraction in DBpedia is to extract unmapped information in Wikipedia infoboxes and create entities from the attribute values [20, 44]. Extract relations. Mappings-based extraction is one of the methods used to extract relations from Wikipedia infoboxes [20]. Extract attributes. Other attributes are extracted directly from the article text. Attributes then can be mapped to the existing properties [20].
(4)	Process knowledge. This step corresponds to the steps “Parsing and validation of the data against strict rules” and “ID management and knowledge fusion from all language editions”. Integrate knowledge. First, the data itself are parsed and validated for early release. Then, it is processed globally, focusing on eliminating redundancy and instability of IRI identifiers. For this process, DBpedia uses the FlexiFusion approach, which provides flexibility in processing a large variety of data [20]. Construct ontology or map to it. The knowledge is mapped to DBpedia’s ontology [20]. This also allows for the evaluation of the completeness of the extracted knowledge. Complete knowledge. Finally, there are multiple data validation and quality rules applied to ensure the completeness of the RDF triples, for example, reviewing conformance to the predefined schema and ontology restrictions as well as identifying missing artifacts [30].
(5)	Construct knowledge graph. This step corresponds to the steps “Release of (intermediate) data artifacts” and “Deployment of the resulting knowledge graph”. The extracted and processed knowledge is published in an accessible way, enabling its use twice: first as intermediate data after strict parsing and validation, and second as a completed knowledge graph [20]. Store the knowledge graph. The data itself are stored in the DBpedia Databus platform. Display the knowledge graph. The Databus platform is accessible online as datasets, and DBpedia Live can be accessed as an API. DBpedia also exposes human-readable representations (i.e., HTML pages) of its knowledge in the form of Linked Data [44]. Enable use. Data search is enabled by the DBpedia SPARQL endpoint as well as through Linked Data [44]. This allows users to access the data and use it for their own needs.
(6)	Maintain knowledge graph. The entire process of DBpedia’s development is iterative and constantly reviewed, which allows capturing the most recent and relevant data. This structured release cycle ensures that the knowledge graph is kept up to date [30]. Evaluate the knowledge graph. Community reviews, contributions, and feedback are used to further develop DBpedia. The knowledge graph and its ontology are widely accessible, so users can provide feedback and suggestions on how the data should be updated [20, 44]. Update the knowledge graph. As DBpedia is based on Wikipedia data that is constantly changing, DBpedia is also continuously maintained, and an updated version is released in accordance with the release cycle [30].
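As an illustration of the “Enable use” sub-step, the sketch below prepares a request to the public DBpedia SPARQL endpoint. It only builds the request URL and does not contact the network. The chosen resource, the query itself, and the `format` parameter (an assumption about how the endpoint selects JSON results) are illustrative choices rather than details taken from the cited sources.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Public endpoint; its availability and accepted parameters are assumed here.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"

# Illustrative query: fetch one English abstract for a single resource.
QUERY = """
SELECT ?abstract WHERE {
  <http://dbpedia.org/resource/Amsterdam> <http://dbpedia.org/ontology/abstract> ?abstract .
  FILTER (lang(?abstract) = "en")
}
LIMIT 1
"""


def build_request_url(endpoint, query):
    """Encode a SPARQL query as a GET request URL using the 'query' parameter."""
    params = {"query": query.strip(), "format": "application/sparql-results+json"}
    return endpoint + "?" + urlencode(params)


url = build_request_url(DBPEDIA_ENDPOINT, QUERY)
```

The resulting URL could then be fetched with any HTTP client, and the response parsed as JSON.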

Overall, DBpedia’s process is similar to the proposed one. Nevertheless, DBpedia’s process steps are specified to better correspond to the operations and procedures, as they are executed in DBpedia. In addition, DBpedia has two stages of processing and releasing data, which allows earlier access to data, even if it is not completed as a knowledge graph.

5.2 Comparison to User Experience Practices Knowledge Graph

UX Methods is a domain-specific boutique knowledge graph aimed at gathering and integrating knowledge related to user experience design [25].³ Its development workflow is presented in Figure 7 and consists of five main stages: (i) Capture, (ii) Extract Transform Load (ETL), (iii) Semantic Reasoning, (iv) Publication, and (v) Iteration.

Fig. 7. The workflow of the UX Methods Knowledge Graph [24].

The steps of the proposed knowledge graph development process can be mapped to the UX Methods process as follows:

(1)	Identify data. This step corresponds to the step “Capture”. The data are submitted by users using Google Forms in a semi-structured way, providing such information as the method name, description, steps, outcomes, subsequent methods, and available web resources [25]. Additionally, a headless content management system is used to capture information.
(2)	Construct the knowledge graph ontology. As the ontology of UX Methods is predefined, this step is omitted in the overall workflow. However, the UX Methods uses ontology to describe relationships between different disciplines and methods. It is constantly evolving as new knowledge is added [25].
(3)	Extract knowledge. This step corresponds to the step “ETL”. The manually captured data are extracted and transformed to RDF, including entities, relations, and attributes. For this purpose, different techniques are used, including auto-classification, semantic data integration, and NLP [24]. Extract entities. Entities are gathered through a headless content management system. Extract relations. The knowledge model provides a set of relation types that are then used to create relations between entities. Extract attributes. Attributes are also gathered through the headless content management system and mapped to the Knowledge Model.
(4)	Process knowledge. This step corresponds to the steps “ETL” and “Semantic Reasoning”. Integrate knowledge. Newly extracted knowledge can be linked to the existing entities when the data are being updated; however, this is not explicitly explained as the data are manually gathered. Construct ontology or map to it. The extracted knowledge is mapped to the ontology. Complete knowledge. The data are processed by the Protégé reasoner, which enables inference and the identification of additional relations and thus completes the knowledge graph [24].
(5)	Construct knowledge graph. This step corresponds to the step “Publication” [25]. Store the knowledge graph. The data are stored on the Data.world platform [24]. Display the knowledge graph. The processed knowledge is published in RDF/XML format and is used to populate the website, allowing users to query and view it [25]. Enable use. Multiple front-end tools are used to provide access and enable use, including Jekyll and Jekyll-RDF for querying, Lunr.js for implementing the search functionality, and GULP for automating the development workflow [25].
(6)	Maintain knowledge graph. This step corresponds to the step “Iteration”. Evaluate the knowledge graph. The evaluation, recapturing, and reintegration of knowledge is performed based on user input, traffic analytics, and search analytics [25]. Update the knowledge graph. With each iteration, data are re-integrated into data models, mappings, and queries. The knowledge graph relies on the users’ feedback both for the update of the knowledge and maintenance of the knowledge graph itself [25].
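The “Complete knowledge” sub-step above relies on a reasoner to derive relations that were not stated explicitly. The following is a minimal sketch of that idea using two generic rule types, inverse and transitive relations, applied until nothing new is produced. It is not how the Protégé reasoner works internally, and the relation names and example methods are hypothetical rather than taken from the UX Methods ontology.

```python
def infer(triples, inverse_of, transitive):
    """Add inferred triples until no rule produces anything new (fixed point)."""
    known = set(triples)
    while True:
        new = set()
        for s, p, o in known:
            if p in inverse_of:
                # Inverse rule: (s, p, o) implies (o, inverse(p), s).
                new.add((o, inverse_of[p], s))
            if p in transitive:
                # Transitive rule: (s, p, o) and (o, p, o2) imply (s, p, o2).
                new.update((s, p, o2) for s2, p2, o2 in known if p2 == p and s2 == o)
        if new <= known:
            return known
        known |= new


# Hypothetical asserted facts.
asserted = [
    ("card-sorting", "subsequentMethod", "tree-testing"),
    ("card-sorting", "partOf", "information-architecture"),
    ("information-architecture", "partOf", "user-experience-design"),
]

completed = infer(
    asserted,
    inverse_of={"subsequentMethod": "precedingMethod"},
    transitive={"partOf"},
)
```

Here the three asserted triples yield two additional ones: the inverse link between the two methods and the transitive `partOf` relation.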

Overall, the UX Methods process is similar to the proposed one, as it includes all the identified steps and employs different techniques and algorithms to develop a knowledge graph. However, UX Methods leverages the users’ input, feedback, and interaction to further develop the knowledge graph, whereas this is not captured in the proposed process.

6 DISCUSSION

Based on the case studies, the proposed knowledge graph development process is applicable. The main steps cover the essential development steps and, thus, can be applied in practice. However, there are several considerations as to what extent the proposed process is suitable for use in all cases of knowledge graph development.

Initial vs. continuous development of knowledge graphs. Based on this systematic review, the research literature focuses on the initial development of knowledge graphs, while the case studies focus on presenting the continuous development of knowledge graphs. In the case studies, initial considerations (such as Step 1 “Identify data”) are done once when establishing a need for a knowledge graph. In addition, in the case studies, the knowledge graph ontology is present, and it is not explained whether it was developed separately or based on the knowledge used in the graph. Therefore, if the proposed process is used for an existing knowledge graph, Step 2 “Construct ontology” is not needed, whereas Step 4.2 “Construct ontology or map to it” is performed, focusing on mapping the new data to the existing ontology and, if needed, updating the ontology based on the extracted knowledge. While Steps 1 and 2 are essential for determining the scope and structure of the knowledge graph, they are not necessarily revised with each update of the knowledge graph.

The nature of scientific articles also affects the “pipeline-like” visualisation of the proposed process. Since articles focus on presenting how a knowledge graph was developed for a specific case, they commonly do not consider feedback loops and continuous iterations. Thus, more focus is on the initial one-time development, rather than on continuous updates.

Under these considerations, our proposed process appears to be more useful for initial knowledge graph development, where it is necessary to determine the data and the structure of the knowledge graph. We believe that applying this process to existing knowledge graphs would require additional adaptation, since many of the main decision points have typically already been made and the main focus is on acquiring new data and processing it in order to update the graph.

Level of abstraction. The proposed process aims at providing overall guidance in knowledge graph development. However, the developers (a person or a team responsible for developing the knowledge graph) have to perform additional research and make decisions in order to construct the knowledge graph. Based on various factors, such as the type of data, the choice of algorithms, the type of graph storage, the application of knowledge, and others, the process can differ between knowledge graphs.

The process can be useful as a tool to check if all aspects and considerations are covered. Nevertheless, there is still a need for the developers to choose appropriate methods and algorithms for data acquisition, knowledge extraction and processing, as well as set appropriate measures for maintaining, updating, and managing the knowledge graph (e.g., setting the frequency and procedure for the knowledge graph update).

User perspective. The reviewed literature does not focus on discussing the role of knowledge graph users in the knowledge graph development process. This may be a result of the fact that research articles are focusing on presenting the most efficient algorithms and how they work rather than on how the knowledge graph will be used once developed.

In contrast, the case studies take into account user feedback and how the knowledge graph is used (e.g., traffic or search analytics) for knowledge graph maintenance and further development. User feedback and analytics can indicate what data need to be included, how the knowledge graph should be updated, and how the application itself could be improved. Therefore, while the user perspective was not considered in the literature we reviewed, it can be a valuable addition when maintaining a knowledge graph. Positively, we note that the success of Wikidata [71] has led to greater interest in the user and knowledge graph development by the research community [5, 39].

Applying the proposed process. The proposed process provides a starting point when developing a knowledge graph as well as main steps and areas to consider. It assists in deciding whether a top-down or bottom-up approach should be used as well as in planning the work that needs to be done. Nevertheless, the process is generic and requires additional research and decision-making from the individual or team applying this process on what tools and techniques to adopt. There are multiple tools and techniques that can be used in each step of the knowledge graph development process, and they depend on multiple development decisions that were described in the article (Section 4). While some algorithms and methods are mentioned here, there are other resources that describe such methods in detail (e.g., [31, 80]).

The main focus of the reviewed articles is on generic or domain-specific knowledge graphs and on building them from the beginning rather than on how to improve them. For this reason, the proposed process is better suited for initial knowledge graph development than for improving an existing knowledge graph. In addition, there are more types of knowledge graphs emerging that were not described in the reviewed articles, for example, personal knowledge graphs [8]. Such knowledge graphs are focused on the user or a person rather than a specific domain. Additionally, for simple knowledge graphs, the proposed process may be too complex and include unnecessary steps.

Lastly, the vast majority of reviewed articles did not base their approach to knowledge graph development on a solid framework but rather described the workflow of their project. The process described here is a synthesis based on the knowledge graph development approaches in the literature. Thus, the described process provides an evidence-based framework for organising and managing knowledge graph development in a structured manner.

Validity of the research. While this article achieves its goal of providing a summary of knowledge graph development processes found in multiple articles, several considerations about its validity need to be taken into account. Internal validity is affected by the methodology and research design. The systematic review is highly dependent on the interpretation and biases of the author in the choice of articles, coding, and setting priorities. Moreover, while we believe that our method captured the research base, as the most relevant articles in multiple major scientific sources were screened, we cannot guarantee that we retrieved all relevant articles, as we applied a threshold and did not perform snowballing due to time constraints. To help ensure validity, the PRISMA guidelines were followed, focusing on transparency of the review process. A checklist can be found in Appendix B.

In addition, external validity is affected in terms of the extent to which the results apply to a population. Only scientific articles were analysed, and the evaluation was based on two case studies. While the evaluated case studies show that the proposed process corresponds to actual industry cases, there is no empirical basis to determine to what extent the proposed process can be applied and generalised to the whole population. Additional evaluation methods could lead to a broader understanding of its general applicability (e.g., interviews with experts or organisations using knowledge graphs in their operations). Furthermore, the practical implementation of a knowledge graph could be carried out following the proposed process to examine its efficacy as a guide to knowledge graph development.

7 CONCLUSION

This article aimed at understanding the main steps in the knowledge graph development process and how they are interrelated. This was done through a systematic review and conceptual analysis. The main steps of the development process include: (i) identify data, (ii) construct the knowledge graph ontology, (iii) extract knowledge, (iv) process knowledge, (v) construct the knowledge graph, and (vi) maintain the knowledge graph. The relations between steps are presented in Figure 5. This process suggests a unified approach to knowledge graph development and provides guidance for both researchers and practitioners when constructing and managing knowledge graphs.

There are a number of avenues for future work, including:

Researching additional industry cases. While this research focuses mostly on the development of the knowledge graphs as reported in the literature, a study on how organisations are performing this process would provide further richness to the process in practice.
Evaluating the proposed process with experts and organisations using knowledge graphs in their activities. This would allow for a more accurate assessment of the proposed process, its added value, and how it can be used in practice.
Examining how existing software development, ontology development or other methodologies in the field of computer science can be applied for knowledge graph development. This article focused on synthesising and analysing knowledge graph development processes. Examining the proposed process by comparing it to other existing methodologies would allow for this extensive literature to be incorporated and compared.
Developing a knowledge graph using the proposed process. This would allow for the evaluation of the practicality and applicability of the proposed process.
Researching tools and techniques for each step of knowledge graph development. While this article is focused on organising and managing the knowledge graph development process, additional research and a mapping of tools and techniques for each step could provide further assistance for researchers and developers.

Overall, we hope this research provides a foundation for further investigation into how software and data engineering methodologies can be used to assist developers and researchers in the construction and maintenance of knowledge graphs.

APPENDICES

A SUMMARY OF ARTICLES


No.	Article	Year	Article type	Process type	Process label
1	Sun K. et al. [69]	2016	Domain specific	Bottom-up	Process
2	Al-Zaidy R. A. et al. [4]	2017	Domain specific	Bottom-up	Pipeline
3	Lian H. et al. [48]	2017	Domain specific	Bottom-up	Process
4	Qui L. et al. [62]	2017	Domain specific	Bottom-up	Process
5	Zhao Y. et al. [91]	2017	Domain specific	Top-down	Aspects
6	Lin Z. Q. et al. [49]	2017	Domain specific	Bottom-up	Overview
7	Xin H. et al. [85]	2018	Domain specific	Top-down	Workflow
8	Yan J. et al. [87]	2018	Methodological	Bottom-up	Framework
9	Chen P. et al. [13]	2018	Domain specific	Bottom-up	Architecture
10	Wang C. et al. [75]	2018	Domain specific	Top-down	Workflow
11	Martinez-Rodriguez J. L. et al. [54]	2018	Methodological	Bottom-up	Method
12	Shekarpour S. et al. [66]	2018	Domain specific	Top-down	Pipeline
13	Zhao Z. et al. [92]	2018	Methodological	Bottom-up	Architecture
14	Yang C. et al. [16]	2018	Domain specific	Top-down	Procedure
15	Chenglin Q. et al. [15]	2018	Domain specific	Top-down	Technologies
16	Wu T. et al. [82]	2018	Methodological	Bottom-up	Framework
17	Sharafeldeen D. et al. [65]	2019	Domain specific	Bottom-up	Workflow
18	Mehta A. et al. [55]	2019	Domain specific	Bottom-up	Pipeline
19	Huang L. et al. [34]	2019	Domain specific	Top-down	Process
20	Zhou Y. et al. [94]	2019	Domain specific	Top-down	Framework
21	Hu H. et al. [33]	2019	Domain specific	Bottom-up	Framework
22	Wu T. et al. [83]	2019	Methodological	Bottom-up	Framework
23	Christophides V. et al. [17]	2019	Methodological	Bottom-up	Workflow
24	Kejriwal M. [40]	2019	Methodological	Bottom-up	-
25	Chen H. et al. [12]	2019	Domain specific	Top-down	Framework
26	Wang P. et al. [77]	2019	Domain specific	Bottom-up	Framework
27	Chen Y. et al. [14]	2019	Domain specific	Bottom-up	Framework
28	Yu H. et al. [88]	2020	Domain specific	Bottom-up	Framework
29	Weikum G. et al. [80]	2020	Methodological	Bottom-up	Roadmap
30	Li F. et al. [46]	2020	Domain specific	Top-down	Process
31	Su Y. et al. [67]	2020	Domain specific	Bottom-up	Method
32	Hertling S. et al. [28]	2020	Domain specific	Bottom-up	Workflow
33	Nitisha J. [36]	2020	Domain specific	Top-down	Approach
34	Li L. et al. [47]	2020	Domain specific	Bottom-up	Procedure
35	Mao S. et al. [53]	2020	Domain specific	Bottom-up	Process
36	Kim J. E. et al. [42]	2020	Domain specific	Top-down	Approach
37	Wang Q. et al. [78]	2020	Domain specific	Top-down	Framework
38	Xiao D. et al. [84]	2020	Domain specific	Bottom-up	Method
39	Yu S. et al. [89]	2020	Domain specific	Not clear	Framework
40	Elhammadi S. et al. [21]	2020	Domain specific	Bottom-up	Pipeline
41	Fang W. et al. [22]	2020	Domain specific	Top-down	Workflow
42	Wang M. et al. [76]	2020	Domain specific	Bottom-up	Pipeline
43	Malik K. M. et al. [52]	2020	Domain specific	Bottom-up	Architecture
44	Muhammad I. et al. [56]	2020	Domain specific	Bottom-up	Approach
45	Liu S. et al. [51]	2020	Domain specific	Bottom-up	Framework
46	Jin Y. et al. [38]	2020	Domain specific	Bottom-up	Process
47	Li F. et al. [45]	2020	Methodological	Bottom-up	Flow chart
48	Fensel D. et al. [23]	2020	Methodological	Bottom-up	Process
49	Dessì D. et al. [19]	2020	Domain specific	Bottom-up	Pipeline
50	Aliyu I. et al. [6]	2020	Domain specific	Top-down	Architecture
51	Yan H. et al. [86]	2020	Domain specific	Bottom-up	Process
52	Kim H. [41]	2021	Domain specific	Bottom-up	Process
53	Yu X. et al. [90]	2021	Domain specific	Bottom-up	Process
54	Liu J. et al. [50]	2021	Domain specific	Bottom-up	Workflow
55	Zhou B. et al. [93]	2021	Domain specific	Bottom-up	Framework
56	Dessì D. et al. [18]	2021	Domain specific	Bottom-up	Workflow
57	Tan J. et al. [70]	2021	Domain specific	Top-down	Framework


Table A.1. Summary of Articles Used During Systematic Review

B PRISMA 2020 CHECKLIST

Fig. B.1. The PRISMA checklist for this review [60].

C VISUALISATIONS OF THE KNOWLEDGE GRAPH DEVELOPMENT IN THE SELECTED ARTICLES

Figure C.1. Workflow of subjective KB construction [85].

Figure C.2. Framework of the construction method [88].

Figure C.3. The framework of knowledge graph [87].

Figure C.4. Data mining workflow for knowledge graph construction [65].

Figure C.5. The general framework of Chinese knowledge graph construction [82].

Figure C.6. System architecture of K12EduKG [13].

Figure C.7. The architecture of subject KG construction [67].

Figure C.8. Knowledge base construction system [4].

Figure C.9. The overall framework of TCM knowledge graph construction [94].

Figure C.10. Functional design framework [33].

Figure C.11. Proposed framework (OE:Online encyclopedia) [83].

Figure C.12. The generic end-to-end workflow for Entity Resolution [17].

Figure C.13. The overall workflow creating the DBkWik knowledge graph [28].

Figure C.14. First approach for NER for artwork titles [36].

Figure C.15. Technique procedure in this study [75].

Figure C.16. Overview of the proposed method [54].

Figure C.17. Proposed systematic procedure of medical KG construction [47].

Figure C.18. Overall process of analyzing formal concepts from collected specifications and transforming a product knowledge graph [41].

Figure C.19. Construction process of the knowledge graph [53].

Figure C.20. The pipeline of the required steps for developing a knowledge graph of interlined events [66].

Figure C.21. The process of creating this knowledge graph [90].

Figure C.22. COVID-KG overview: from data to semantics to knowledge [78].

Figure C.23. The overview of our method [84].

Figure C.24. The financial knowledge extraction pipeline [21].

Figure C.25. The architecture of knowledge graph [92].

Figure C.26. The workflow of the proposed hybrid semantic computer vision approach [22].

Figure C.27. Building a knowledge graph flowchart [16].

Figure C.28. Constructing process of the visual analysis platform [69].

Figure C.29. Functional view of automated knowledge graph architecture [52].

Figure C.30. The framework for construction for WRKG [93].

Figure C.31. Workflow of the approach for building a scientific knowledge graph from scientific textual resources [18].

Figure C.32. The flowchart of knowledge graph in the domain of culture [79].

Figure C.33. The three-level framework of the ontology-based literatures’ knowledge reasoning network modeling [12].

Figure C.34. Stages involved in the construction of a literature knowledge graph using OIE4KGC [56].

Figure C.35. Construction framework of Chinese ancient historical and cultural knowledge graph [51].

Figure C.36. Research framework for the construction and complement of knowledge graphs in the field of urban traffic [70].

Figure C.37. The united process to construct the graph personal relationships [38].

Figure C.38. Overview of Sogou knowledge graph construction framework [77].

Figure C.39. The process of knowledge graph construction [48].

Figure C.40. Improved flow chart of knowledge graph construction [45].

Figure C.41. The process of constructing Uyghur knowledge graph [62].

Figure C.42. Collaborative development of industrial knowledge graph [91].

Figure C.43. The framework of AgriKG [14].

Figure C.44. Schema of the pipeline to extract and handle entities and relations [19].

Figure C.46. Logical overview of the software knowledge graph construction platform [49].

Figure C.47. The process of built knowledge graph [86].

Footnotes

¹ Note that during screening no relevant articles from 2012 to 2016 were identified and, thus, not included in this review. While there are a number of articles on knowledge graphs in 2012–2016, the main focus of these articles is on technological or theoretical analysis of knowledge instead of presenting the development process.
² https://zenodo.org/record/5608878.
³ https://github.com/andybywire/ux-methods.

REFERENCES

[1] 2012. Introducing the Knowledge Graph: things, not strings. Retrieved 1 March 2021 from https://blog.google/products/search/introducing-knowledge-graph-things-not/.Google Scholar
Reference
[2] Abu-Salih Bilal. 2021. Domain-specific knowledge graphs: A survey. Journal of Network and Computer Applications 185 (72021), 103076. DOI:Google ScholarCross Ref
Navigate to
Reference 1
Reference 2
Reference 3
Reference 4
Reference 5
Reference 6
[3] Ahmed Tahir, Cox Julian, Girvan Lynda, Paul Alan, Paul Debra, and Thompson Pete. 2014. Developing Information Systems: Practical Guidance for IT Professionals. BCS Learning & Development Ltd, Swindon, UK.Google Scholar
Reference 1Reference 2
[4] Al-Zaidy Rabah A. and Giles C. L.C.. 2017. Automatic knowledge base construction from scholarly documents. In Proceedings of the 2017 ACM Symposium on Document Engineering. Association for Computing Machinery, Inc, New York, NY, 149–152. DOI:Google ScholarDigital Library
Reference
[5] AlGhamdi Kholoud, Shi Miaojing, and Simperl Elena. 2021. Learning to recommend items to wikidata editors. In Proceedings of the International Semantic Web Conference. Springer, 163–181.Google ScholarDigital Library
Reference
[6] Aliyu Ismail, D. Kana A. F., and Aliyu Salisu. 2020. Development of knowledge graph for university courses management. International Journal of Education and Management Engineering 10, 2 (42020), 1–10. DOI:Google ScholarCross Ref
Reference 1Reference 2
[7] Auer Sören, Bizer Christian, Kobilarov Georgi, Lehmann Jens, Cyganiak Richard, and Ives Zachary. 2007. DBpedia: A nucleus for a web of open data. In Proceedings of the 6th International the Semantic Web and 2nd Asian Conference on Asian Semantic Web Conference. Springer-Verlag, Berlin, 722–735.Google ScholarCross Ref
Reference
[8] Balog Krisztian and Kenter Tom. 2019. Personal knowledge graphs: A research agenda. In Proceedings of the 2019 ACM SIGIR International Conference on Theory of Information Retrieval. Association for Computing Machinery, New York, NY, 217–220.
[9] Bonatti Piero Andrea, Decker Stefan, Polleres Axel, and Presutti Valentina. 2019. Knowledge graphs: New directions for knowledge representation on the semantic web (Dagstuhl Seminar 18371). Dagstuhl Reports 8, 9 (2019), 29–111.
[10] Bravo Maricela, Reyes Luis Fernando Hoyos, and Ortiz José A. Reyes. 2019. Methodology for ontology design and construction. Contaduria y Administracion 64, 4 (2019), 1–24.
[11] Brickley Dan and Miller Libby. 2014. FOAF Vocabulary Specification. Technical Report. FOAF project. Retrieved from http://xmlns.com/foaf/spec/.
[12] Chen Hainan and Luo Xiaowei. 2019. An automatic literature knowledge graph and reasoning network modeling framework based on ontology and natural language processing. Advanced Engineering Informatics 42 (Oct. 2019), 100959.
[13] Chen Penghe, Lu Yu, Zheng Vincent W., Chen Xiyang, and Li Xiaoqing. 2018. An automatic knowledge graph construction system for K-12 education. In Proceedings of the 5th Annual ACM Conference on Learning at Scale, L@S 2018. Association for Computing Machinery, New York, NY, 1–4.
[14] Chen Yuanzhe, Kuang Jun, Cheng Dawei, Zheng Jianbin, Gao Ming, and Zhou Aoying. 2019. AgriKG: An agricultural knowledge graph and its applications. In Proceedings of the Database Systems for Advanced Applications. Springer International Publishing, Cham, 533–537.
[15] Chenglin Qi, Qing Song, Pengzhou Zhang, and Hui Yuan. 2018. Cn-MAKG: China meteorology and agriculture knowledge graph construction based on semi-structured data. In Proceedings of the 17th IEEE/ACIS International Conference on Computer and Information Science. Institute of Electrical and Electronics Engineers Inc., 692–696.
[16] Chi Yang, Qin Yue, Song Rui, and Xu Hao. 2018. Knowledge graph in smart education: A case study of entrepreneurship scientific publication management. Sustainability 10, 4 (Mar. 2018), 1–21.
[17] Christophides Vassilis, Efthymiou Vasilis, Palpanas Themis, Papadakis George, and Stefanidis Kostas. 2021. An overview of end-to-end entity resolution for big data. Computing Surveys 53, 6 (Feb. 2021), 1–42.
[18] Dessì Danilo, Osborne Francesco, Recupero Diego Reforgiato, Buscaldi Davide, and Motta Enrico. 2021. Generating knowledge graphs by employing natural language processing and machine learning techniques within the scholarly domain. Future Generation Computer Systems 116 (Mar. 2021), 253–264.
[19] Dessì Danilo, Osborne Francesco, Recupero Diego Reforgiato, Buscaldi Davide, Motta Enrico, and Sack Harald. 2020. AI-KG: An automatically generated knowledge graph of artificial intelligence. In Proceedings of the Semantic Web – ISWC 2020, Vol. 12507 LNCS. Springer International Publishing, 127–143.
[20] Dojchinovski Milan, Forberg Jan, Frey Johannes, Hofer Marvin, Streitmatter Denis, and Hellmann Sebastian. 2021. DBpedia Tech Tutorial @ Knowledge Graph Conference 2021. Retrieved 2 May 2021 from https://www.dbpedia.org/blog/dbpedia-tutorial-kgc-2021/.
[21] Elhammadi Sarah, Lakshmanan Laks V. S., Ng Raymond, Simpson Michael, Huai Baoxing, Wang Zhefeng, and Wang Lanjun. 2020. A high precision pipeline for financial knowledge graph construction. In Proceedings of the 28th International Conference on Computational Linguistics. International Committee on Computational Linguistics, Stroudsburg, PA, 967–977.
[22] Fang Weili, Ma Ling, Love Peter E. D., Luo Hanbin, Ding Lieyun, and Zhou Ao. 2020. Knowledge graph for identifying hazards on construction sites: Integrating computer vision with ontology. Automation in Construction 119 (Nov. 2020), 103310.
[23] Fensel Dieter, Şimşek Umutcan, Angele Kevin, Huaman Elwin, Kärle Elias, Panasiuk Oleksandra, Toma Ioan, Umbrich Jürgen, and Wahler Alexander. 2020. How to build a knowledge graph. In Proceedings of the Knowledge Graphs: Methodology, Tools and Selected Use Cases. Springer International Publishing, 11–68.
[24] Fitzgerald Andy. 2021. Case study: A boutique knowledge graph. Retrieved 2 May 2021 from https://medium.com/@andybywire/uxmethods-org-a-boutique-knowledge-graph-case-study-e91af3d2a62.
[25] Fitzgerald Andy. 2021. User Experience Practices Knowledge Graph. Retrieved 2 May 2021 from https://www.uxmethods.org/.
[26] Gómez-Pérez Asunción, Fernández-López Mariano, and Corcho Oscar. 2006. Ontological Engineering: With Examples from the Areas of Knowledge Management, e-Commerce and the Semantic Web. Springer, London. 195 pages.
[27] Gutierrez Claudio and Sequeda Juan F. 2021. Knowledge graphs. Communications of the ACM 64, 3 (Mar. 2021), 96–104.
[28] Hertling Sven and Paulheim Heiko. 2020. DBkWik: Extracting and integrating knowledge from thousands of Wikis. Knowledge and Information Systems 62, 6 (2020).
[29] Hitzler Pascal. 2021. A review of the semantic web field. Communications of the ACM 64, 2 (Jan. 2021), 76–83.
[30] Hofer Marvin, Hellmann Sebastian, Dojchinovski Milan, and Frey Johannes. 2020. The new DBpedia release cycle: Increasing agility and efficiency in knowledge extraction workflows. In Proceedings of the Semantic Systems. In the Era of Knowledge Graphs, Blomqvist Eva, Groth Paul, Boer Victor de, Pellegrini Tassilo, Alam Mehwish, Käfer Tobias, Kieseberg Peter, Kirrane Sabrina, Meroño-Peñuela Albert, and Pandit Harshvardhan J. (Eds.). Springer International Publishing, Cham, 1–18.
[31] Hogan Aidan, Blomqvist Eva, Cochez Michael, d’Amato Claudia, Melo Gerard de, Gutierrez Claudio, Kirrane Sabrina, Gayo José Emilio Labra, Navigli Roberto, Neumaier Sebastian, Ngomo Axel-Cyrille Ngonga, Polleres Axel, Rashid Sabbir M., Rula Anisa, Schmelzeisen Lukas, Sequeda Juan, Staab Steffen, and Zimmermann Antoine. 2021. Knowledge graphs. Synthesis Lectures on Data, Semantics, and Knowledge 22 (2021), 1–237.
[32] Hogan Aidan, Blomqvist Eva, Cochez Michael, D’amato Claudia, Melo Gerard De, Gutierrez Claudio, Kirrane Sabrina, Gayo José Emilio Labra, Navigli Roberto, Neumaier Sebastian, Ngomo Axel-Cyrille Ngonga, Polleres Axel, Rashid Sabbir M., Rula Anisa, Schmelzeisen Lukas, Sequeda Juan, Staab Steffen, and Zimmermann Antoine. 2021. Knowledge graphs. ACM Computing Surveys 54, 4, Article 71 (July 2021), 37 pages.
[33] Hu Huan, Yun Hongyan, He Ying, Zhang Xiuhua, and Yun Yang. 2019. Research and application of semi-automatic construction of structured knowledge graph. In Proceedings of the 2nd International Conference on Big Data Technologies. Association for Computing Machinery, 39–43.
[34] Huang Lan, Yu Congcong, Chi Yang, Qi Xiaohui, and Xu Hao. 2019. Towards smart healthcare management based on knowledge graph technology. In Proceedings of the 2019 8th International Conference on Software and Computer Applications. Association for Computing Machinery, 330–337.
[35] Jabareen Yosef. 2009. Building a conceptual framework: Philosophy, definitions, and procedure. International Journal of Qualitative Methods 8, 4 (2009), 49–62.
[36] Jain Nitisha. 2020. Domain-specific knowledge graph construction for semantic analysis. In Proceedings of the European Semantic Web Conference. 12124 (2020), 250–260.
[37] Ji Shaoxiong, Pan Shirui, Cambria Erik, Marttinen Pekka, and Yu Philip S. 2021. A survey on knowledge graphs: Representation, acquisition, and applications. IEEE Transactions on Neural Networks and Learning Systems (2021), 1–21.
[38] Jin Yong, Jin Qiao, and Yang Xusheng. 2020. Knowledge graph construction of personal relationships. In Proceedings of the Artificial Intelligence and Security. Springer, Cham, 455–466.
[39] Kaffee Lucie-Aimée, Razniewski Simon, and Hogan Aidan (Eds.). 2021. Proceedings of the 2nd Wikidata Workshop (Wikidata 2021) co-located with the 20th International Semantic Web Conference, Virtual Conference, October 24, 2021. CEUR Workshop Proceedings, Vol. 2982. CEUR-WS.org. Retrieved from http://ceur-ws.org/Vol-2982.
[40] Kejriwal Mayank. 2019. Domain-Specific Knowledge Graph Construction. Springer International Publishing, Cham.
[41] Kim Haklae. 2021. Developing a product knowledge graph of consumer electronics to manage sustainable product information. Sustainability 13, 4 (Feb. 2021), 1722.
[42] Kim Ji Eun, Henson Cory, Huang Kevin, Tran Tuan A., and Lin Wan-Yi. 2020. Accelerating Road Sign Ground Truth Construction with Knowledge Graph and Machine Learning. arXiv:2012.02672. Retrieved from https://arxiv.org/abs/2012.02672.
[43] Kitchenham Barbara and Charters Stuart. 2007. Guidelines for performing Systematic Literature Reviews in Software Engineering. Technical Report. School of Computer Science and Mathematics, Keele University and Department of Computer Science, University of Durham. Retrieved from https://www.elsevier.com/__data/promis_misc/525444systematicreviewsguide.pdf.
[44] Lehmann Jens, Isele Robert, Jakob Max, Jentzsch Anja, Kontokostas Dimitris, Mendes Pablo N., Hellmann Sebastian, Morsey Mohamed, Kleef Patrick van, Auer Sören, and Bizer Christian. 2015. DBpedia – a large-scale, multilingual knowledge base extracted from Wikipedia. Semantic Web 6, 2 (2015), 167–195.
[45] Li Fengru, Xie Wei, Wang Xiaowei, and Fan Zhe. 2020. Research on optimization of knowledge graph construction flow chart. In Proceedings of the IEEE 9th Joint International Information Technology and Artificial Intelligence Conference. Institute of Electrical and Electronics Engineers Inc., 1386–1390.
[46] Li Feng Lin, Chen Hehong, Xu Guohai, Qiu Tian, Ji Feng, Zhang Ji, and Chen Haiqing. 2020. AliMeKG: Domain knowledge graph construction and application in e-commerce. In Proceedings of the International Conference on Information and Knowledge Management, Proceedings. Association for Computing Machinery, 2581–2588.
[47] Li Linfeng, Wang Peng, Yan Jun, Wang Yao, Li Simin, Jiang Jinpeng, Sun Zhe, Tang Buzhou, Chang Tsung Hui, Wang Shenghui, and Liu Yuting. 2020. Real-world data medical knowledge graph: Construction and applications. Artificial Intelligence in Medicine 103 (Mar. 2020), 101817.
[48] Lian Hao, Qin Zemin, He Tieke, and Luo Bin. 2018. Knowledge graph construction based on judicial data with social media. In Proceedings of the 2017 14th Web Information Systems and Applications Conference. Institute of Electrical and Electronics Engineers Inc., 225–227.
[49] Lin Ze Qi, Xie Bing, Zou Yan Zhen, Zhao Jun Feng, Li Xuan Dong, Wei Jun, Sun Hai Long, and Yin Gang. 2017. Intelligent development environment and software knowledge graph. Journal of Computer Science and Technology 32, 2 (Mar. 2017), 242–249.
[50] Liu Jintao, Schmid Felix, Li Keping, and Zheng Wei. 2021. A knowledge graph-based approach for exploring railway operational accidents. Reliability Engineering and System Safety 207 (Mar. 2021), 107352.
[51] Liu Shuang, Yang Hui, Li Jiayi, and Kolmanič Simon. 2020. Preliminary study on the knowledge graph construction of Chinese ancient history and culture. Information 11, 4 (Mar. 2020), 186.
[52] Malik Khalid Mahmood, Krishnamurthy Madan, Alobaidi Mazen, Hussain Maqbool, Alam Fakhare, and Malik Ghaus. 2020. Automated domain-specific healthcare knowledge graph curation framework: Subarachnoid hemorrhage as phenotype. Expert Systems with Applications 145 (May 2020), 113120.
[53] Mao Shuai, Zhao Yunmeng, Chen Jinhe, Wang Bing, and Tang Yang. 2020. Development of process safety knowledge graph: A case study on delayed coking process. Computers and Chemical Engineering 143 (Dec. 2020), 107094.
[54] Martinez-Rodriguez Jose L., Lopez-Arevalo Ivan, and Rios-Alvarado Ana B. 2018. OpenIE-based approach for knowledge graph construction from text. Expert Systems With Applications 113 (2018), 339–355.
[55] Mehta Aman, Singhal Aashay, and Karlapalem Kamalakar. 2019. Scalable knowledge graph construction over text using deep learning based predicate mapping. In Proceedings of the Web Conference 2019 - Companion of the World Wide Web Conference. Association for Computing Machinery, 705–713.
[56] Muhammad Iqra, Kearney Anna, Gamble Carrol, Coenen Frans, and Williamson Paula. 2020. Open information extraction for knowledge graph construction. In Proceedings of the Communications in Computer and Information Science. Springer Science and Business Media Deutschland GmbH, 103–113.
[57] Nickel Maximilian, Murphy Kevin, Tresp Volker, and Gabrilovich Evgeniy. 2016. A review of relational machine learning for knowledge graphs. In Proceedings of the IEEE. Institute of Electrical and Electronics Engineers Inc., 11–33.
[58] Noy Natasha, Gao Yuqing, Jain Anshu, Narayanan Anant, Patterson Alan, and Taylor Jamie. 2019. Industry-scale knowledge graphs: Lessons and challenges. Communications of the ACM 62, 8 (2019).
[59] O’Mara-Eves Alison, Thomas James, McNaught John, Miwa Makoto, and Ananiadou Sophia. 2015. Using text mining for study identification in systematic reviews: A systematic review of current approaches. Systematic Reviews 4, 1 (Dec. 2015), 5.
[60] Page Matthew J., Moher David, Bossuyt Patrick M., Boutron Isabelle, Hoffmann Tammy C., Mulrow Cynthia D., Shamseer Larissa, Tetzlaff Jennifer M., Akl Elie A., Brennan Sue E., Chou Roger, Glanville Julie, Grimshaw Jeremy M., Hróbjartsson Asbjørn, Lalu Manoj M., Li Tianjing, Loder Elizabeth W., Mayo-Wilson Evan, McDonald Steve, Mcguinness Luke A., Stewart Lesley A., Thomas James, Tricco Andrea C., Welch Vivian A., Whiting Penny, and Mckenzie Joanne E. 2021. PRISMA 2020 explanation and elaboration: Updated guidance and exemplars for reporting systematic reviews. BMJ 372 (2021).
[61] Paulheim Heiko. 2017. Knowledge graph refinement: A survey of approaches and evaluation methods. Semantic Web 8, 3 (2017), 489–508.
[62] Qiu Lirong and Zhang Huili. 2017. Review of development and construction of Uyghur knowledge graph. In Proceedings of the 2017 IEEE International Conference on Computational Science and Engineering and IEEE/IFIP International Conference on Embedded and Ubiquitous Computing. Institute of Electrical and Electronics Engineers Inc., 894–897.
[63] Schrader Bess. 2020. What’s the Difference Between an Ontology and a Knowledge Graph? Enterprise Knowledge. Retrieved from https://enterprise-knowledge.com/whats-the-difference-between-an-ontology-and-a-knowledge-graph/.
[64] Schreiber Guus. 2000. Knowledge Engineering and Management: The CommonKADS Methodology. MIT Press, Cambridge, Massachusetts. 455 pages.
[65] Sharafeldeen Dina, Algergawy Alsayed, and Konig-Ries Birgitta. 2019. Towards knowledge graph construction using semantic data mining. In Proceedings of the 21st International Conference on Information Integration and Web-Based Applications & Services. Association for Computing Machinery, 323–329.
[66] Shekarpour Saeedeh, Saxena Ankita, Thirunarayan Krishnaprasad, Shalin Valerie L., and Sheth Amit. 2018. Principles for Developing a Knowledge Graph of Interlinked Events from News Headlines on Twitter. arXiv:1808.02022. Retrieved from https://arxiv.org/abs/1808.02022.
[67] Su Ying and Zhang Yong. 2020. Automatic construction of subject knowledge graph based on educational big data. In Proceedings of the 2020 the 3rd International Conference on Big Data and Education. Association for Computing Machinery, 30–36.
[68] Suchanek Fabian M., Kasneci Gjergji, and Weikum Gerhard. 2007. Yago: A core of semantic knowledge. In Proceedings of the 16th International Conference on World Wide Web. Association for Computing Machinery, New York, NY, 697–706.
[69] Sun K., Liu Yu-Hua, Guo Zongchao, and Wang C. 2016. Visualization for knowledge graph based on education data. International Journal of Software and Informatics 10, 3 (2016), 1–13.
[70] Tan Jiyuan, Qiu Qianqian, Guo Weiwei, and Li Tingshuai. 2021. Research on the construction of a knowledge graph and knowledge reasoning model in the field of urban traffic. Sustainability 13, 6 (Mar. 2021), 3191.
[71] Vrandečić Denny and Krötzsch Markus. 2014. Wikidata: A free collaborative knowledgebase. Communications of the ACM 57, 10 (Sept. 2014), 78–85.
[72] (W3C) World Wide Web Consortium. 2012. OWL 2 Web Ontology Language Primer (Second Edition). Retrieved 15 April 2021 from https://www.w3.org/TR/owl2-primer/.
[73] (W3C) World Wide Web Consortium. 2014. RDF Schema 1.1. Retrieved from https://www.w3.org/TR/rdf-schema/.
[74] (W3C) World Wide Web Consortium. 2014. XML Schema. Retrieved from https://www.w3.org/2001/XMLSchema.
[75] Wang Chengbin, Ma Xiaogang, Chen Jianguo, and Chen Jingwen. 2018. Information extraction and knowledge graph construction from geoscience literature. Computers and Geosciences 112 (Mar. 2018), 112–120.
[76] Wang Meng, Wang Haofen, Qi Guilin, and Zheng Qiushuo. 2020. Richpedia: A large-scale, comprehensive multi-modal knowledge graph. Big Data Research 22 (Dec. 2020), 100159.
[77] Wang Peilu, Jiang Hao, Xu Jingfang, and Zhang Qi. 2019. Knowledge graph construction and applications for web search and beyond. Data Intelligence 1, 4 (Nov. 2019), 333–349.
[78] Wang Qingyun, Li Manling, Wang Xuan, Parulian Nikolaus, Han Guangxing, Ma Jiawei, Tu Jingxuan, Lin Ying, Zhang Haoran, Liu Weili, Chauhan Aabhas, Guan Yingjun, Li Bangzheng, Li Ruisong, Song Xiangchen, Ji Heng, Han Jiawei, Chang Shih-Fu, Pustejovsky James, Rah Jasmine, Liem David, Elsayed Ahmed, Palmer Martha, Voss Clare, Schneider Cynthia, and Onyshkevych Boyan. 2021. COVID-19 Literature Knowledge Graph Construction and Drug Repurposing Report Generation. arXiv:2007.00576. Retrieved from https://arxiv.org/abs/2007.00576.
[79] Wei Jingzhu and Liu Rui. 2019. A versatile approach for constructing a domain knowledge graph for culture. In Proceedings of the Association for Information Science and Technology, Vol. 56. John Wiley and Sons Inc, 808–809.
[80] Weikum Gerhard, Dong Luna, Razniewski Simon, and Suchanek Fabian. 2020. Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases. arXiv:2009.11564. Retrieved from https://arxiv.org/abs/2009.11564.
[81] Wick Marc. 2015. Geonames Ontology. Retrieved 15 April 2021 from https://www.geonames.org/ontology/documentation.html.
[82] Wu Tianxing, Qi Guilin, Li Cheng, and Wang Meng. 2018. A survey of techniques for constructing Chinese knowledge graphs and their applications. Sustainability 10, 9 (Sept. 2018), 3245.
[83] Wu Tianxing, Wang Haofen, Li Cheng, Qi Guilin, Niu Xing, Wang Meng, Li Lin, and Shi Chaomin. 2020. Knowledge graph construction from multiple online encyclopedias. Artificial Intelligence in Medicine 103 (Sept. 2020), 101817–101817.
[84] Xiao Dinghe, Wang Nannan, Yu Jiangang, Zhang Chunhong, and Wu Jiaqi. 2020. A practice of tourism knowledge graph construction based on heterogeneous information. In Proceedings of the 19th Chinese National Conference on Computational Linguistics. Springer, Cham, 159–173.
[85] Xin Hao, Meng Rui, and Chen Lei. 2018. Subjective knowledge base construction powered by crowdsourcing and knowledge base. In Proceedings of the ACM SIGMOD International Conference on Management of Data. Association for Computing Machinery, New York, NY, 1349–1361.
[86] Yan Hehua, Yang Jun, and Wan Jiafu. 2020. KnowIME: A system to construct a knowledge graph for intelligent manufacturing equipment. IEEE Access 8 (2020), 41805–41813.
[87] Yan Jihong, Wang Chengyu, Cheng Wenliang, Gao Ming, and Zhou Aoying. 2018. A retrospective of knowledge graphs. Frontiers of Computer Science 12, 1 (Feb. 2018), 55–74.
[88] Yu Haoze, Li Haisheng, Mao Dianhui, and Cai Qiang. 2020. A domain knowledge graph construction method based on Wikipedia. Journal of Information Science 47, 6 (2020), 1–11.
[89] Yu Seunghak, He Tianxing, and Glass James. 2020. AutoKG: Constructing Virtual Knowledge Graphs from Unstructured Documents for Question Answering. arXiv:2008.08995. Retrieved from https://arxiv.org/abs/2008.08995.
[90] Yu Xiaobing, Stahr Mike, Chen Han, and Yan Runming. 2021. Design and implementation of curriculum system based on knowledge graph. In Proceedings of the 2021 IEEE International Conference on Consumer Electronics and Computer Engineering. IEEE, Guangzhou, China, 767–770.
[91] Zhao Yuanyuan, Liu Quan, and Xu Wenjun. 2018. Open industrial knowledge graph development for intelligent manufacturing service matchmaking. In Proceedings of the 2017 International Conference on Industrial Informatics - Computing Technology, Intelligent Technology, Industrial Information Integration. Institute of Electrical and Electronics Engineers Inc., 194–198.
[92] Zhao Zhanfang, Han Sung-Kook, and So In-Mi. 2018. Architecture of knowledge graph construction techniques. International Journal of Pure and Applied Mathematics 118, 19 (2018), 1869–1883. Retrieved from https://acadpubl.eu/jsi/2018-118-19/articles/19b/24.pdf.
[93] Zhou Bin, Bao Jinsong, Li Jie, Lu Yuqian, Liu Tianyuan, and Zhang Qiwan. 2021. A novel knowledge graph-based optimization approach for resource allocation in discrete manufacturing workshops. Robotics and Computer-Integrated Manufacturing 71 (Oct. 2021), 102160.
[94] Zhou Yang, Qi Xingliang, Huang Yi, and Ju Fangning. 2019. Research on construction and application of TCM knowledge graph based on ancient Chinese texts. In Proceedings of the 2019 IEEE/WIC/ACM International Conference on Web Intelligence Workshops, WI 2019 Companion. Association for Computing Machinery, Inc, 144–147.

Published in
ACM Transactions on Software Engineering and Methodology Volume 32, Issue 1
January 2023
954 pages
ISSN:1049-331X
EISSN:1557-7392
DOI:10.1145/3572890
Editor:
Mauro Pezzè
USI Università della Svizzera italiana and SIT Schaffhausen Institute of Technology, Switzerland
Copyright © 2023 Copyright held by the owner/author(s).
This work is licensed under a Creative Commons Attribution International 4.0 License.
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 13 February 2023
- Online AM: 30 April 2022
- Accepted: 24 February 2022
- Revised: 4 February 2022
- Received: 3 August 2021
Author Tags
Knowledge graphs
knowledge graph construction
development process semantic network
information integration
Qualifiers
- survey
- Refereed