The Second Open Knowledge Extraction Challenge

Nuzzolese, Andrea Giovanni; Gentile, Anna Lisa; Presutti, Valentina; Gangemi, Aldo; Meusel, Robert; Paulheim, Heiko

doi:10.1007/978-3-319-46565-4_1

Part of the book series: Communications in Computer and Information Science (CCIS, volume 641)

Included in the following conference series:

Semantic Web Evaluation Challenge

Abstract

The Open Knowledge Extraction (OKE) challenge, now in its second edition, aims to provide a reference framework for research on Knowledge Extraction from text for the Semantic Web (SW) by redefining a number of tasks (typically drawn from information and knowledge extraction) so that they take specific SW requirements into account. The OKE challenge defines two tasks: (1) Entity Recognition, Linking and Typing for Knowledge Base population; (2) Class Induction and entity typing for Vocabulary and Knowledge Base enrichment. Task 1 consists of identifying the entities in a sentence, creating an OWL individual to represent each of them, linking that individual to a reference KB (DBpedia) when possible, and assigning it a type. Task 2 consists of producing rdf:type statements from definition texts: the participants are given a dataset of sentences, each defining an entity (known a priori). The following systems participated in the challenge: WestLab in both Task 1 and Task 2, ADEL in Task 1 only, and Mannheim in Task 2 only. In this paper we describe the OKE challenge, the tasks, the datasets used for training and evaluating the systems, the evaluation method, and the results obtained.
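As a rough illustration of what the two tasks ask a system to produce, the following Python sketch builds the corresponding triples as plain tuples. It is a minimal sketch under stated assumptions, not the challenge's official annotation format: the nif:, itsrdf:, dul: and dbpedia: namespaces are the ones listed in the notes of this chapter, while the `http://example.org/oke/` namespace, the function names, the example sentence, and the use of rdfs:subClassOf to align an induced class are hypothetical choices made for this example.

```python
# Namespaces for nif:, itsrdf:, dul: and dbpedia: come from the chapter's notes;
# rdf:, rdfs: and owl: are the standard W3C namespaces.
PREFIXES = {
    "nif": "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#",
    "itsrdf": "http://www.w3.org/2005/11/its/rdf#",
    "dul": "http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#",
    "dbpedia": "http://dbpedia.org/resource/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}

# Hypothetical namespace for resources minted by the system (an assumption).
EX = "http://example.org/oke/"


def iri(curie):
    """Expand a prefixed name such as 'dul:Person' into a full IRI."""
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local


def task1_triples(sentence_id, sentence, mention, dul_class, dbpedia_name=None):
    """Task 1 sketch: recognise a mention, mint an individual, type it, link it."""
    begin = sentence.index(mention)  # raises ValueError if the mention is absent
    end = begin + len(mention)
    span = f"{EX}{sentence_id}#char={begin},{end}"
    individual = EX + mention.replace(" ", "_")
    triples = [
        (span, iri("nif:anchorOf"), mention),
        (span, iri("nif:beginIndex"), begin),
        (span, iri("nif:endIndex"), end),
        (span, iri("itsrdf:taIdentRef"), individual),
        (individual, iri("rdf:type"), iri("owl:NamedIndividual")),
        (individual, iri("rdf:type"), iri("dul:" + dul_class)),
    ]
    if dbpedia_name is not None:
        # Link to the reference KB only when a matching resource is known.
        triples.append((individual, iri("owl:sameAs"), iri("dbpedia:" + dbpedia_name)))
    return triples


def task2_triples(entity_iri, class_label, dul_class):
    """Task 2 sketch: type an entity with a class induced from its definition."""
    induced = EX + class_label.title().replace(" ", "")
    return [
        (entity_iri, iri("rdf:type"), induced),
        # Aligning the induced class to DUL via rdfs:subClassOf is an assumption.
        (induced, iri("rdfs:subClassOf"), iri("dul:" + dul_class)),
    ]


if __name__ == "__main__":
    text = "Ada Lovelace was born in London."  # hypothetical example sentence
    for triple in task1_triples("sentence-1", text, "Ada Lovelace", "Person", "Ada_Lovelace"):
        print(triple)
    for triple in task2_triples(EX + "Ada_Lovelace", "english mathematician", "Person"):
        print(triple)
```

Representing triples as tuples keeps the sketch free of third-party dependencies; a real submission would more likely serialise them with an RDF library into Turtle, the format of the published datasets.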

Notes

1. http://www.itl.nist.gov/iad/mig/publications/proceedings/darpa99/html/ie5/ie5.htm.
2. http://www.itl.nist.gov/iaui/894.02/related_projects/muc/proceedings/muc_7_proceedings/overview.html.
3. https://www.ldc.upenn.edu/collaborations/past-projects/ace/annotation-tasks-and-specifications.
4. http://www.nist.gov/tac/tracks/index.html.
5. http://www.nist.gov/tac/2015/KBP.
6. http://trec-kba.org/.
7. http://stlab.istc.cnr.it/stlab/WikipediaOntology/.
8. The prefix dul: stands for the namespace http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#.
9. http://persistence.uni-leipzig.org/nlp2rdf/.
10. The prefixes nif:, itsrdf:, dul:, and dbpedia: identify the namespaces http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#, http://www.w3.org/2005/11/its/rdf#, http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#, and http://dbpedia.org/resource/ respectively.
11. Prefixes d0: and dul: stand for namespaces http://ontologydesignpatterns.org/ont/wikipedia/d0.owl# and http://www.ontologydesignpatterns.org/ont/dul/DUL.owl# respectively.
12. A preview of the job can be found at https://tasks.crowdflower.com/channels/cf_internal/jobs/913913/editor_preview.
13. https://crowdflower.com.
14. The training dataset is available at https://github.com/anuzzolese/oke-challenge-2016/blob/master/GoldStandard_sampleData/task1/dataset_task_1.ttl. Similarly, the evaluation dataset is available at https://github.com/anuzzolese/oke-challenge-2016/blob/master/evaluation-data/task1/evaluation-dataset-task1.ttl.
15. The training dataset is available at https://github.com/anuzzolese/oke-challenge-2016/blob/master/GoldStandard_sampleData/task2/dataset_task_2.ttl. Similarly, the evaluation dataset is available at https://github.com/anuzzolese/oke-challenge-2016/blob/master/evaluation-data/task2/evaluation-dataset-task2.ttl.
16. https://github.com/anuzzolese/oke-challenge-2016.

Author information

Authors and Affiliations

Semantic Technology Laboratory, ISTC-CNR, Rome, Italy
Andrea Giovanni Nuzzolese, Valentina Presutti & Aldo Gangemi
Data and Web Science Group, University of Mannheim, Mannheim, Germany
Anna Lisa Gentile, Robert Meusel & Heiko Paulheim
LIPN, UMR CNRS, Université Paris 13, Sorbonne Cité, Paris, France
Aldo Gangemi

Corresponding author

Correspondence to Andrea Giovanni Nuzzolese.

Editor information

Editors and Affiliations

IT Systems Engineering, Hasso-Plattner Institute, Potsdam, Germany
Harald Sack
Leibniz Universität Hannover, Hannover, Germany
Stefan Dietze
Elsevier B.V., Amsterdam, The Netherlands
Anna Tordai
Universität Bonn, Bonn, Germany
Christoph Lange

About this paper

Cite this paper

Nuzzolese, A.G., Gentile, A.L., Presutti, V., Gangemi, A., Meusel, R., Paulheim, H. (2016). The Second Open Knowledge Extraction Challenge. In: Sack, H., Dietze, S., Tordai, A., Lange, C. (eds) Semantic Web Challenges. SemWebEval 2016. Communications in Computer and Information Science, vol 641. Springer, Cham. https://doi.org/10.1007/978-3-319-46565-4_1

DOI: https://doi.org/10.1007/978-3-319-46565-4_1
Published: 09 October 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46564-7
Online ISBN: 978-3-319-46565-4
eBook Packages: Computer Science, Computer Science (R0)
