Abstract
An ontology is a conceptual representation of a domain that results from a consensus within a community. One of its main applications is the integration of heterogeneous information sources available on the Web by means of the semantic annotation of web documents. This is the cornerstone of the emerging Semantic Web. However, most of the information on the Web currently consists of text documents with little or no structure, which makes their manual annotation impracticable. This paper addresses the problem of mapping text fragments onto a given ontology in order to generate ontology instances that semantically describe this kind of resource. By applying this mapping, we can automatically populate a Semantic Web consisting of text documents that relate to a specific ontology. We have evaluated our approach on a real-application ontology and a text collection, both in the Archeology domain. The results indicate that the method is both effective and useful.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Danger, R., Sanz, I., Berlanga-Llavori, R., Ruiz-Shulcloper, J. (2004). A Proposal for the Automatic Generation of Instances from Unstructured Text. In: Sanfeliu, A., Martínez Trinidad, J.F., Carrasco Ochoa, J.A. (eds) Progress in Pattern Recognition, Image Analysis and Applications. CIARP 2004. Lecture Notes in Computer Science, vol 3287. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30463-0_58
DOI: https://doi.org/10.1007/978-3-540-30463-0_58
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23527-9
Online ISBN: 978-3-540-30463-0
eBook Packages: Springer Book Archive