Abstract
Traditional semantic data models, such as the Entity Relationship (ER) data model, are used to represent real-world semantics that are crucial for the effective management of structured data. The semantics that can be expressed in the ER data model include entity types together with their identifiers and attributes, n-ary relationship types together with their participating entity types and attributes, and functional dependencies among the participating entity types of relationship types and their attributes.
Today, semistructured data has become prevalent on the Web, and XML has become the de facto standard for representing it. The DTD or XML Schema of an XML document reflects only the hierarchical structure of the semistructured data stored in that document. This hierarchical structure is captured by the relationships between an element and its attributes, and between an element and its subelements. Element-attribute relationships do not have clear semantics, and the relationships between elements and their subelements are binary. The semantics of n-ary relationships with n > 2 cannot be represented correctly and precisely in DTD or XML Schema. As a result, many of the crucial semantics captured by the ER model for structured data are captured by neither DTD nor XML Schema. We present the problems that arise when XML documents must be stored, queried, and transformed (viewed) correctly and efficiently without knowledge of these semantics. We solve these problems by using a semantically rich data model called the Object, Relationship, Attribute data model for SemiStructured Data (ORA-SS). We also briefly describe how to mine such semantics from given XML documents.
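The limitation on n-ary relationships can be illustrated with a minimal Python sketch, which is not taken from the paper. It assumes a toy document in which a price is nested under project, supplier, and part. The element names, the helper functions `parent_child_pairs` and `candidate_owners`, and the `relationship` dictionary are all hypothetical and introduced only for illustration; in particular, the dictionary is not ORA-SS notation.

```python
import xml.etree.ElementTree as ET

# Toy document (assumed for illustration): a price nested three levels deep.
DOC = """
<db>
  <project id="j1">
    <supplier id="s1">
      <part id="p1">
        <price>10</price>
      </part>
    </supplier>
  </project>
</db>
"""


def parent_child_pairs(root):
    """Collect (parent tag, child tag) pairs: the binary nesting a hierarchy records."""
    pairs = set()
    for parent in root.iter():
        for child in parent:
            pairs.add((parent.tag, child.tag))
    return pairs


def candidate_owners(path):
    """List every suffix of the ancestor path; each is a plausible owner of the leaf value."""
    return [tuple(path[i:]) for i in range(len(path))]


root = ET.fromstring(DOC)
pairs = parent_child_pairs(root)

# The nesting alone does not say which of these combinations determines the price.
owners = candidate_owners(["project", "supplier", "part"])

# Hypothetical explicit annotation that states the missing semantics:
# price belongs to a ternary relationship, not to part alone.
relationship = {
    "name": "jsp",
    "degree": 3,
    "participants": ("project", "supplier", "part"),
    "attributes": ("price",),
}

print(sorted(pairs))
print(owners)
```

The hierarchy yields only binary parent-child pairs, so the three readings listed in `owners` are indistinguishable from the structure alone. An explicit record such as `relationship` is one way to state which reading is intended.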
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Ling, T.W. (2006). Capturing Semantics in XML Documents. In: Nayak, R., Zaki, M.J. (eds) Knowledge Discovery from XML Documents. KDXD 2006. Lecture Notes in Computer Science, vol 3915. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11730262_2
DOI: https://doi.org/10.1007/11730262_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33180-3
Online ISBN: 978-3-540-33181-0
eBook Packages: Computer Science, Computer Science (R0)