Abstract
With the advance of the Semantic Web, the need to integrate and combine data from different sources has increased considerably. Many efforts have focused on RDF document matching, but their treatment of datatype similarity is limited. This paper addresses datatype similarity for the Semantic Web as a first step towards better RDF document matching. We propose a datatype hierarchy, based on W3C's XSD datatype hierarchy, that better captures the subsumption relationships among primitive and derived datatypes. We also propose a new datatype similarity measure that takes into account several aspects of the new hierarchical relations between the compared datatypes. Our experiments show that the new similarity measure, combined with the new hierarchy, produces results closer to a human expert's judgment of the similarity of the compared datatypes than the measures described in the literature.
Notes
1. Internationalized Resource Identifier: an extension of URIs that allows characters from the Unicode character set.
2. A range (rdfs:range) defines the object type that is associated with a property.
3. Constraining facets are sets of aspects that can be used to constrain the values of simple types (https://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#rf-facets).
4. An attribute is the minimum classification of data, which does not subsume another one. For example, datatype date has the attributes year, month, and day.
5. It is the most specific common ancestor of two concepts/nodes, found in a given taxonomy/hierarchy.
6. We show the results according to the measure proposed in [15]; all other works in Group 4 propose similar measures.
7. Results are available at http://cloud.sigappfr.org/index.php/s/yRRbUQUeHs0NJnW.
8. By experimentation, we determined this value as the optimal one.
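The notion of a most specific common ancestor used in the notes can be illustrated with a minimal sketch. It assumes a small fragment of the standard W3C XSD built-in datatype hierarchy, not the extended hierarchy proposed in the paper, and the names `PARENT`, `path_to_root`, and `most_specific_common_ancestor` are hypothetical helpers introduced only for this illustration. A node is treated here as its own ancestor.

```python
# Fragment of the W3C XSD built-in datatype hierarchy (child -> parent).
# Assumption: this is the standard W3C derivation chain, not the paper's
# proposed hierarchy, and it covers only a handful of datatypes.
PARENT = {
    "decimal": "anySimpleType",
    "string": "anySimpleType",
    "integer": "decimal",
    "long": "integer",
    "int": "long",
    "short": "int",
    "nonNegativeInteger": "integer",
    "unsignedLong": "nonNegativeInteger",
    "normalizedString": "string",
    "token": "normalizedString",
}


def path_to_root(datatype):
    """Return the datatype followed by its ancestors, ending at the root."""
    path = [datatype]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def most_specific_common_ancestor(a, b):
    """Return the deepest node that lies on both paths to the root."""
    ancestors_of_a = set(path_to_root(a))
    # Walk upwards from b; the first shared node is the most specific one.
    for node in path_to_root(b):
        if node in ancestors_of_a:
            return node
    return None


print(most_specific_common_ancestor("short", "unsignedLong"))
print(most_specific_common_ancestor("int", "token"))
```

Hierarchy-based similarity measures typically combine the position of this common ancestor with the distances from each datatype to it; how the paper weights those aspects is not stated in this excerpt.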
References
RDF 1.1 Semantics, W3C Recommendation 25 February 2014. https://www.w3.org/TR/rdf11-mt/#literals-and-datatypes
XML Schema Datatypes in RDF and OWL, W3C Working Group Note 14 March 2006. https://www.w3.org/TR/swbp-xsch-datatypes/#sec-values
Al-Bakri, M., Fairbairn, D.: Assessing similarity matching for possible integration of feature classifications of geospatial data from official and informal sources. Int. J. Geogr. Inf. Sci. 26(8), 1437–1456 (2012)
Algergawy, A., Nayak, R., Saake, G.: XML schema element similarity measures: a schema matching context. In: Meersman, R., Dillon, T., Herrero, P. (eds.) OTM 2009. LNCS, vol. 5871, pp. 1246–1253. Springer, Heidelberg (2009). doi:10.1007/978-3-642-05151-7_36
Algergawy, A., Nayak, R., Saake, G.: Element similarity measures in XML schema matching. Inf. Sci. 180(24), 4975–4998 (2010)
Algergawy, A., Schallehn, E., Saake, G.: A sequence-based ontology matching approach. In: Proceedings of European Conference on Artificial Intelligence, Workshop on Contexts and Ontologies, pp. 26–30 (2008)
Algergawy, A., Schallehn, E., Saake, G.: Improving XML schema matching performance using prufer sequences. Data Knowl. Eng. 68(8), 728–747 (2009)
Amarintrarak, N., Runapongsa, S., Tongsima, S., Wiwatwattana, N.: SAXM: semi-automatic xml schema mapping. In: Proceedings of International Technical Conference on Circuits/Systems, Computers and Communications, pp. 374–377 (2009)
Bernstein, P.A., Madhavan, J., Rahm, E.: Generic schema matching with cupid. Technical report MSR-TR-2001-58, pp. 1–14. Microsoft Research (2001)
Cruz, I.F., Antonelli, F.P., Stroe, C.: Agreementmaker: efficient matching for large real-world schemas and ontologies. Proc. VLDB 2(2), 1586–1589 (2009)
Do, H.-H., Rahm, E.: Coma: a system for flexible combination of schema matching approaches. In: Proceedings of VLDB, pp. 610–621 (2002)
Eidoon, Z., Yazdani, N., Oroumchian, F.: Ontology matching using vector space. In: Macdonald, C., Ounis, I., Plachouras, V., Ruthven, I., White, R.W. (eds.) ECIR 2008. LNCS, vol. 4956, pp. 472–481. Springer, Heidelberg (2008). doi:10.1007/978-3-540-78646-7_45
Euzenat, J., Shvaiko, P. (eds.): Ontology Matching, vol. 18. Springer-Verlag New York Inc., New York (2007)
Hanif, M.S., Aono, M.: An efficient and scalable algorithm for segmented alignment of ontologies of arbitrary size. J. Web Semant. 7(4), 344–356 (2009)
Hong-Minh, T., Smith, D.: Hierarchical approach for datatype matching in xml schemas. In: 24th British National Conference on Databases, pp. 120–129 (2007)
Hu, W., Qu, Y., Cheng, G.: Matching large ontologies: a divide-and-conquer approach. Data Knowl. Eng. 67(1), 140–160 (2008)
Jean-Mary, Y.R., Shironoshita, E.P., Kabuka, M.R.: Ontology matching with semantic verification. Web Semant. 7(3), 235–251 (2009)
Jiang, J.J., Conrath, D.W.: Semantic similarity based on corpus statistics and lexical taxonomy. In: Proceedings of Conference on Research in Computational Linguistics, pp. 1–15 (1997)
Jiang, S., Lowd, D., Dou, D.: Ontology matching with knowledge rules. CoRR, abs/1507.03097 (2015)
Lambrix, P., Tan, H.: SAMBO - a system for aligning and merging biomedical ontologies. Web Semant. 4(3), 196–206 (2006)
Li, J., Tang, J., Li, Y., Luo, Q.: RiMOM: a dynamic multistrategy ontology alignment framework. Trans. Knowl. Data Eng. 21(8), 1218–1232 (2009)
Mukkala, L., Arvo, J., Lehtonen, T., Knuutila, T., et al.: Current state of ontology matching. A survey of ontology and schema matching. Technical report 4, University of Turku, pp. 1–18 (2015)
Nayak, R., Tran, T.: A progressive clustering algorithm to group the XML data by structural and semantic similarity. Int. J. Pattern Recogn. Artif. Intell. 21(04), 723–743 (2007)
Nayak, R., Xia, F.B.: Automatic integration of heterogenous XML-schemas. In: Proceedings of Information Integration and Web Based Applications & Services, pp. 1–10 (2004)
Ngo, D., Bellahsene, Z.: Overview of YAM++(not) yet another matcher for ontology alignment task. Web Semant.: Sci. Serv. Agents WWW 41, 30–49 (2016)
Stoilos, G., Stamou, G., Kollias, S.: A string metric for ontology alignment. In: Proceedings of International Conference on the SW, pp. 624–637 (2005)
Thang, H.Q., Nam, V.S.: Xml schema automatic matching solution. Comput. Electr. Autom. Control Inf. Eng. 4(3), 456–462 (2010)
Thuy, P.T., Lee, Y.-K., Lee, S.: Semantic and structural similarities between XML schemas for integration of ubiquitous healthcare data. Pers. Ubiquitous Comput. 17(7), 1331–1339 (2013)
Acknowledgments
This work has been partly supported by FINCyT/ INOVATE PERU - Convenio No. 104-FINCyT-BDE-2014.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Dongo, I., Al Khalil, F., Chbeir, R., Cardinale, Y. (2017). Semantic Web Datatype Similarity: Towards Better RDF Document Matching. In: Benslimane, D., Damiani, E., Grosky, W., Hameurlain, A., Sheth, A., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2017. Lecture Notes in Computer Science, vol 10438. Springer, Cham. https://doi.org/10.1007/978-3-319-64468-4_15
DOI: https://doi.org/10.1007/978-3-319-64468-4_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-64467-7
Online ISBN: 978-3-319-64468-4
eBook Packages: Computer Science, Computer Science (R0)