Semantic Web Datatype Similarity: Towards Better RDF Document Matching

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10438)

Abstract

With the advance of the Semantic Web, the need to integrate and combine data from different sources has increased considerably. Many efforts have focused on RDF document matching; however, they treat datatype similarity only in a limited way. This paper addresses datatype similarity for the Semantic Web as a first step towards better RDF document matching. We propose a datatype hierarchy, based on W3C’s XSD datatype hierarchy, that better captures the subsumption relationships among primitive and derived datatypes. We also propose a new datatype similarity measure that takes into account several aspects of the new hierarchical relations between the compared datatypes. Our experiments show that the new similarity measure, together with the new hierarchy, produces results closer to a human expert’s judgment of datatype similarity than the measures described in the literature.

Notes

  1. Internationalized Resource Identifier. An extension of URIs that allows characters from the Unicode character set.

  2. A range (rdfs:range) defines the object type that is associated with a property.

  3. Constraining facets are sets of aspects that can be used to constrain the values of simple types (https://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#rf-facets).

  4. An attribute is the minimum classification of data, which does not subsume another one. For example, datatype date has the attributes year, month, and day.

  5. It is the most specific common ancestor of two concepts/nodes, found in a given taxonomy/hierarchy.

  6. We show the results according to the measure proposed in [15]; all other works in Group 4 propose similar measures.

  7. Results are available at http://cloud.sigappfr.org/index.php/s/yRRbUQUeHs0NJnW.

  8. By experimentation, we determined this value as the optimal one.
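The "most specific common ancestor" described in note 5 (commonly called the lowest common ancestor) can be illustrated with a minimal sketch. The `PARENT` table below is an assumption made for illustration: it encodes only a small fragment of the W3C XSD 1.0 built-in datatype hierarchy, not the extended hierarchy proposed in the paper, and the function names are hypothetical.

```python
# Fragment of the W3C XSD 1.0 built-in datatype hierarchy (child -> parent).
# Illustrative assumption: this is NOT the hierarchy proposed in the paper.
PARENT = {
    "anySimpleType": None,
    "string": "anySimpleType",
    "normalizedString": "string",
    "token": "normalizedString",
    "decimal": "anySimpleType",
    "integer": "decimal",
    "long": "integer",
    "int": "long",
    "short": "int",
    "nonNegativeInteger": "integer",
    "unsignedLong": "nonNegativeInteger",
    "float": "anySimpleType",
    "double": "anySimpleType",
}


def ancestors(node):
    """Return the path from node up to the root, starting with node itself."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def lowest_common_ancestor(a, b):
    """Return the most specific node that subsumes both a and b."""
    seen = set(ancestors(a))
    # Walk upward from b; the first node also above a is the deepest one.
    for node in ancestors(b):
        if node in seen:
            return node
    raise ValueError("datatypes do not share a root")


if __name__ == "__main__":
    print(lowest_common_ancestor("short", "unsignedLong"))
    print(lowest_common_ancestor("float", "double"))
```

In this fragment, short and unsignedLong meet at integer, while float and double meet only at the root anySimpleType, since both are primitive types in the W3C hierarchy.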

References

  1. RDF 1.1 Semantics, W3C Recommendation 25 February 2014. https://www.w3.org/TR/rdf11-mt/#literals-and-datatypes

  2. XML Schema Datatypes in RDF and OWL, W3C Working Group Note 14 March 2006. https://www.w3.org/TR/swbp-xsch-datatypes/#sec-values

  3. Al-Bakri, M., Fairbairn, D.: Assessing similarity matching for possible integration of feature classifications of geospatial data from official and informal sources. Int. J. Geogr. Inf. Sci. 26(8), 1437–1456 (2012)

  4. Algergawy, A., Nayak, R., Saake, G.: XML schema element similarity measures: a schema matching context. In: Meersman, R., Dillon, T., Herrero, P. (eds.) OTM 2009. LNCS, vol. 5871, pp. 1246–1253. Springer, Heidelberg (2009). doi:10.1007/978-3-642-05151-7_36

  5. Algergawy, A., Nayak, R., Saake, G.: Element similarity measures in XML schema matching. Inf. Sci. 180(24), 4975–4998 (2010)

  6. Algergawy, A., Schallehn, E., Saake, G.: A sequence-based ontology matching approach. In: Proceedings of European Conference on Artificial Intelligence, Workshop on Contexts and Ontologies, pp. 26–30 (2008)

  7. Algergawy, A., Schallehn, E., Saake, G.: Improving XML schema matching performance using Prüfer sequences. Data Knowl. Eng. 68(8), 728–747 (2009)

  8. Amarintrarak, N., Runapongsa, S., Tongsima, S., Wiwatwattana, N.: SAXM: semi-automatic XML schema mapping. In: Proceedings of International Technical Conference on Circuits/Systems, Computers and Communications, pp. 374–377 (2009)

  9. Bernstein, P.A., Madhavan, J., Rahm, E.: Generic schema matching with Cupid. Technical report MSR-TR-2001-58, pp. 1–14. Microsoft Research (2001)

  10. Cruz, I.F., Antonelli, F.P., Stroe, C.: AgreementMaker: efficient matching for large real-world schemas and ontologies. Proc. VLDB 2(2), 1586–1589 (2009)

  11. Do, H.-H., Rahm, E.: COMA: a system for flexible combination of schema matching approaches. In: Proceedings of VLDB, pp. 610–621 (2002)

  12. Eidoon, Z., Yazdani, N., Oroumchian, F.: Ontology matching using vector space. In: Macdonald, C., Ounis, I., Plachouras, V., Ruthven, I., White, R.W. (eds.) ECIR 2008. LNCS, vol. 4956, pp. 472–481. Springer, Heidelberg (2008). doi:10.1007/978-3-540-78646-7_45

  13. Euzenat, J., Shvaiko, P. (eds.): Ontology Matching, vol. 18. Springer-Verlag New York Inc., New York (2007)

  14. Hanif, M.S., Aono, M.: An efficient and scalable algorithm for segmented alignment of ontologies of arbitrary size. J. Web Semant. 7(4), 344–356 (2009)

  15. Hong-Minh, T., Smith, D.: Hierarchical approach for datatype matching in XML schemas. In: 24th British National Conference on Databases, pp. 120–129 (2007)

  16. Hu, W., Qu, Y., Cheng, G.: Matching large ontologies: a divide-and-conquer approach. Data Knowl. Eng. 67(1), 140–160 (2008)

  17. Jean-Mary, Y.R., Shironoshita, E.P., Kabuka, M.R.: Ontology matching with semantic verification. Web Semant. 7(3), 235–251 (2009)

  18. Jiang, J.J., Conrath, D.W.: Semantic similarity based on corpus statistics and lexical taxonomy. In: Proceedings of Conference on Research in Computational Linguistics, pp. 1–15 (1997)

  19. Jiang, S., Lowd, D., Dou, D.: Ontology matching with knowledge rules. CoRR, abs/1507.03097 (2015)

  20. Lambrix, P., Tan, H.: Sambo-a system for aligning and merging biomedical ontologies. Web Semant. 4(3), 196–206 (2006)

  21. Li, J., Tang, J., Li, Y., Luo, Q.: RiMOM: a dynamic multistrategy ontology alignment framework. Trans. Knowl. Data Eng. 21(8), 1218–1232 (2009)

  22. Mukkala, L., Arvo, J., Lehtonen, T., Knuutila, T., et al.: Current state of ontology matching. A survey of ontology and schema matching. Technical report 4, University of Turku, pp. 1–18 (2015)

  23. Nayak, R., Tran, T.: A progressive clustering algorithm to group the XML data by structural and semantic similarity. Int. J. Pattern Recogn. Artif. Intell. 21(04), 723–743 (2007)

  24. Nayak, R., Xia, F.B.: Automatic integration of heterogenous XML-schemas. In: Proceedings of Information Integration and Web Based Applications & Services, pp. 1–10 (2004)

  25. Ngo, D., Bellahsene, Z.: Overview of YAM++ (not) yet another matcher for ontology alignment task. Web Semant.: Sci. Serv. Agents WWW 41, 30–49 (2016)

  26. Stoilos, G., Stamou, G., Kollias, S.: A string metric for ontology alignment. In: Proceedings of International Conference on the SW, pp. 624–637 (2005)

  27. Thang, H.Q., Nam, V.S.: XML schema automatic matching solution. Comput. Electr. Autom. Control Inf. Eng. 4(3), 456–462 (2010)

  28. Thuy, P.T., Lee, Y.-K., Lee, S.: Semantic and structural similarities between XML schemas for integration of ubiquitous healthcare data. Pers. Ubiquitous Comput. 17(7), 1331–1339 (2013)

Acknowledgments

This work has been partly supported by FINCyT/ INOVATE PERU - Convenio No. 104-FINCyT-BDE-2014.

Author information

Corresponding author

Correspondence to Irvin Dongo.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Dongo, I., Al Khalil, F., Chbeir, R., Cardinale, Y. (2017). Semantic Web Datatype Similarity: Towards Better RDF Document Matching. In: Benslimane, D., Damiani, E., Grosky, W., Hameurlain, A., Sheth, A., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2017. Lecture Notes in Computer Science, vol 10438. Springer, Cham. https://doi.org/10.1007/978-3-319-64468-4_15

  • DOI: https://doi.org/10.1007/978-3-319-64468-4_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-64467-7

  • Online ISBN: 978-3-319-64468-4

  • eBook Packages: Computer Science, Computer Science (R0)
