Abstract
The explosion of information on the Web creates a need to structure Web data so that it is accessible to both users and machines. E-commerce is one of the areas in which the growing volume of Web data has serious consequences. This paper proposes a framework that populates a product ontology with tabular product information from Web shops. Formalizing product information in this way makes it possible to build better product comparison and recommender applications on the Web. Our approach uses lexical and syntactic matching techniques to map properties and instantiate values. In our evaluation, instantiating TVs and MP3 players from two popular Web shops, Best Buy and Newegg.com, yields an F1 score of 95.07% for property mapping and 76.60% for value instantiation.
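To make the idea of lexical property mapping concrete, the following is a minimal sketch and not the framework described in the paper. It assumes a hypothetical list of ontology property labels, uses the standard-library `difflib.SequenceMatcher` ratio as a stand-in similarity measure, and picks an arbitrary threshold of 0.7. The names `ONTOLOGY_PROPERTIES`, `map_property`, and `f1_score` are illustrative only. The last function shows how an F1 score, the metric reported above, is computed from true positive, false positive, and false negative counts.

```python
from difflib import SequenceMatcher

# Hypothetical ontology property labels (assumption, not taken from the paper).
ONTOLOGY_PROPERTIES = ["screen size", "weight", "storage capacity", "refresh rate"]


def normalize(label: str) -> str:
    """Lowercase a label and collapse punctuation and whitespace to single spaces."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in label.lower())
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Lexical similarity in [0, 1] between two normalized labels."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def map_property(raw_key, properties=ONTOLOGY_PROPERTIES, threshold=0.7):
    """Return the best-matching ontology property, or None if below the threshold.

    The threshold value is an assumption chosen for illustration.
    """
    best = max(properties, key=lambda p: similarity(raw_key, p))
    return best if similarity(raw_key, best) >= threshold else None


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

For example, `map_property("Screen Size")` and `map_property("Weight:")` resolve to the corresponding hypothetical labels, while a key with no close lexical match, such as `"Warranty"`, is left unmapped. A real system would likely combine several similarity measures and handle units and value types separately.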
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Vandic, D., Nederstigt, L.J., Aanen, S.S. (2014). Ontology Population from Web Product Information. In: Indulska, M., Purao, S. (eds) Advances in Conceptual Modeling. ER 2014. Lecture Notes in Computer Science, vol 8823. Springer, Cham. https://doi.org/10.1007/978-3-319-12256-4_28
DOI: https://doi.org/10.1007/978-3-319-12256-4_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12255-7
Online ISBN: 978-3-319-12256-4
eBook Packages: Computer Science (R0)