Abstract
One outcome of the research on data integration in recent years is a clear architecture comprising a global schema, a source schema, and a mapping between the two. In this chapter, we study data integration within this framework when the global schema is specified in OWL, the standard language for the Semantic Web, and we discuss the impact of this choice on the computational complexity of query answering under different instantiations of the framework, which vary in the query language and in the form and interpretation of the mapping. We show that query answering in the resulting setting is computationally too complex, and discuss in detail the various sources of complexity. We then show how to limit the expressive power of the various components of the framework so that query answering becomes efficient, in principle as efficient as query processing in relational DBMSs. In particular, we adopt OWL 2 QL as the ontology language used to express the global schema. OWL 2 QL is one of the tractable profiles of OWL 2 and essentially corresponds to a member of the DL-Lite family, a family of Description Logics designed to offer a good trade-off between the expressive power of the language and the computational complexity of reasoning.
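To make the three-part architecture concrete, the following is a minimal sketch, not the chapter's algorithm. It assumes a toy global schema containing only inclusions between named classes (a small fragment of what OWL 2 QL allows) and a mapping that associates each global class with source relations. All names (`SOURCES`, `MAPPING`, `INCLUSIONS`, `rewrite`, `answer`) and all data are hypothetical. The sketch illustrates the general idea of answering a query over the global schema by first rewriting it with the schema's axioms and then evaluating the result over the sources.

```python
# Source schema: hypothetical source relations with illustrative rows.
SOURCES = {
    "emp_db": [("ann",), ("bob",)],
    "mgr_db": [("carla",)],
}

# Mapping: each global class is populated from one or more source relations
# (an assumed, simplified mapping form chosen for this sketch).
MAPPING = {
    "Employee": ["emp_db"],
    "Manager": ["mgr_db"],
}

# Global schema: class inclusions (sub, sup), read as "every sub is a sup".
INCLUSIONS = [("Manager", "Employee")]


def rewrite(concept):
    """Return the set of classes whose instances are implied to be in `concept`."""
    result = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for sub, sup in INCLUSIONS:
            if sup == current and sub not in result:
                result.add(sub)
                frontier.append(sub)
    return result


def answer(concept):
    """Evaluate the rewritten query (a union of classes) over the sources."""
    answers = set()
    for cls in rewrite(concept):
        for relation in MAPPING.get(cls, []):
            answers.update(row[0] for row in SOURCES[relation])
    return answers


if __name__ == "__main__":
    print(sorted(answer("Employee")))
```

In this toy setting, asking for `Employee` also retrieves the individuals stored only as managers, because the inclusion is compiled into the query before the sources are accessed. The rewriting step depends only on the global schema, not on the size of the data.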
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Rosati, R., Ruzzi, M. (2010). Using OWL in Data Integration. In: de Virgilio, R., Giunchiglia, F., Tanca, L. (eds) Semantic Web Information Management. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04329-1_17
DOI: https://doi.org/10.1007/978-3-642-04329-1_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04328-4
Online ISBN: 978-3-642-04329-1
eBook Packages: Computer Science, Computer Science (R0)