Synonyms
Definitions
The goal of data integration systems is to provide uniform access to a set of heterogeneous data sources. These sources can differ in their data model (relational, hierarchical, semi-structured), in their schema, or in their query-processing capabilities. In a data integration architecture, the sources are queried through a global schema, also called a mediated schema, which provides a virtual view of the underlying sources.
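As a rough illustration of querying heterogeneous sources through a mediated schema, the following minimal sketch exposes two sources with different shapes as one virtual view. Everything in it is an assumption made for the example and not part of the entry: the sport-event data, the field names, the wrapper functions, and the helpers `query_global` and `events_on`. Real systems express such mappings declaratively and rewrite queries rather than scanning every source.

```python
from typing import Callable, Dict, Iterator, List

# Hypothetical mediated (global) schema: every event is exposed as
# {"name": str, "sport": str, "date": "YYYY-MM-DD"}.
GlobalEvent = Dict[str, str]

# Source A (assumed): flat, relational-style rows with their own column names.
SOURCE_A = [
    {"title": "City Derby", "discipline": "football", "day": "2018-03-15"},
    {"title": "Open Final", "discipline": "tennis", "day": "2018-03-16"},
]

# Source B (assumed): semi-structured, nested records grouped by date.
SOURCE_B = {
    "2018-03-15": [{"event": "Spring Marathon", "type": "running"}],
    "2018-03-17": [{"event": "Indoor Cup", "type": "volleyball"}],
}


def wrap_source_a() -> Iterator[GlobalEvent]:
    """Map each row of source A onto the mediated schema."""
    for row in SOURCE_A:
        yield {"name": row["title"], "sport": row["discipline"], "date": row["day"]}


def wrap_source_b() -> Iterator[GlobalEvent]:
    """Flatten the nested records of source B onto the mediated schema."""
    for date, events in SOURCE_B.items():
        for entry in events:
            yield {"name": entry["event"], "sport": entry["type"], "date": date}


# The view is virtual: nothing is copied ahead of time, the wrappers are
# evaluated only when a query arrives.
WRAPPERS: List[Callable[[], Iterator[GlobalEvent]]] = [wrap_source_a, wrap_source_b]


def query_global(predicate: Callable[[GlobalEvent], bool]) -> List[GlobalEvent]:
    """Answer a query posed on the mediated schema over all sources."""
    results: List[GlobalEvent] = []
    for wrapper in WRAPPERS:
        results.extend(event for event in wrapper() if predicate(event))
    return results


def events_on(date: str) -> List[GlobalEvent]:
    """Return all events planned for the given day, whatever their source."""
    return query_global(lambda event: event["date"] == date)
```

A caller writes `events_on("2018-03-15")` against the mediated schema only and never sees the column names or nesting used by the individual sources.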
Overview
Integrating data from different sources is a crucial step in many real-life applications, and the growth of structured data sources available on the Web makes this problem even more challenging. Consider, as an example, a Web application where users can query information about sport events planned for a particular day. In a traditional data management application, the information is stored in a database with a fixed schema (e.g., in a relational database management system) and retrieved by using a query....
Copyright information
© 2018 Springer International Publishing AG
About this entry
Cite this entry
Papotti, P., Santoro, D. (2018). Data Integration. In: Sakr, S., Zomaya, A. (eds) Encyclopedia of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-63962-8_6-1
DOI: https://doi.org/10.1007/978-3-319-63962-8_6-1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63962-8
Online ISBN: 978-3-319-63962-8
eBook Packages: Springer Reference Mathematics; Reference Module Computer Science and Engineering
Chapter history
Latest: Data Integration, published 16 March 2022. DOI: https://doi.org/10.1007/978-3-319-63962-8_6-2
Original: Data Integration, published 15 March 2018. DOI: https://doi.org/10.1007/978-3-319-63962-8_6-1