Abstract
In the last two decades, data matching has been addressed for different purposes and in different application contexts, ranging from data integration, to ontology evolution, to semantic data clouding, up to the more recent exploratory analysis of large datasets and big data. This paper describes the evolution of the research activity on matching techniques for data integration and exploration at the ISLab group of the Università degli Studi di Milano. We analyze the matching techniques according to the structure of the target data, the algorithmic pattern of the matching process, and the application focus, and we discuss the results of using our techniques for the exploratory analysis of a real dataset composed of all the publications in the SEBD proceedings in the timeframe 1993–2016.
Notes
1. Data have been collected from the DBLP database (http://dblp.org), except for the year 2013, which is missing from DBLP. Data for 1993 have been collected from the Scopus database (https://www.scopus.com).
Copyright information
© 2018 Springer International Publishing AG
About this chapter
Cite this chapter
Castano, S., Ferrara, A., Montanelli, S. (2018). Matching Techniques for Data Integration and Exploration: From Databases to Big Data. In: Flesca, S., Greco, S., Masciari, E., Saccà, D. (eds) A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years. Studies in Big Data, vol 31. Springer, Cham. https://doi.org/10.1007/978-3-319-61893-7_4
DOI: https://doi.org/10.1007/978-3-319-61893-7_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-61892-0
Online ISBN: 978-3-319-61893-7
eBook Packages: Engineering, Engineering (R0)