Data Exploration Using Example-Based Methods

  • Book
  • © 2019

Overview

Part of the book series: Synthesis Lectures on Data Management (SLDM)

Table of contents (8 chapters)

  1. Example-Based Approaches

  2. Open Research Directions

About this book

Data usually comes in a plethora of formats and dimensions, rendering the exploration and information extraction processes challenging. Thus, being able to perform exploratory analyses on the data with the intent of having an immediate glimpse of some of the data properties is becoming crucial. Exploratory analyses should be simple enough to avoid complicated declarative languages (such as SQL) and mechanisms, and at the same time retain the flexibility and expressiveness of such languages. Recently, we have witnessed a rediscovery of the so-called example-based methods, in which the user, or the analyst, circumvents query languages by using examples as input. An example is a representative of the intended results, or in other words, an item from the result set. Example-based methods exploit inherent characteristics of the data to infer the results that the user has in mind, but may not be able to (easily) express. They can be useful in cases where a user is looking for information in an unfamiliar dataset, when the task is particularly challenging like finding duplicate items, or simply when they are exploring the data. In this book, we present an excursus over the main methods for exploratory analysis, with a particular focus on example-based methods. We show that different data types require different techniques, and present algorithms that are specifically designed for relational, textual, and graph data. The book also presents the challenges and the new frontiers of machine learning in online settings, which recently attracted the attention of the database community. The lecture concludes with a vision for further research and applications in this area.

Authors and Affiliations

  • Aalborg University, Denmark

    Matteo Lissandrini

  • Aarhus University, Denmark

    Davide Mottin

  • Paris Descartes University, France

    Themis Palpanas

  • University of Trento, Italy

    Yannis Velegrakis

About the authors

Matteo Lissandrini is a postdoctoral researcher at Aalborg University. He received his Ph.D. in Computer Science at the University of Trento, Italy, where he was a member of the Data and Information Management (dbTrento) research group. He received his M.Sc. in Computer Science from the University of Trento, Italy, and his B.Sc. in Computer Science from the University of Verona, Italy. He has also spent time as a visitor at HP Labs, Palo Alto, California, at the Cheriton School of Computer Science at the University of Waterloo, Canada, and at the Laboratory for Foundations of Computer Science (LFCS) at the University of Edinburgh, United Kingdom. His scientific interests include novel query paradigms for large-scale data mining and information extraction with a focus on exploratory search on graph data. He published the first paper on Exemplar Query methods for Knowledge Graphs in VLDB and VLDBJ, and presented the application of such methods at SIGMOD 2014 and VLDB 2018.
Davide Mottin is a faculty member at Aarhus University with expertise in graph mining, exploratory methods, and user interaction. Previously, he was a postdoctoral researcher at Hasso Plattner Institute, leading the graph mining subgroup in the Knowledge Discovery and Data Mining group. He presented graph exploration tutorials at CIKM 2016, SIGMOD 2017, and KDD 2018. He also presented exploratory techniques at KDD 2015, VLDB 2014, and SIGMOD 2015 and is actively engaged in teaching database, big data analytics, and graph mining for Bachelor and Master courses. He is the proponent of the exemplar queries paradigm for exploratory analysis [Mottin et al., 2016]. He received his Ph.D. in 2015 from the University of Trento and his thesis was awarded as "The best of the department in 2015." He has also visited Yahoo! Labs in Barcelona and Microsoft Research Asia in Beijing.
Themis Palpanas is a Senior Member of the French University Institute (IUF), and Professor of computer science at the Paris Descartes University (France), where he is the director of diNo, the data management group. He received his B.Sc. from the National Technical University of Athens, Greece, and his M.Sc. and Ph.D. from the University of Toronto, Canada. He has previously held positions at the IBM T.J. Watson Research Center (U.S.), and the University of Trento (Italy). He has also worked for the University of California at Riverside, and visited Microsoft Research (U.S.) and the IBM Almaden Research Center (U.S.). His research interests include problems related to online and offline data management and data analytics, focusing on exploratory search using knowledge graphs, entity resolution on very large and heterogeneous data, and data series similarity search and analytics. He is the author of nine U.S. patents, three of which have been implemented in world-leading commercial data management products. He is the recipient of three Best Paper awards, and the IBM Shared University Research (SUR) Award. He is serving as Editor in Chief for BDR Journal, Associate Editor for PVLDB 2019 and TKDE journal, and Editorial Advisory Board member for IS journal. He has also served (among others) as General Chair for VLDB 2013, Associate Editor for PVLDB 2017, and Workshop Chair for EDBT 2016.
Yannis Velegrakis is a faculty member in the Department of Information Engineering and Computer Science of the University of Trento, director of the Data Management Group, head of the Data and Knowledge Management Research Program, and coordinator of its EIT Digital Master. His research area of expertise includes Big Data Management, Analytics and Exploration, Knowledge Discovery, Highly Heterogeneous Information Integration, User-centred Querying, Personalisation, Recommendation, Graph Management, and Data Quality. Before joining the University of Trento, he was a researcher at the AT&T Research Labs. He has spent time as a visitor at the IBM Almaden Research Center (U.S.), the IBM Toronto Lab (Canada), the University of California, Santa Cruz (U.S.), the University of Paris-Saclay (France), and the Huawei Research Center (Germany). His work has also been recognized through a Marie Curie and a Université Paris-Saclay Jean d'Alembert fellowship. He has been an active member of the database community. He has served as the General Chair for VLDB'13, PC Area chair for ICDE'18, and PC chair in a number of other conferences/workshops.
