Abstract
An inconsistent database is a database that violates one or more integrity constraints. A typical approach for answering a query over an inconsistent database is to first clean the inconsistent database by transforming it to a consistent one and then apply the query to the consistent database. An alternative and more principled approach, known as consistent query answering, derives the answers to a query over an inconsistent database without changing the database, but by taking into account all possible repairs of the database.
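To make the notion of "all possible repairs" concrete, the following minimal sketch enumerates the repairs of a single relation under a primary key and intersects the query answers across them. It assumes that a repair keeps exactly one fact from each group of facts sharing a key value. The relation `emp`, the helper names, and the two projection queries are hypothetical illustrations, not taken from the paper. Enumerating repairs is exponential in general, so this only shows the definition, not a practical method.

```python
from collections import defaultdict
from itertools import product

def repairs(facts, key_len):
    """Yield every repair: exactly one fact kept per group sharing a key."""
    groups = defaultdict(list)
    for fact in facts:
        groups[fact[:key_len]].append(fact)
    for choice in product(*groups.values()):
        yield set(choice)

def consistent_answers(facts, key_len, query):
    """Return the answers that the query produces on every repair."""
    result = None
    for repair in repairs(facts, key_len):
        answers = query(repair)
        result = answers if result is None else result & answers
    return result if result is not None else set()

# Hypothetical relation Emp(name, city), where name is the primary key.
# The two facts for "ann" violate the key constraint.
emp = [("ann", "Paris"), ("ann", "Rome"), ("bob", "Oslo")]

def names(relation):
    return {name for (name, _) in relation}

def cities(relation):
    return {city for (_, city) in relation}

consistent_names = consistent_answers(emp, 1, names)
consistent_cities = consistent_answers(emp, 1, cities)
```

Here every repair contains some fact for "ann", so her name is a consistent answer, while neither "Paris" nor "Rome" appears in every repair.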
In this paper, we study the problem of consistent query answering over inconsistent databases for the class of conjunctive queries under primary key constraints. We develop a system, called EQUIP, that represents a fundamental departure from existing approaches for computing the consistent answers to queries in this class. At the heart of EQUIP is a technique, based on Binary Integer Programming (BIP), that repeatedly searches for repairs to eliminate candidate consistent answers until no further such candidates can be eliminated. We rigorously establish the correctness of the algorithms behind EQUIP and carry out an extensive experimental investigation that validates the effectiveness of our approach. Specifically, EQUIP exhibits good and stable performance on conjunctive queries under primary key constraints, significantly outperforms existing systems for computing the consistent answers of such queries when the consistent answers are not first-order rewritable, and scales well.
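The idea of searching for a repair that eliminates a candidate answer can be phrased as a 0/1 feasibility problem. The sketch below is one possible encoding, not necessarily the formulation used in EQUIP: each fact gets a binary variable, each key group must have exactly one chosen fact, and each witness (a set of facts that jointly produce the candidate) must lose at least one fact. If the program is feasible, some repair falsifies the candidate; if not, the candidate is a consistent answer. The function names, the brute-force search standing in for a real BIP solver, and the example facts are all assumptions made for illustration.

```python
from itertools import product

def feasible(num_vars, eq_one, at_most):
    """Brute-force stand-in for a BIP solver.

    eq_one:  index lists whose variables must sum to exactly 1.
    at_most: (index list, bound) pairs whose sum must not exceed bound.
    Returns a satisfying 0/1 tuple, or None if there is none.
    """
    for x in product((0, 1), repeat=num_vars):
        keys_ok = all(sum(x[i] for i in g) == 1 for g in eq_one)
        cuts_ok = all(sum(x[i] for i in w) <= b for w, b in at_most)
        if keys_ok and cuts_ok:
            return x
    return None

def is_consistent_answer(num_facts, key_groups, witnesses):
    """A candidate is consistent iff no repair breaks all of its witnesses."""
    at_most = [(w, len(w) - 1) for w in witnesses]
    return feasible(num_facts, key_groups, at_most) is None

# Hypothetical facts, indexed 0..3:
#   0: Emp(ann, Paris)   1: Emp(ann, Rome)    (same key: ann)
#   2: City(Paris, FR)   3: City(Rome, IT)
key_groups = [[0, 1], [2], [3]]

# q1(name) :- Emp(name, c), City(c, country); witnesses for "ann".
in_any_country = is_consistent_answer(4, key_groups, [[0, 2], [1, 3]])

# q2(name) :- Emp(name, c), City(c, 'FR'); single witness for "ann".
in_france = is_consistent_answer(4, key_groups, [[0, 2]])
```

In this example "ann" is a consistent answer to the first query, because every repair keeps one of her two facts and the matching city fact, but not to the second, because the repair that keeps `Emp(ann, Rome)` breaks the only witness.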
Index Terms
- Efficient querying of inconsistent databases with binary integer programming