Efficient querying of inconsistent databases with binary integer programming

Published: 01 April 2013

Abstract

An inconsistent database is a database that violates one or more integrity constraints. A typical approach for answering a query over an inconsistent database is to first clean the inconsistent database by transforming it to a consistent one and then apply the query to the consistent database. An alternative and more principled approach, known as consistent query answering, derives the answers to a query over an inconsistent database without changing the database, but by taking into account all possible repairs of the database.
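To make the definition concrete, here is a minimal brute-force sketch of consistent answers under a primary key constraint. It assumes a single relation stored as a list of tuples whose first `key_len` columns form the key; the relation `emp`, the query `cities`, and all function names are hypothetical illustrations, not part of the paper. Enumerating every repair is exponential in general and is shown only to illustrate the semantics.

```python
from itertools import product

def repairs(relation, key_len):
    """Yield every repair of `relation` under a primary key on its first `key_len` columns."""
    groups = {}
    for row in relation:
        groups.setdefault(row[:key_len], []).append(row)
    # A repair keeps exactly one tuple from each group of key-equal tuples.
    for choice in product(*groups.values()):
        yield set(choice)

def consistent_answers(relation, key_len, query):
    """Return the answers that `query` produces on every repair."""
    answer_sets = [set(query(r)) for r in repairs(relation, key_len)]
    return set.intersection(*answer_sets)

# Hypothetical relation Emp(name, city) with key `name`; "ann" violates the key.
emp = [("ann", "Paris"), ("ann", "Rome"), ("bob", "Oslo")]

def cities(rel):
    """Hypothetical query: the set of cities appearing in the relation."""
    return {city for (_name, city) in rel}

result = consistent_answers(emp, 1, cities)
print(result)
```

Here only "Oslo" survives, because each of the two repairs keeps a different city for "ann" while both keep the tuple for "bob".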

In this paper, we study the problem of consistent query answering over inconsistent databases for the class of conjunctive queries under primary key constraints. We develop a system, called EQUIP, that represents a fundamental departure from existing approaches for computing the consistent answers to queries in this class. At the heart of EQUIP is a technique, based on Binary Integer Programming (BIP), that repeatedly searches for repairs to eliminate candidate consistent answers until no further such candidates can be eliminated. We rigorously establish the correctness of the algorithms behind EQUIP and carry out an extensive experimental investigation that validates the effectiveness of our approach. Specifically, EQUIP exhibits good and stable performance on conjunctive queries under primary key constraints; it significantly outperforms existing systems for computing the consistent answers of such queries when the consistent answers are not first-order rewritable; and it scales well.
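The elimination loop described above can be sketched roughly as follows. This is not EQUIP's algorithm or its BIP encoding, which the abstract does not spell out: the search for an eliminating repair is replaced here by plain enumeration, and every name (`find_eliminating_repair`, `eliminate_candidates`, the relation `emp`, the query `cities`) is a hypothetical stand-in. The sketch also assumes a monotone query, so the answers over the inconsistent database can serve as the initial candidates.

```python
from itertools import product

def repairs(relation, key_len):
    """Yield every repair: one tuple kept per group of key-equal tuples."""
    groups = {}
    for row in relation:
        groups.setdefault(row[:key_len], []).append(row)
    for choice in product(*groups.values()):
        yield set(choice)

def find_eliminating_repair(relation, key_len, query, candidates):
    """Stand-in for the solver call: return a repair on which some candidate
    is not an answer, or None if no such repair exists."""
    for r in repairs(relation, key_len):
        if candidates - set(query(r)):
            return r
    return None

def eliminate_candidates(relation, key_len, query):
    """Shrink the candidate set until no repair can eliminate another candidate."""
    # For a monotone query, every consistent answer is also an answer
    # on the full (inconsistent) database, so start from those.
    candidates = set(query(set(relation)))
    rounds = 0
    while candidates:
        r = find_eliminating_repair(relation, key_len, query, candidates)
        if r is None:
            break  # every remaining candidate holds in all repairs
        candidates &= set(query(r))
        rounds += 1
    return candidates, rounds

# Hypothetical relation Emp(name, city) with key `name`.
emp = [("ann", "Paris"), ("ann", "Rome"), ("bob", "Oslo")]

def cities(rel):
    """Hypothetical query: the set of cities appearing in the relation."""
    return {city for (_name, city) in rel}

answers, rounds = eliminate_candidates(emp, 1, cities)
print(answers, rounds)
```

Each round removes at least one candidate, so the loop runs at most as many times as there are initial candidates; in a solver-based setting, the enumeration inside `find_eliminating_repair` is the part an integer program would replace.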


Published in: Proceedings of the VLDB Endowment, Volume 6, Issue 6, April 2013, 144 pages
Publisher: VLDB Endowment
