
Bit-vector algorithms for binary constraint satisfaction and subgraph isomorphism

Published: 07 February 2011

Abstract

A solution to a binary constraint satisfaction problem is a set of discrete values, one in each of a given set of domains, subject to constraints that allow only prescribed pairs of values in specified pairs of domains. Solutions are sought by backtrack search interleaved with a process that removes from domains those values that are currently inconsistent with provisional choices already made in the course of search. For each value in a given domain, a bit-vector shows which values in another domain are or are not permitted in a solution. Bit-vector representation of constraints allows bit-parallel, therefore fast, operations for editing domains during search. This article revises and updates bit-vector algorithms published in the 1970s, and introduces focus search, which is a new bit-vector algorithm relying more on search and less on domain-editing than previous algorithms. Focus search is competitive within a limited family of constraint satisfaction problems.
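To make the bit-vector idea concrete, here is a minimal Python sketch of backtrack search with bit-parallel domain editing (plain forward checking). It is an illustration under stated assumptions, not the article's algorithms and not focus search: the names `queens_support` and `solve`, the use of Python integers as bit-vectors, and the n-queens example are all choices made for this sketch.

```python
from typing import Dict, List, Optional, Tuple

# support[(i, j)][a] is a bit-vector over domain j: bit b is set exactly when
# value b for variable j is permitted together with value a for variable i.
Support = Dict[Tuple[int, int], List[int]]


def queens_support(n: int) -> Support:
    """Build constraint bit-vectors for n-queens (illustrative problem only).

    Variable i is the row, its value is the column of the queen in that row.
    """
    support: Support = {}
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            rows = []
            for a in range(n):
                mask = 0
                for b in range(n):
                    # Different column and not on a shared diagonal.
                    if a != b and abs(a - b) != abs(i - j):
                        mask |= 1 << b
                rows.append(mask)
            support[(i, j)] = rows
    return support


def solve(domains: List[int], support: Support, var: int = 0) -> Optional[List[int]]:
    """Backtrack search; domains[k] is a bit-vector of values still allowed for k."""
    n = len(domains)
    if var == n:
        # Every domain has been reduced to a single bit; read off its index.
        return [d.bit_length() - 1 for d in domains]
    remaining = domains[var]
    while remaining:
        low = remaining & -remaining      # isolate the lowest set bit
        a = low.bit_length() - 1          # the value that bit stands for
        remaining ^= low
        edited = domains[:]               # copy, so backtracking is trivial
        edited[var] = low
        consistent = True
        for j in range(var + 1, n):
            # One AND removes every value of j that conflicts with var = a.
            edited[j] &= support[(var, j)][a]
            if edited[j] == 0:            # domain wipe-out: abandon this choice
                consistent = False
                break
        if consistent:
            result = solve(edited, support, var + 1)
            if result is not None:
                return result
    return None


if __name__ == "__main__":
    print(solve([0b1111] * 4, queens_support(4)))
```

Each domain edit is a single AND between the current domain and one precomputed constraint row, which is where the bit-parallel speed-up comes from. Copying the domain list at every level keeps the sketch short; a real implementation might restore domains in place instead.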

Determination of subgraph isomorphism is a specialized binary constraint satisfaction problem for which bit-vector algorithms have been widely used since the 1980s, particularly for matching molecular structures. This article very substantially updates the author's 1976 subgraph isomorphism algorithm, and reports experimental results with random and real-life data.
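The following Python sketch illustrates how subgraph isomorphism can be cast as a binary constraint satisfaction problem over bit-vectors. It is a simplified illustration, not the author's 1976 algorithm or its update: it assumes undirected, unlabeled graphs, treats the problem in its non-induced form (every pattern edge must map to a target edge), and uses only a degree filter plus forward checking. The names `adjacency_masks` and `find_embedding` are hypothetical.

```python
from typing import Iterable, List, Optional, Tuple


def adjacency_masks(n: int, edges: Iterable[Tuple[int, int]]) -> List[int]:
    """Row v is a bit-vector whose set bits are the neighbours of vertex v."""
    rows = [0] * n
    for u, v in edges:
        rows[u] |= 1 << v
        rows[v] |= 1 << u
    return rows


def _degree(row: int) -> int:
    return bin(row).count("1")


def find_embedding(pattern: List[int], target: List[int]) -> Optional[List[int]]:
    """Return a list mapping pattern vertex -> target vertex, or None."""
    # Initial domain of a pattern vertex: target vertices of sufficient degree.
    domains = []
    for p_row in pattern:
        mask = 0
        for v, t_row in enumerate(target):
            if _degree(t_row) >= _degree(p_row):
                mask |= 1 << v
        domains.append(mask)
    return _search(pattern, target, domains, 0)


def _search(pattern: List[int], target: List[int],
            domains: List[int], u: int) -> Optional[List[int]]:
    if u == len(pattern):
        return [d.bit_length() - 1 for d in domains]
    remaining = domains[u]
    while remaining:
        low = remaining & -remaining      # lowest candidate target vertex
        v = low.bit_length() - 1
        remaining ^= low
        edited = domains[:]
        edited[u] = low
        consistent = True
        for w in range(u + 1, len(pattern)):
            edited[w] &= ~low             # injectivity: v is now taken
            if (pattern[u] >> w) & 1:
                # w is adjacent to u, so its image must be adjacent to v.
                edited[w] &= target[v]
            if edited[w] == 0:
                consistent = False
                break
        if consistent:
            result = _search(pattern, target, edited, u + 1)
            if result is not None:
                return result
    return None


if __name__ == "__main__":
    triangle = adjacency_masks(3, [(0, 1), (1, 2), (0, 2)])
    square_with_diagonal = adjacency_masks(
        4, [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)])
    print(find_embedding(triangle, square_with_diagonal))
```

Here the constraint rows are simply the adjacency rows of the target graph, so no separate constraint table is needed; that is one sense in which subgraph isomorphism is a specialized case of the general problem.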


