Abstract
In recent years, there has been extensive interest in learning the dynamics of systems. For this purpose, a new learning method called learning from interpretation transition has recently been proposed [1]. However, both the run time and the memory usage of this algorithm are exponential, so a more compact data structure and a more efficient algorithm have been needed. In this paper, we propose a new learning algorithm for this method that utilizes an efficient data structure inspired by Ordered Binary Decision Diagrams. We show empirically that, using this representation, we can perform the same learning task faster and with less memory.
Keywords
- Interpretation Transition
- Binary Decision Diagrams (BDD)
- Zero-suppressed Binary Decision Diagrams (ZDD)
- Ground Resolution
- Learning Boolean Networks
This research was supported in part by the NII research project on “Dynamic Constraint Networks” and by the “Systems Resilience” project at the Research Organization of Information and Systems, Japan. We would like to thank Earl Belinger for his help in improving the English quality of the paper.
References
Inoue, K., Ribeiro, T., Sakama, C.: Learning from interpretation transition. Mach. Learn. (2013). doi:10.1007/s10994-013-5353-8
Muggleton, S., De Raedt, L., Poole, D., Bratko, I., Flach, P., Inoue, K., Srinivasan, A.: ILP turns 20. Mach. Learn. 86(1), 3–23 (2012)
Inoue, K.: Logic programming for Boolean networks. In: Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, vol. 2, pp. 924–930. AAAI Press (2011)
Inoue, K., Sakama, C.: Oscillating behavior of logic programs. In: Erdem, E., Lee, J., Lierler, Y., Pearce, D. (eds.) Correct Reasoning. LNCS, vol. 7265, pp. 345–362. Springer, Heidelberg (2012)
Van Emden, M.H., Kowalski, R.A.: The semantics of predicate logic as a programming language. J. ACM (JACM) 23(4), 733–742 (1976)
Apt, K.R., Blair, H.A., Walker, A.: Towards a theory of declarative knowledge. In: Minker, J. (ed.) Foundations of Deductive Databases and Logic Programming, pp. 89–149. Morgan Kaufmann, Los Altos (1988)
Akers, S.B.: Binary decision diagrams. IEEE Trans. Comput. 100(6), 509–516 (1978)
Bryant, R.E.: Graph-based algorithms for Boolean function manipulation. IEEE Trans. Comput. 100(8), 677–691 (1986)
Aloul, F.A., Mneimneh, M.N., Sakallah, K.A.: ZBDD-based backtrack search SAT solver. In: Proceedings of the International Workshop on Logic Synthesis, Lake Tahoe, California (2002)
Minato, S., Arimura, H.: Frequent closed item set mining based on zero-suppressed BDDs. Inf. Media Technol. 2(1), 309–316 (2007)
De Raedt, L., Kimmig, A., Toivonen, H.: ProbLog: a probabilistic Prolog and its application in link discovery. In: Proceedings of the 20th International Joint Conference on Artificial Intelligence, pp. 2468–2473 (2007)
Simon, L., Del Val, A.: Efficient consequence finding. In: International Joint Conference on Artificial Intelligence, vol. 17, pp. 359–370. Lawrence Erlbaum Associates Ltd. (2001)
Inoue, K., Sato, T., Ishihata, M., Kameya, Y., Nabeshima, H.: Evaluating abductive hypotheses using an EM algorithm on BDDs. In: Proceedings of the 21st International Joint Conference on Artificial Intelligence, pp. 810–815. Morgan Kaufmann Publishers Inc. (2009)
Bryant, R.E., Meinel, C.: Ordered binary decision diagrams. In: Hassoun, S., Sasao, T. (eds.) Logic Synthesis and Verification, pp. 285–307. Springer, New York (2002)
Bryant, R.E.: Symbolic Boolean manipulation with ordered binary-decision diagrams. ACM Comput. Surv. (CSUR) 24(3), 293–318 (1992)
Minato, S.: Zero-suppressed BDDs for set manipulation in combinatorial problems. In: 30th Conference on Design Automation, pp. 272–277. IEEE (1993)
Plotkin, G.D.: A note on inductive generalization. Mach. Intell. 5(1), 153–163 (1970)
Dubrova, E., Teslenko, M.: A SAT-based algorithm for finding attractors in synchronous Boolean networks. IEEE/ACM Trans. Comput. Biol. Bioinform. (TCBB) 8(5), 1393–1399 (2011)
Gebser, M., Kaminski, R., Kaufmann, B., Schaub, T.: Answer Set Solving in Practice. Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan and Claypool Publishers, San Rafael (2012)
Groote, J.F., Tveretina, O.: Binary decision diagrams for first-order predicate logic. J. Logic Algebraic Program. 57(1), 1–22 (2003)
Liaw, H.T., Lin, C.S.: On the OBDD-representation of general Boolean functions. IEEE Trans. Comput. 41(6), 661–664 (1992)
A Appendix
A.1 Proof of Theorem 1
Proof
Let \(n\) be the size of the Herbrand base \(|B|\). This \(n\) is also the number of possible rule heads, as well as the maximum size of a rule, i.e. the number of literals in its body, since a literal can appear at most once in the body of a rule. For each head there are \(3^n\) possible bodies: each literal can be positive, negative, or absent from the body. From these preliminaries we conclude that the size of an NLP \(P\) learned by LF1T is at most \(|P| = n\cdot 3^n\). Thanks to ground resolution, however, \(|P|\) cannot exceed \(n\cdot 2^n\): in the worst case, \(P\) contains only rules of size \(n\) in which all literals appear, and there are only \(n\cdot 2^n\) such rules. If \(P\) contains a rule with \(m\) literals (\(m<n\)), this rule subsumes \(2^{n-m}\) rules, which therefore cannot appear in \(P\). Finally, ground resolution also ensures that \(P\) does not contain any pair of complementary rules, so that the bound is further divided by \(n\); that is, \(|P|\) is bounded by \(O(\frac{n\cdot 2^n}{n})=O(2^n)\).
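The counting argument above can be checked by brute force for small \(n\). The following illustrative sketch (not part of the LF1T implementation) enumerates every possible body over \(n\) literals and counts those of maximal size:

```python
from itertools import product

def count_bodies(n):
    """Count all possible rule bodies over n literals, and those of
    maximal size n (every literal appears, positively or negatively)."""
    # Each literal is positive ('+'), negative ('-'), or absent ('0').
    bodies = list(product('+-0', repeat=n))
    full = [b for b in bodies if '0' not in b]
    return len(bodies), len(full)

n = 4
total, full = count_bodies(n)
assert total == 3 ** n   # 3^n possible bodies per head
assert full == 2 ** n    # 2^n maximal-size bodies per head
# With n possible heads: at most n*3^n rules, n*2^n of maximal size.
```

Multiplying each count by the \(n\) possible heads recovers the \(n\cdot 3^n\) and \(n\cdot 2^n\) bounds used in the proof.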
In our approach, a BDD represents all rules of \(P\) that have the same head, so we maintain \(n\) BDD structures. When \(|P|=2^n\), each BDD represents \(2^n/n\) rules of size \(n\) and is bounded by \(O(2^n/n)\), which is the upper bound on the size of a BDD for any Boolean function [21]. Because a BDD merges the common parts of rules, a BDD that represents \(2^n/n\) rules may need less than \(2^n/n\) memory space. In the previous approach, in the worst case \(|P|=2^n\), whereas in our approach \(|P| \le 2^n\). Our new algorithm thus remains in the same order of complexity regarding memory size: \(O(2^n)\).
Regarding learning, each operation has its own complexity. Let \(k\) be the position of a literal in the variable ordering, so that the literal of the starting node of a BDD has \(k=0\). In our BDDs, a node has at most \(2\cdot ((n-k)-1)\) children: positive and negative links to each of the \((n-k)-1\) literals that come after \(k\) in the ordering. Insertion of a rule is done in polynomial time; in the worst case, we insert a rule in which only one literal differs from the rules already in the BDD. Because we only follow the first common literals, we have to check at most \(2\cdot ((n-k)-1)\) links on \(n-1\) nodes, which belongs to \(O(n^2)\).
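The insertion step can be pictured with a simplified prefix-tree sketch (names and structure are assumptions for illustration; the paper's actual BDD also shares suffixes and applies reduction rules). Inserting a body only follows the links of the literals it shares with existing rules:

```python
class Node:
    """One node of a simplified decision-diagram-like trie over a
    fixed variable ordering; keys are (variable, sign) links."""
    def __init__(self):
        self.children = {}
        self.terminal = False  # True if a rule body ends here

def insert(root, body):
    """Insert a rule body (iterable of (variable, sign) literals),
    following already-shared prefixes. Each of the at most n steps
    inspects one of at most 2*((n-k)-1) outgoing links, hence O(n^2)."""
    node = root
    for literal in sorted(body):  # respect the variable ordering
        node = node.children.setdefault(literal, Node())
    node.terminal = True

root = Node()
insert(root, [('q', True), ('r', False)])  # p <- q, not r
insert(root, [('q', True), ('s', True)])   # p <- q, s
# Both bodies share the node reached by the ('q', True) link.
assert len(root.children) == 1
assert len(root.children[('q', True)].children) == 2
```

The shared `('q', True)` prefix is stored once, which is the memory saving the proof refers to.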
Subsumption and generalization checks both require exponential time. For subsumption, in the worst case the BDD contains \(2^n/n\) rules and the new rule is subsumed by none of them. This means that we have to check every rule, and each check belongs to \(O(n^2)\), so the whole subsumption operation belongs to \(O(n^2\cdot 2^n/n)=O(2^n)\). To clear the BDD we have to perform the inverse operation. We always have to check the whole BDD, so if the size of the BDD is \(2^n\), the complexity of the whole clear check also belongs to \(O(2^n)\).
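As a reminder of the test being counted here (sketched naively rule-by-rule, not on the BDD as the paper does): a rule subsumes another rule with the same head whenever its body is a subset of the other's body.

```python
def subsumes(body1, body2):
    """Rule 1 subsumes rule 2 (same head) iff every literal of
    body1 also appears, with the same sign, in body2."""
    return set(body1) <= set(body2)

# p <- q   subsumes   p <- q, not r
assert subsumes([('q', True)], [('q', True), ('r', False)])
# but it does not subsume   p <- not q, r
assert not subsumes([('q', True)], [('q', False), ('r', True)])
```

A rule with \(m\) literals thus subsumes every one of the \(2^{n-m}\) ways of extending its body with the remaining \(n-m\) literals, as stated in the proof.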
To generalize the new rule, we check whether the BDD subsumes one of its complementary rules. As for subsumption, in the worst case we have to check every rule. A rule can be generalized at most \(n\) times; for each generalization we have to check at most \(n\) complementary rules, so the complexity of a complete generalization belongs to \(O(n^2\cdot 2^n/n)=O(2^n)\). For the complexity of the generalization of the BDD rules, we consider the inverse problem. In the worst case, every rule of the BDD can be generalized by the new one. Because the new rule does not cover any rule of the BDD, it can generalize each rule of the BDD at most once. We therefore have at most \(2^n/n\) possible direct generalizations on the whole BDD. In the worst case, each of them can be generalized at most \(n-1\) further times, and as before, for each generalization we have to check at most \(n\) complementary rules. If a rule is generalized \(n\) times, its body becomes empty, i.e. the rule is a fact, and it will subsume and clear the whole BDD. The complexity of a complete generalization of the BDD thus belongs to \(O(2^n/n\cdot (n-1)\cdot n)=O(2^n)\).
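The generalization step counted above corresponds to one ground-resolution step between two rules whose bodies differ only in the sign of a single literal. A naive rule-level sketch (hypothetical helper; the paper performs this operation directly on the BDD):

```python
def generalize(body1, body2):
    """If the two bodies (sets of (variable, sign) literals) differ
    only in the sign of one literal, return their common part, a
    strictly more general body; otherwise return None."""
    diff = set(body1) ^ set(body2)
    if len(diff) == 2:
        (v1, _), (v2, _) = diff
        if v1 == v2:  # same variable, complementary signs
            return set(body1) & set(body2)
    return None

# p <- q, r   and   p <- q, not r   generalize to   p <- q
b1 = {('q', True), ('r', True)}
b2 = {('q', True), ('r', False)}
assert generalize(b1, b2) == {('q', True)}
assert generalize(b1, {('s', True)}) is None  # no complementary pair
```

Applying this step \(n\) times empties the body entirely, which is why a rule generalized \(n\) times becomes a fact that subsumes the whole BDD.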
Each time we learn a rule from a state transition, we have to perform these four checks, whose combined complexity is \(O(n^2+2^n+2^n+2^n)=O(2^n)\). From \(2^n\) state transitions, LF1T can directly infer \(n\cdot 2^n\) rules. Learning the dynamics of the entire input therefore implies, in the worst case, \(2^n\cdot 2^n\) operations, which belongs to \(O(4^n)\). Using our dedicated BDD structure, the memory complexity and the computational complexity of LF1T remain of the same order as those of the previous algorithm based on ground resolution: \(O(2^n)\) and \(O(4^n)\), respectively.
© 2014 Springer-Verlag Berlin Heidelberg
Ribeiro, T., Inoue, K., Sakama, C. (2014). A BDD-Based Algorithm for Learning from Interpretation Transition. In: Zaverucha, G., Santos Costa, V., Paes, A. (eds) Inductive Logic Programming. ILP 2013. Lecture Notes in Computer Science(), vol 8812. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44923-3_4
Print ISBN: 978-3-662-44922-6
Online ISBN: 978-3-662-44923-3