Copyright © 1996 Published by Elsevier Science B.V.
A statistical approach to adaptive problem solving
Jonathan Gratch and Gerald DeJong
Available online 16 February 1999.
Abstract
Domain-independent, general-purpose problem-solving techniques are desirable from the standpoints of software engineering and human-computer interaction. They employ declarative and modular knowledge representations and present a constant, homogeneous interface to the user, untainted by the peculiarities of the specific domain of interest. Unfortunately, this very insulation from domain details often precludes effective problem-solving behavior. General approaches have proven successful in complex real-world situations only after a tedious cycle of manual experimentation and modification. Machine learning offers the prospect of automating this adaptation cycle, reducing the burden of domain-specific tuning and reconciling the conflicting needs of generality and efficacy. A principal impediment to adaptive techniques is the utility problem: even if the acquired information is accurate and is helpful in isolated cases, it may degrade overall problem-solving performance under difficult-to-predict circumstances. We develop a formal characterization of the utility problem and introduce COMPOSER, a statistically rigorous learning approach that avoids the utility problem. COMPOSER has been successfully applied to learning heuristics for planning and scheduling systems. This article includes theoretical results and an extensive empirical evaluation. The approach is shown to significantly outperform several other leading approaches to the utility problem.
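The utility problem described above can be illustrated with a small statistical check. The following is a minimal sketch of the general idea only, not COMPOSER itself: it assumes a candidate heuristic is judged by per-problem cost savings sampled from the problem distribution, and it adopts the heuristic only when a one-sided lower confidence bound on the mean saving is positive. The function name `adopt_heuristic`, the normal approximation, and the sample data are all assumptions made for illustration.

```python
import math
import statistics


def adopt_heuristic(savings, z=1.645):
    """Decide whether to adopt a candidate heuristic (illustrative only).

    savings: per-problem differences, cost without the heuristic minus
             cost with it; positive values mean the heuristic helped.
    z:       normal quantile; 1.645 is roughly a 95% one-sided bound
             (an assumed confidence level, not taken from the article).
    """
    n = len(savings)
    if n < 2:
        return False  # too few samples to estimate the variance
    mean = statistics.mean(savings)
    stderr = statistics.stdev(savings) / math.sqrt(n)
    # Adopt only if the lower confidence bound on the mean saving is positive.
    return mean - z * stderr > 0


# Hypothetical data: a large gain on one problem, a small loss on the rest.
helps_in_isolated_cases = [5.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0]

# Hypothetical data: a modest gain on every sampled problem.
helps_consistently = [2.0, 1.5, 2.5, 1.0, 2.0, 1.8, 2.2, 1.6]

print(adopt_heuristic(helps_in_isolated_cases))
print(adopt_heuristic(helps_consistently))
```

The first sample shows why a heuristic that is helpful in isolated cases can still lower average performance: its mean saving is negative, so the check rejects it, whereas the consistently helpful heuristic is accepted.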










