Toward a Model of Intelligence as an Economy of Agents

Baum, Eric B.

doi:10.1023/A:1007593124513

Published: May 1999

Machine Learning, volume 35, pages 155–185 (1999)

Abstract

A market-based algorithm is presented that autonomously apportions complex tasks to multiple cooperating agents, giving each agent the motivation of improving the performance of the whole system. A specific model, called “The Hayek Machine,” is proposed and tested on a simulated Blocks World (BW) planning problem. Hayek learns to solve more complex BW problems than any previous learning algorithm. Given intermediate reward and simple features, it has learned to efficiently solve arbitrary BW problems. The Hayek Machine can also be seen as a model of evolutionary economics.
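
The abstract does not spell out the market mechanism, so the following is only a minimal sketch under stated assumptions, not the paper's implementation. It assumes an auction chain in which the highest bidder buys control of the task, pays its bid to the previous owner, acts, and the final owner collects the external reward. The toy counter task, the `Agent` fields, the fixed bids, and the `run_episode` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    name: str
    applies_at: int      # toy state in which this agent is willing to bid (assumption)
    bid: float           # fixed price it offers for control (assumption)
    step: int            # how far its action advances the toy counter
    wealth: float = 0.0


def run_episode(agents, goal, reward, max_steps=10):
    """Run one auction chain on a toy counter task that starts at state 0."""
    state, owner = 0, None
    for _ in range(max_steps):
        if state >= goal:
            break
        bidders = [a for a in agents if a.applies_at == state]
        if not bidders:
            break
        winner = max(bidders, key=lambda a: a.bid)
        winner.wealth -= winner.bid          # winner pays for control
        if owner is not None:
            owner.wealth += winner.bid       # payment goes to the previous owner
        state += winner.step                 # winner acts on the task
        owner = winner
    if owner is not None and state >= goal:
        owner.wealth += reward               # last owner collects the external reward
    return state


starter = Agent("starter", applies_at=0, bid=2.0, step=1)
idler = Agent("idler", applies_at=0, bid=1.0, step=0)
finisher = Agent("finisher", applies_at=1, bid=5.0, step=1)

final_state = run_episode([starter, idler, finisher], goal=2, reward=10.0)
for agent in (starter, idler, finisher):
    print(agent.name, agent.wealth)
```

In this toy run, an agent ends with a profit only if a later agent is willing to pay more for the state it leaves behind than it paid itself, or if it finishes the task. This is one simple way to give each agent a stake in the performance of the whole system.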


Author information

Authors and Affiliations

NEC Research Institute, 4 Independence Way, Princeton, NJ 08540
Eric B. Baum

About this article

Cite this article

Baum, E.B. Toward a Model of Intelligence as an Economy of Agents. Machine Learning 35, 155–185 (1999). https://doi.org/10.1023/A:1007593124513

Issue Date: May 1999
DOI: https://doi.org/10.1023/A:1007593124513
