
Decision Support for Safe AI Design

Conference paper
Artificial General Intelligence (AGI 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7716)


Abstract

There is considerable interest in ethical designs for artificial intelligence (AI) that do not pose risks to humans. This paper proposes using elements of Hutter’s agent-environment framework to define a decision support system for simulating, visualizing, and analyzing AI designs in order to understand their consequences. The simulations do not have to be accurate predictions of the future; rather, they show the futures that an agent design predicts will fulfill its motivations, and these futures can be explored by AI designers to find risks to humans. To show that a simulation model can be created safely, this paper shows that the most probable finite stochastic program explaining a finite history is finitely computable, and that there is an agent that makes such a computation without any unintended instrumental actions. It also discusses the risks of running an AI in a simulated environment.
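The abstract's central technical claim is that, over a finite class of finite stochastic programs, the most probable program explaining a finite observation history can be found by a finite computation. The toy sketch below illustrates that idea only; the choice of model class (first-order Markov chains over a binary alphabet), the probability grid, the uniform prior, and all names here are illustrative assumptions, not the paper's actual construction.

```python
import itertools
import math

# Illustrative sketch (not the paper's construction): "programs" are
# first-order Markov chains over {0, 1}, parameterized by
#   p01 = P(next symbol is 1 | previous symbol is 0)
#   p11 = P(next symbol is 1 | previous symbol is 1)
# drawn from a small finite grid, so the model class is finite and the
# maximum-a-posteriori model is finitely computable by exhaustive search.

HISTORY = [0, 1, 0, 1, 0, 1, 0, 1]  # an example finite observation history
GRID = [0.1, 0.5, 0.9]              # assumed finite parameter grid

def likelihood(p01, p11, history):
    """P(history | model), conditioning on the first symbol."""
    ll = 1.0
    for prev, cur in zip(history, history[1:]):
        p_next_is_1 = p01 if prev == 0 else p11
        ll *= p_next_is_1 if cur == 1 else (1.0 - p_next_is_1)
    return ll

def best_model(history):
    """Exhaustively search the finite model class for the most probable model."""
    prior = 1.0 / (len(GRID) ** 2)  # uniform prior over the finite class
    best, best_score = None, -math.inf
    for p01, p11 in itertools.product(GRID, repeat=2):
        score = prior * likelihood(p01, p11, history)
        if score > best_score:
            best, best_score = (p01, p11), score
    return best

print(best_model(HISTORY))  # the alternating history favors high p01, low p11
```

Because both the model class and the history are finite, the search terminates after a fixed number of likelihood evaluations, which is the finiteness property the abstract asserts for its (much richer) class of finite stochastic programs.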




Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hibbard, B. (2012). Decision Support for Safe AI Design. In: Bach, J., Goertzel, B., Iklé, M. (eds.) Artificial General Intelligence. AGI 2012. Lecture Notes in Computer Science, vol. 7716. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35506-6_13


  • DOI: https://doi.org/10.1007/978-3-642-35506-6_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35505-9

  • Online ISBN: 978-3-642-35506-6

  • eBook Packages: Computer Science, Computer Science (R0)
