
Decision Support for Safe AI Design

Conference paper
Artificial General Intelligence (AGI 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7716)


Abstract

There is considerable interest in ethical designs for artificial intelligence (AI) that do not pose risks to humans. This paper proposes using elements of Hutter’s agent-environment framework to define a decision support system for simulating, visualizing, and analyzing AI designs in order to understand their consequences. The simulations do not have to be accurate predictions of the future; rather, they show the futures that an agent design predicts will fulfill its motivations, and these futures can be explored by AI designers to find risks to humans. To show that a simulation model can be created safely, this paper shows that the most probable finite stochastic program explaining a finite history is finitely computable, and that there is an agent that makes such a computation without any unintended instrumental actions. It also discusses the risks of running an AI in a simulated environment.
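The abstract's central technical claim is that, over a finite class of finite stochastic programs, the most probable program explaining a finite observation history can be found by a finite computation. The toy sketch below illustrates that idea only; the choice of model class (first-order Markov chains over a binary alphabet), the probability grid, the uniform prior, and all names here are illustrative assumptions, not the paper's actual construction.

```python
import itertools
import math

# Illustrative sketch (not the paper's construction): "programs" are
# first-order Markov chains over {0, 1}, parameterized by
#   p01 = P(next symbol is 1 | previous symbol is 0)
#   p11 = P(next symbol is 1 | previous symbol is 1)
# drawn from a small finite grid, so the model class is finite and the
# maximum-a-posteriori model is finitely computable by exhaustive search.

HISTORY = [0, 1, 0, 1, 0, 1, 0, 1]  # an example finite observation history
GRID = [0.1, 0.5, 0.9]              # assumed finite parameter grid

def likelihood(p01, p11, history):
    """P(history | model), conditioning on the first symbol."""
    ll = 1.0
    for prev, cur in zip(history, history[1:]):
        p_next_is_1 = p01 if prev == 0 else p11
        ll *= p_next_is_1 if cur == 1 else (1.0 - p_next_is_1)
    return ll

def best_model(history):
    """Exhaustively search the finite model class for the most probable model."""
    prior = 1.0 / (len(GRID) ** 2)  # uniform prior over the finite class
    best, best_score = None, -math.inf
    for p01, p11 in itertools.product(GRID, repeat=2):
        score = prior * likelihood(p01, p11, history)
        if score > best_score:
            best, best_score = (p01, p11), score
    return best

print(best_model(HISTORY))  # the alternating history favors high p01, low p11
```

Because both the model class and the history are finite, the search terminates after a fixed number of likelihood evaluations, which is the finiteness property the abstract asserts for its (much richer) class of finite stochastic programs.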




Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hibbard, B. (2012). Decision Support for Safe AI Design. In: Bach, J., Goertzel, B., Iklé, M. (eds.) Artificial General Intelligence. AGI 2012. Lecture Notes in Computer Science, vol. 7716. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35506-6_13


  • DOI: https://doi.org/10.1007/978-3-642-35506-6_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35505-9

  • Online ISBN: 978-3-642-35506-6

  • eBook Packages: Computer Science, Computer Science (R0)
