Reinforcement Learning: Insights from Interesting Failures in Parameter Selection

Konen, Wolfgang; Bartz–Beielstein, Thomas

doi:10.1007/978-3-540-87700-4_48

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5199)

Abstract

We investigate reinforcement learning methods, namely the temporal difference learning algorithm TD(λ), on game-learning tasks. Small modifications in algorithm setup and parameter choice can have a significant impact on whether learning succeeds or fails. We demonstrate that small differences in input features significantly influence the learning process. By selecting the right feature set, we obtained good results within only 1/100 of the learning steps reported in the literature. We develop different metrics for measuring success in a reproducible manner. We discuss why linear output functions are often preferable to sigmoid output functions.
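
For readers unfamiliar with TD(λ), the following is a minimal sketch of the algorithm with a linear output function and accumulating eligibility traces. It is not the authors' implementation: the task (a five-state random walk), the one-hot features, the function name `td_lambda_random_walk`, and all parameter values are assumptions chosen only for illustration.

```python
import random


def td_lambda_random_walk(episodes=500, alpha=0.05, lam=0.8, gamma=1.0,
                          n_states=5, seed=0):
    """Estimate state values of a symmetric random walk with linear TD(lambda).

    Illustrative assumptions: states 0..n_states-1 are non-terminal, stepping
    off the left end gives reward 0, stepping off the right end gives reward 1.
    Features are one-hot, so the linear output V(s) = w . x(s) reduces to w[s].
    """
    rng = random.Random(seed)
    w = [0.5] * n_states  # one weight per non-terminal state
    for _ in range(episodes):
        e = [0.0] * n_states  # eligibility traces, reset at episode start
        s = n_states // 2     # every episode starts in the middle state
        while True:
            s_next = s + rng.choice((-1, 1))
            if s_next < 0:
                reward, v_next, done = 0.0, 0.0, True
            elif s_next >= n_states:
                reward, v_next, done = 1.0, 0.0, True
            else:
                reward, v_next, done = 0.0, w[s_next], False
            # TD error for the transition s -> s_next
            delta = reward + gamma * v_next - w[s]
            # Decay all traces, then add the gradient of V(s), which is the
            # one-hot feature vector of s for a linear output.
            e = [gamma * lam * ei for ei in e]
            e[s] += 1.0
            # Move every weight in proportion to its trace.
            w = [wi + alpha * delta * ei for wi, ei in zip(w, e)]
            if done:
                break
            s = s_next
    return w


if __name__ == "__main__":
    print([round(v, 2) for v in td_lambda_random_walk()])
```

With a sigmoid output, the trace update would add the feature vector scaled by the sigmoid's derivative instead of the plain feature vector; the sketch uses the linear case only. The step size `alpha` and trace decay `lam` are the kind of parameters whose choice the abstract describes as decisive for success or failure.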

Author information

Authors and Affiliations

Faculty for Computer Science and Engineering Science, Cologne University of Applied Sciences, 51643, Gummersbach, Germany
Wolfgang Konen & Thomas Bartz–Beielstein

Editor information

Editors and Affiliations

Fakultät für Informatik, Technische Universität Dortmund, 44221, Dortmund, Germany
Günter Rudolph
Fakultät für Informatik, Technische Universität Dortmund, 44221, Dortmund, Germany
Thomas Jansen & Nicola Beume
Department of Computing and Electronic Systems, University of Essex, CO4 3SQ, Colchester, Essex, UK
Simon Lucas
Dipartimento di Ingegneria Meccanica, Università degli Studi di Trieste, 34127, Trieste, Italy
Carlo Poloni

About this paper

Cite this paper

Konen, W., Bartz–Beielstein, T. (2008). Reinforcement Learning: Insights from Interesting Failures in Parameter Selection. In: Rudolph, G., Jansen, T., Beume, N., Lucas, S., Poloni, C. (eds) Parallel Problem Solving from Nature – PPSN X. PPSN 2008. Lecture Notes in Computer Science, vol 5199. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87700-4_48

DOI: https://doi.org/10.1007/978-3-540-87700-4_48
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87699-1
Online ISBN: 978-3-540-87700-4
eBook Packages: Computer Science, Computer Science (R0)