Reinforcement learning seminar, list of articles for first presentation

Select at least one of the articles. Each article may be studied and presented by 1-3 persons together or separately.

Select an article before 29.1.2004 and send an e-mail to Kary.Framling@hut.fi indicating your selection. The selections will be indicated as soon as possible on this page. The same article can be reserved by a maximum of three persons. If the limit has been exceeded, you will be informed by e-mail so that you can select another article.

All articles are available online.

Articles

Barto, Andrew G., Sutton, R.S., Watkins, C.J.C.H. (1990). Learning and Sequential Decision Making. In: M. Gabriel and J. Moore (eds.), Learning and computational neuroscience: foundations of adaptive networks, M.I.T. Press, Cambridge, Mass. Extensive article about Reinforcement Learning, with considerable coverage of the use of artificial neural networks.
Reserved by (max 3): Jarmo Korhonen, Jussi Rautio, Heli Nyholm.

Kaelbling, Leslie Pack, Littman, Michael L., Moore, Andrew W. (1996). Reinforcement Learning: A Survey. Journal of Artificial Intelligence Research, Vol. 4. pp. 237-285. Very broad article from historical, technical and application perspectives. For the uninitiated, however, some explanations might be a little short.
Reserved by (max 3): Sebastian Von Knorring, Juho Törmä, Samuli Kekki.

Mahadevan, Sridhar (1996). Average Reward Reinforcement Learning: Foundations, Algorithms, and Empirical Results. Machine Learning, Special Issue on Reinforcement Learning (edited by Leslie Kaelbling), Vol. 22. pp. 159-196. Alternatives to TD-based methods such as Q-learning. A lot of fundamental theory about MDPs, but also interesting experiments on grid worlds.
Reserved by (max 3): Esa Seuranen, Mikko Rahikainen.

Moore, Andrew W., Atkeson, Christopher G. (1993). Prioritized Sweeping: Reinforcement Learning with Less Data and Less Real Time. Machine Learning, Vol. 13. pp. 103-130. Only a brief introduction to Reinforcement Learning theory, but many useful example tasks. This is why the article is included as a "general" one, even though it focuses on the "prioritized sweeping" method.
Reserved by (max 3): Antti Ukkonen, Jukka Villstedt, Teddy Grenman.

Thrun, S.B. (1992). The role of exploration in learning control. In D.A. White & D.A. Sofge, editors, Handbook of Intelligent Control: Neural, Fuzzy and Adaptive Approaches. New York, NY: Van Nostrand Reinhold. This is not a completely general article because it focuses on how the agent balances exploring the environment against exploiting what it has already learned. Includes interesting example applications.
Reserved by (max 3): Antti Päällysaho, Tapani Raiko.

Whitehead, S.D., Lin, L.-J. (1995). Reinforcement learning of non-Markov decision processes. Artificial Intelligence, Vol. 73. pp. 271-306. Long introduction to Reinforcement Learning techniques. Focuses on tasks with "hidden state", which often occurs in real-world applications.
Reserved by (max 3): Elina Parviainen, Jaakko Nyrölä, Marko Nikula.

This page is maintained by Kary Främling,
E-mail: Kary.Framling@hut.fi.
Last updated on March 12th, 2004