T-93.850 Seminar on Knowledge Engineering
Spring 2004: Reinforcement learning
General seminar information
Working methods: Bibliographic research on specific subjects of reinforcement learning and presenting them to the other seminar participants at different sessions. A seminar report of 5-10 pages on the selected topic is also required. Presentations and the report may be done individually or in groups of 2-3 people.
Groups should meet up and discuss their work before the presentation, so that the group work helps all members of the group to understand the subject. Group presentations also have to be divided between the group members so that all members present different parts of the work. For the first presentation, all members of a group read the same paper. For the second presentation, all members of a group read papers on the same subject area, so that they can compare the methods, results and conclusions of the different papers. Group compositions may change between the first and the second presentation, but the seminar report is written either individually or with the same group as in the second presentation.
Assessment: Accepted/Rejected.
First meeting: Thursday, 22 January 2004 at 12-14, Lecture Room T4 in the Computer Science building.
Extent: 2 credit units, unless the amount of work turns out to be much greater than expected. Extra credit units can be obtained for instance by:
About reinforcement learning
Reinforcement Learning (RL) is largely based on the idea of developing models that solve problems and learn in ways similar to humans and animals. The models should preferably also correspond to some rough-level knowledge about how the brain operates, i.e. activations and connections between different areas of the brain. Animal problem solving seems to be based mainly on trial and error. The success or failure of a trial modifies behavior in the "right" direction after some number of trials, where "some number" ranges from one (e.g. learning how to turn on the radio with the "power" button) to infinity (e.g. learning how to grab things, which is a life-long adaptation procedure). RL methods have been successfully applied to many problems where more "conventional" methods are difficult to use due to factors such as a lack of data about the environment, which forces the RL "agent" to explore its environment and learn interactively. Exploring is a procedure where the agent has to take actions without a priori knowledge of how good or bad each action is; this may become known only much later, when a goal is reached or the task has failed.
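To make the ideas of exploration and delayed feedback more concrete, here is a minimal sketch of tabular Q-learning with epsilon-greedy exploration. It is only an illustration and not taken from the seminar material: the five-cell corridor task, the function names (`step`, `train`, `greedy_policy`) and all parameter values are assumptions chosen for this example. The agent starts in cell 0 and receives a reward only when it reaches the goal cell, so it has to try actions before it knows how good they are.

```python
import random

N_STATES = 5          # hypothetical corridor with cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # action 0 = step left, action 1 = step right


def step(state, action):
    """Move within the corridor; the only reward (1.0) comes at the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by trial and error with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both actions look equal.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward the reward plus
            # the discounted value of the best action in the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best known action for every non-goal cell."""
    return ["left" if row[0] > row[1] else "right" for row in q[:-1]]
```

Calling `greedy_policy(train())` returns the learned action for each of the four non-goal cells. The reward reaches the earlier cells only through repeated trials, which is the delayed-feedback effect described above.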
Links
Schedule
Seminar opening, introductory lecture
Meetings for presenting first article
Count about 30 minutes for the presentation, including time for questions and discussion. For this first presentation, the emphasis is on understanding the basic principles of Reinforcement Learning. The participants should therefore try to understand and explain the basic formulas as well as possible. Every article also has one or more "major messages" that the author tries to express; try to identify them as well as possible. Also explain the experimental tasks: the characteristics of each task, what methods are used, how good the results are and what conclusions are drawn from them. Finally, express your own opinion on the work, i.e. how understandable the article is, whether it makes you want to search further, what the open issues are, etc.
For group presentations, count 30-45 minutes for 2 persons and 30-60 minutes for 3 persons. The goal of team work is not just to split the article into pieces that are presented separately. The goal is rather to discuss and analyze the article in advance and give a more refined analysis than would be possible in an individual presentation.
All seminar participants are expected to attend all meetings. However, it is possible to be absent from one of the presentation meetings for the first article without special reasons. If you have to be absent more than once, please contact the lecturer. Also report any problems with the schedule below as soon as possible.
Meetings for presenting second article
The list of articles is here. The deadline for selecting the article to present at the second meeting is 1.3.2004. For the second presentation, there is one article per participant, but team work is possible by having the members of the team select articles from the same subject area. The maximum team size is three persons. A team makes a joint presentation, where the results and conclusions of the selected articles are compared against each other.
Every article has one or more "major messages" that the author tries to express; try to identify them as well as possible. Also explain the experimental tasks: the characteristics of each task, what methods are used, how good the results are and what conclusions are drawn from them. Finally, express your own opinion on the work, i.e. how understandable the article is, whether it makes you want to search further, what the open issues are, etc.
Count about 30 minutes for the presentation. About 15 minutes are reserved for questions and discussion, but this time limit is intended to be flexible. At this presentation, all participants are also expected to show at least a table of contents for the seminar report.
It is, of course, desirable that all participants attend all meetings. However, with the current schedule of six meetings, only four meetings are compulsory. Send a message with a justified explanation if you can only attend three meetings. Also report any problems with the schedule below as soon as possible.
Seminar report
Deadline: May 2004, unless otherwise agreed with the lecturer. The intention of the seminar report is mainly to show in what way the students have learned the subjects of the seminar. Its role is therefore twofold:
One possible structure for the report could be:
Recommended length is about 5 pages. Send the seminar report to Kary Främling.
This page is maintained by Kary Främling. Last updated on April 23rd, 2004.