CS 533 - Intelligent Agents and Decision Making
Winter 2009


Overview

In this course we will study models and algorithms for automated planning and decision making. The content will be divided into three main sections. First, we will study classical STRIPS planning, where the environment is assumed to be deterministic and fully observable. We will cover the basic algorithmic paradigms, including partial-order planning, constraint-based planning (GraphPlan and Satplan), and heuristic-search planning. Next, we will study planning in the context of Markov decision processes (MDPs), where the environment is allowed to be stochastic. We will cover the basic theory and algorithms for explicit state-space MDPs. In addition, we will discuss exact and approximate techniques for factored state-space MDPs. Finally, we will study the basic theory and algorithms for reinforcement learning, where the agent is not given a model of the environment but must instead learn to act in the world via exploration.

Learning Objectives of the Course:

1. Understand the primary paradigms and algorithms for propositional STRIPS planning.

2. Understand the basic theory and definitions of Markov decision processes.

3. Understand the basic algorithms for solving explicit state-space MDPs: value iteration, policy iteration, and linear programming.

4. Understand sampling-based techniques for Markov decision processes.

5. Understand algorithms for solving factored Markov decision processes.

6. Understand basic reinforcement learning algorithms for explicit state spaces, both value-based and policy-search-based.

Exams

There will be two exams:
Exam I (midterm): TBA
Final exam: TBA

Assignments

The assignments in this course will consist of written problem sets, two mini-projects, and a final project.

Grades

The final grade will be calculated as follows: midterm 20%, final 30%, final project 20%, problem sets and mini-projects 30%. In all cases, however, the judgment of the instructor determines the final grade. In particular, class participation can affect a student's final grade.

Honor Code

Collaboration on assignment problems is permitted; copying of solutions or code is not. The work you hand in should be your own. Students should indicate the names of all collaborators on their homework. While some students find studying together to be quite beneficial and enjoyable, I strongly encourage you to attempt to solve homework problems on your own first, as this is the only way to ensure that you have mastered the material. Generating solutions is very different from verifying solutions.