| Lecture | Topics | Key dates | 
| --- | --- | --- | 
| 1 | Markov Decision Processes   Finite-Horizon Problems: Backwards Induction   Discounted-Cost Problems: Cost-to-Go Function, Bellman's Equation |  | 
| 2 | Value Iteration   Existence and Uniqueness of Bellman's Equation Solution   Gauss-Seidel Value Iteration |  | 
| 3 | Optimality of Policies Derived from the Cost-to-Go Function   Policy Iteration   Asynchronous Policy Iteration | Problem set 1 out | 
| 4 | Average-Cost Problems   Relationship with Discounted-Cost Problems   Bellman's Equation   Blackwell Optimality | Problem set 1 due | 
| 5 | Average-Cost Problems   Computational Methods |  | 
| 6 | Application of Value Iteration to Optimization of Multiclass Queueing Networks   Introduction to Simulation-based Methods   Real-Time Value Iteration | Problem set 2 out | 
| 7 | Q-Learning   Stochastic Approximations |  | 
| 8 | Stochastic Approximations: Lyapunov Function Analysis   The ODE Method   Convergence of Q-Learning |  | 
| 9 | Exploration versus Exploitation: The Complexity of Reinforcement Learning |  | 
| 10 | Introduction to Value Function Approximation   Curse of Dimensionality   Approximation Architectures |  | 
| 11 | Model Selection and Complexity | Problem set 3 out | 
| 12 | Introduction to Value Function Approximation Algorithms   Performance Bounds |  | 
| 13 | Temporal-Difference Learning with Value Function Approximation |  | 
| 14 | Temporal-Difference Learning with Value Function Approximation (cont.) |  | 
| 15 | Temporal-Difference Learning with Value Function Approximation (cont.)   Optimal Stopping Problems   General Control Problems |  | 
| 16 | Approximate Linear Programming | Problem set 4 out | 
| 17 | Approximate Linear Programming (cont.) |  | 
| 18 | Efficient Solutions for Approximate Linear Programming |  | 
| 19 | Efficient Solutions for Approximate Linear Programming: Factored MDPs |  | 
| 20 | Policy Search Methods | Problem set 5 out | 
| 21 | Policy Search Methods (cont.) |  | 
| 22 | Policy Search Methods for POMDPs   Application: Call Admission Control   Actor-Critic Methods |  | 
| 23 | Guest Lecture: Prof. Nick Roy   Approximate POMDP Compression |  | 
| 24 | Policy Search Methods: PEGASUS   Application: Helicopter Control |  |