rtpl

CMPUT 653 Fall 2021 Real-Time Policy Learning

This project is maintained by armahmood

CMPUT 653 Real-Time Policy Learning

Syllabus

Term:

Fall, 2021

Lecture Date and Time:

MW 11:00 a.m. - 12:20 p.m.

Lecture Location:

NRE 2-127

Instructor:

Rupam Mahmood (armahmood@ualberta.ca)

Overview

When the input-output interface of a robot is determined, can we just deploy a general-purpose system for controlling the robot without extensive hand-engineering? Agents based on policy gradient methods are a candidate for such systems. However, they behave differently on a real robot than they do in standard simulations. In this course, we learn the foundations of policy gradient methods and focus on characterizing the differences between simulated and real-time policy learning. While discussing recent papers on policy gradient methods, we will scrutinize them in light of computational frugality and compatibility with real-time updates.

Objectives

Prerequisites

This course requires knowledge of basic probability theory, linear algebra, and introductory reinforcement learning, as well as experience programming deep neural networks using PyTorch in Python 3.

Course Topics

Course Work and Evaluation

Course Materials

Deep Policy Gradient Methods is a similar course given in Fall 2020. All course reading material will be available online. We will be using the following textbook extensively: Sutton and Barto, Reinforcement Learning: An Introduction, MIT Press. The book is available from the bookstore or online as a PDF here: http://www.incompleteideas.net/book/the-book-2nd.html

Academic Integrity

All assignments, written and programming, are to be done individually. No exceptions. Students must write their own answers and code. Students are permitted and encouraged to discuss assignment problems and the contents of the course. However, the discussion should always be about high-level ideas. Students should not discuss with each other (or tutors) while writing answers to written questions or programming. Absolutely no sharing of answers or code with other students or tutors. All sources used in solving a problem must be acknowledged, e.g. web sites, books, research papers, personal communication with people, etc. The University of Alberta is committed to the highest standards of academic integrity and honesty. Students are expected to be familiar with these standards regarding academic honesty and to uphold the policies of the University in this respect. Students are particularly urged to familiarize themselves with the provisions of the Code of Student Behaviour and avoid any behaviour which could potentially result in suspicions of cheating, plagiarism, misrepresentation of facts and/or participation in an offence. Academic dishonesty is a serious offence and can result in suspension or expulsion from the University. (GFC 29 SEP 2003)