Decision Making and Reinforcement Learning
About this Course
This course is an introduction to sequential decision making and reinforcement learning. We start with a discussion of utility theory to learn how preferences can be represented and modeled for decision making. We then model simple decision problems as multi-armed bandit problems and discuss several approaches to evaluating feedback. Next, we model decision problems as finite Markov decision processes (MDPs) and discuss their solutions via dynamic programming algorithms. We touch on the notion of partial observability in real problems, modeled by POMDPs and solved by online planning methods. Finally, we introduce the reinforcement learning problem and discuss two paradigms: Monte Carlo methods and temporal difference learning. We conclude the course by noting how the two paradigms lie on a spectrum of n-step temporal difference methods. An emphasis on algorithms and examples is a key part of this course.
Created by: Columbia University
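To give a flavor of the dynamic programming topic mentioned above, here is a minimal value-iteration sketch for a toy finite MDP. The states, actions, transition probabilities, rewards, and discount factor are all made-up placeholders for illustration, not material from the course itself.

```python
import numpy as np

# Toy MDP: 4 states, 2 actions, randomly generated dynamics (illustrative only).
n_states, n_actions = 4, 2
gamma = 0.9   # discount factor
theta = 1e-8  # convergence threshold

rng = np.random.default_rng(0)
# P[s, a, s'] = transition probability, R[s, a] = expected immediate reward
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))

V = np.zeros(n_states)
while True:
    # Bellman optimality backup: Q(s, a) = R(s, a) + gamma * sum_s' P(s, a, s') V(s')
    Q = R + gamma * (P @ V)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < theta:
        break
    V = V_new

policy = Q.argmax(axis=1)  # greedy policy with respect to the converged values
print("State values:", V)
print("Greedy policy:", policy)
```

The loop repeatedly applies the Bellman optimality backup until the value estimates stop changing, then reads off a greedy policy; this is one of the standard dynamic programming solutions for finite MDPs that the course discusses.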