Reinforcement Learning Algorithms for MDPs
Description
Technical report TR09-13. This article presents a survey of reinforcement learning algorithms for Markov Decision Processes (MDPs). The first half of the article considers the problem of value estimation: we begin by describing the idea of bootstrapping and temporal difference learning, then compare incremental and batch algorithmic variants and discuss how the choice of the function approximation method influences the success of learning. The second half describes methods that target the problem of learning to control an MDP: online and active learning are discussed first, followed by a description of direct and actor-critic methods.
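To illustrate the bootstrapping idea of temporal difference learning mentioned in the abstract, the following is a minimal sketch of tabular TD(0) value estimation on a small random-walk chain. The environment (a hypothetical five-state chain with reward 1 at the right terminal) and all parameter values are illustrative assumptions, not taken from the report.

```python
import random

def td0_value_estimation(num_states=5, episodes=200, alpha=0.1, gamma=0.9, seed=0):
    """TD(0) on an illustrative random-walk chain: states 0..num_states-1,
    terminating with reward 1 past the right end and reward 0 past the left."""
    rng = random.Random(seed)
    V = [0.0] * num_states  # one value estimate per non-terminal state
    for _ in range(episodes):
        s = num_states // 2  # each episode starts in the middle state
        while True:
            s_next = s + rng.choice([-1, 1])
            if s_next < 0:               # left terminal: reward 0
                r, v_next, done = 0.0, 0.0, True
            elif s_next >= num_states:   # right terminal: reward 1
                r, v_next, done = 1.0, 0.0, True
            else:
                r, v_next, done = 0.0, V[s_next], False
            # Bootstrapped TD(0) update: move V[s] toward r + gamma * V(s')
            V[s] += alpha * (r + gamma * v_next - V[s])
            if done:
                break
            s = s_next
    return V

values = td0_value_estimation()
```

The key point is that each update uses the current estimate `V[s_next]` as a stand-in for the unknown return, which is what distinguishes temporal difference learning from Monte-Carlo value estimation.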
Item Type
http://purl.org/coar/resource_type/c_93fc
Subject/Keywords
Artificial Intelligence
Monte-Carlo methods
Reinforcement learning
Actor-critic methods
Stochastic approximation
Markov decision processes
Active learning
Overfitting
Least-squares methods
Temporal difference learning
Simulations
Policy gradient
Two-timescale stochastic approximation
Q-learning
Online learning
Function approximation
Natural gradient
PAC-learning
Machine Learning
Planning
Stochastic gradient methods
Simulation optimization
Bias-variance tradeoff
Language
en
