Deep Reinforcement Learning

General Information

Suggested prerequisites: Mathematics I to IV, Neurocomputing, basic knowledge of Python.

Exam: written examination (90 minutes), 5 ECTS.

Contact: julien dot vitay at informatik dot tu-chemnitz dot de.

Language: English. The exam can of course be done in German.

Exam WS 2019-2020

Consultation on 12.2.20 at 13:45 (room 368a).

Oral exams on 19.2.20 and 20.2.20 (room 348).

Guidelines to prepare for the exam: (pdf).


The course dives into the field of deep reinforcement learning. It starts with the basics of reinforcement learning (Sutton and Barto, 2017) before explaining modern model-free architectures (DQN, DDPG, PPO) that use deep neural networks for function approximation. More "exotic" forms of RL are then presented (successor representations, hierarchical RL, inverse RL, etc.).

The different algorithms presented during the lectures will be studied in more detail during the exercises, through implementations in Python.

The preliminary plan of the course is:

  1. Reinforcement Learning (MDP, dynamic programming, Monte-Carlo methods, temporal difference)
  2. Value-based deep RL (DQN)
  3. Policy gradient methods (A3C, DDPG, TRPO, PPO)
  4. Model-based RL (Dyna-Q, AlphaGo, I2A)
  5. Successor representations
  6. Hierarchical RL
  7. Inverse RL
  8. Multi-agent RL
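
Point 1 of the plan (temporal difference methods) can be illustrated with a minimal sketch of tabular Q-learning. The environment below is a made-up 5-state chain, and the hyperparameters (alpha, gamma, epsilon) are illustrative choices, not values used in the course:

```python
import numpy as np

# Tabular Q-learning on a toy 5-state chain: actions 0 (left) and 1 (right);
# reaching the rightmost state yields reward 1 and ends the episode.
n_states, n_actions = 5, 2
alpha, gamma, epsilon = 0.1, 0.9, 0.1
rng = np.random.default_rng(42)
Q = np.zeros((n_states, n_actions))

def step(s, a):
    """Deterministic transition in the chain."""
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s_next == n_states - 1 else 0.0
    return s_next, reward, s_next == n_states - 1

for episode in range(500):
    s = 0
    for _ in range(100):  # cap episode length
        # epsilon-greedy action selection, breaking ties randomly
        if rng.random() < epsilon:
            a = int(rng.integers(n_actions))
        else:
            a = int(rng.choice(np.flatnonzero(Q[s] == Q[s].max())))
        s_next, r, done = step(s, a)
        # Q-learning update: bootstrap from the greedy value of the next state
        target = r + (0.0 if done else gamma * Q[s_next].max())
        Q[s, a] += alpha * (target - Q[s, a])
        s = s_next
        if done:
            break
```

After training, the greedy policy derived from Q moves right along the chain towards the rewarded state.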



FAQ

  • How do I register for the course?
    You can register on OPAL:
  • How do I register for the exam?
    Registration on SBService takes place in December. Only registered students can take the exam.
  • I cannot attend the exercises. Can I still take the exam?
    Yes. The exercises are there to help you understand the concepts from the lectures and gain practical experience with neural networks, but they are not mandatory for the exam.
  • Do I have to memorize all these equations?
    No, but you do have to understand them, which is almost the same thing.


Lectures

  1. Introduction
    1. Introduction to reinforcement learning (html, pdf)
    2. Statistics (html, pdf)
  2. Tabular Reinforcement Learning
    1. Evaluative Feedback (html, pdf)
    2. Markov Decision Processes (html, pdf)
    3. Dynamic Programming (html, pdf)
    4. Monte-Carlo and temporal difference (html, pdf)
    5. Function approximation (html, pdf)
  3. Model-free deep RL
    1. Deep learning basics (html, pdf)
    2. Value-based methods (DQN) (html, pdf)
    3. Policy Gradient (html, pdf)
    4. Advantage Actor-Critic (A3C) (html, pdf)
    5. Deterministic Policy Gradient (DDPG) (html, pdf)
    6. Natural gradients (TRPO, PPO) (html, pdf)
  4. Model-based deep RL
    1. Model-based RL (html, pdf)
    2. AlphaGo (html, pdf)
  5. Outlook (html, pdf)
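
The policy-gradient lectures above can be previewed with a minimal sketch: REINFORCE on a 2-armed bandit with a softmax policy over action preferences. The arm means, learning rate, and number of steps are made-up values for the demo, not material from the slides:

```python
import numpy as np

# REINFORCE / gradient-bandit sketch: a softmax policy over preferences h
# is updated along the reward-weighted gradient of log pi.
rng = np.random.default_rng(0)
true_means = np.array([0.0, 1.0])   # arm 1 is better on average
h = np.zeros(2)                     # action preferences (policy parameters)
lr = 0.1

for t in range(2000):
    pi = np.exp(h - h.max())
    pi /= pi.sum()                      # softmax policy
    a = int(rng.choice(2, p=pi))
    r = true_means[a] + rng.normal()    # noisy reward
    # grad of log pi(a) w.r.t. h is one_hot(a) - pi
    grad_log_pi = -pi
    grad_log_pi[a] += 1.0
    h += lr * r * grad_log_pi
```

After training, the policy puts most of its probability on the better arm; the lectures extend this idea to parameterized policies with baselines (A3C, DDPG, TRPO, PPO).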


Exercises

Instructions to set up a virtual environment for Python: pdf

  1. Introduction to Python and NumPy: notebook (zip), solution (zip)
  2. n-armed bandits: notebook (zip), solution (zip)
  3. Dynamic programming: notebook (zip), solution (zip)
  4. Gym environments: notebook (zip), solution (zip)
  5. Monte-Carlo control: notebook (zip), solution (zip)
  6. Q-learning: notebook (zip), solution (zip)
  7. Introduction to Keras: notebook (zip), solution (zip)
  8. DQN: notebook (zip), solution (zip)
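
As a taste of the second exercise, here is a minimal sketch of an n-armed bandit with epsilon-greedy action selection and incremental sample-average value estimates. The number of arms, the reward distributions, and epsilon are illustrative choices, not taken from the notebook:

```python
import numpy as np

# 10-armed bandit: each arm returns a noisy reward around a fixed mean.
rng = np.random.default_rng(1)
true_means = rng.normal(0.0, 1.0, size=10)
Q = np.zeros(10)      # estimated value of each arm
N = np.zeros(10)      # number of pulls per arm
epsilon = 0.1

for t in range(5000):
    if rng.random() < epsilon:
        a = int(rng.integers(10))     # explore a random arm
    else:
        a = int(np.argmax(Q))         # exploit the current best estimate
    r = true_means[a] + rng.normal()  # noisy reward
    N[a] += 1
    Q[a] += (r - Q[a]) / N[a]         # incremental sample-average update
```

The most-pulled arm's value estimate converges to its true mean, which is the core idea behind the evaluative-feedback lecture.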