2017 Module Reinforcement Learning, Michele Sebag, Diviyan Kalainathan, Laurent Cetinsoy


February 21, 2018, Shannon amphitheater, building 660
February 23, 2018, room 2014, 2nd floor, Shannon building, 660


Last year's exam


Todo: code, experiments, analysis, written report.

Copying and pasting existing programs from the Internet will have consequences, which may include a mark of 0.

Projects involve at most 3 students except for the last two (Halite & Alesia: 4 students).
Projects are due on February 15th, 23:59 GMT+1.

Each group must produce:
  1. A report of about 2 pages (at most 3 pages, excluding references), as TeX and .pdf files, in the ICML 2017 format. It must include a description of the approach, the results, and a comparison with other algorithms or the state of the art (when possible). Students who cannot write TeX may submit a .doc(x) document together with its .pdf. (Description | ICML2017 TeX package)
  2. The code of your implemented approach. This code should work "out of the box": add a notice/README listing the required packages/libraries, with special notes if needed. Submitting code taken from the Internet with little or no modification could lead to unwanted consequences.

You can discuss your project's problems and ideas, and ask for more information, at: diviyan (at) lri (dot) fr

The subjects are the following (in order of increasing difficulty):
  1. Mountain car problem (compare two approaches)
  2. Inverted pendulum (compare two representations of the problem)
  3. The acrobot
  4. Octopus
  5. TD-Gammon
  6. Bicycle: balancing + moving forward
  7. Anti-Imitation Policy learning: reproduce an experiment from mainDIVA.pdf
  8. halite.io
  9. Alesia game (see Approximate Dynamic Programming for Two-Player Zero-Sum Markov Games, ICML 15); Alesia_game.zip


  1. Video: Richard Sutton, 2016, https://www.microsoft.com/en-us/research/video/tutorial-introduction-to-reinforcement-learning-with-function-approximation/
  2. Some videos of the Boston Dynamics group


  1. Written exam, February 12
  2. Graded lab sessions
  3. Projects

Nov. 13: Michele Sebag

Nov. 20: MS + DK

Nov. 27: DK

Dec. 4: no class

Dec. 11: MS

Jan. 8: MS + DK

Jan. 15

Jan. 24

Date to be set: paper presentations

  1. Neural Optimizer Search with Reinforcement Learning, ICML 2017
  2. Boosted Fitted Q-Iteration
  3. Constrained Policy Optimization
  4. Curiosity-driven Exploration by Self-supervised Prediction, Pauline Brunet and Quentin Bouchut
  5. The K-armed Dueling Bandits Problem, Zizhao Li and Xudong Zhang
  6. DARLA: Improving Zero-Shot Transfer in Reinforcement Learning
  7. Coordinated Multi-Agent Imitation Learning, Ghiles SIDI SAID and Amine BIAD
  8. Local Bayesian Optimization of Motor Skills
  9. Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning, Thomas Gauthier and Pereira Abou Rejaili Rodrigo
  10. Designing Neural Network Architectures using Reinforcement Learning, Mohamed Ali Dargouth and Walid Belrhalmia
  11. Robot gains Social Intelligence through Multimodal Deep Reinforcement Learning, Eden Belouadah
  12. Abstraction Selection in Model-based Reinforcement Learning
  13. Universal Value Function Approximators
  14. Deterministic Policy Gradient Algorithms
  15. Dynamic Programming Boosting for Discriminative Macro-Action Discovery

Deep RL: DQN, AlphaGo Zero, AlphaZero

  1. Playing Atari with Deep Reinforcement Learning, Vincent Boyer and Ludovic Kun
  2. [|], Zhengying Liu
  3. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm, Thomas Foltete and Guillaume Collin
  4. Deep Reinforcement Learning from Self-Play in Imperfect-Information Games, Adrien Pavao and Eleonor Bartenlian

Contributors to this page: sebag .
Page last modified on Friday 23 of February, 2018 11:09:16 CET by sebag.