Presentations
21 February 2018, Shannon amphitheatre, building 660 |
23 February 2018, room 2014, 2nd floor, Shannon building, 660 |
https://docs.google.com/spreadsheets/d/1eidQleMOdpmXbr3tGaKUdhM0avHUGwBQTtW-lh2QrL4/edit?usp=sharing
Last year's exam
- 2016-2017 AIC_RL_Exam_16.pdf
Projects
To do: code, experiments, analysis, written report.
Copy-pasting existing programs from the Net will have consequences, which could include receiving a mark of 0.
Projects involve at most 3 students, except for the last two (Halite & Alesia: 4 students).
Projects are due on February 15th, 23:59 GMT+1.
Each group must produce:
- A report of about 2 pages (max 3 pages excluding references), as TeX and .pdf files, including a description of the approach, the results, and a comparison with other algorithms / the state of the art (when possible), using the ICML 2017 format. People unable to write TeX can produce a .doc(x) document along with its .pdf. ( Description | ICML2017 TeX package )
- The code of your implemented approach. The code should work "out of the box"; add a notice/README listing the required packages/libraries, with special notes if needed. Submitting code taken from the internet with little or no modification could lead to unwanted consequences.
You can discuss your project's problems/ideas and ask for more information at: diviyan (at) lri (dot) fr
The subjects are the following (in increasing difficulty):
- Mountain car problem (compare two approaches)
- Inverted pendulum (compare two representations of the problem)
- The acrobot
- Octopus
- TD-Gammon
- Bicycle: keeping balance + moving forward
- Anti-Imitation Policy learning: reproduce an experiment from mainDIVA.pdf
- halite.io
- Alesia game (see Approximate Dynamic Programming for Two-Player Zero-Sum Markov Games, ICML 15) Alesia_game.zip
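As a rough illustration of the first subject, here is a minimal tabular Q-learning sketch on the classic Mountain Car dynamics (as described in Sutton & Barto). The `MountainCar` class, the discretization grid, and all hyperparameters are our own simplified assumptions, not part of the course material; an actual project would more likely build on an existing simulator (e.g. OpenAI Gym) and would need to compare at least two approaches.

```python
import math
import random

class MountainCar:
    """Self-contained Mountain Car dynamics (Sutton & Barto, Sec. 10.1)."""

    def reset(self):
        # Start near the bottom of the valley with zero velocity.
        self.pos = random.uniform(-0.6, -0.4)
        self.vel = 0.0
        return (self.pos, self.vel)

    def step(self, action):
        # action in {0, 1, 2}: push left / no push / push right
        self.vel += 0.001 * (action - 1) - 0.0025 * math.cos(3 * self.pos)
        self.vel = max(-0.07, min(0.07, self.vel))
        self.pos = max(-1.2, min(0.6, self.pos + self.vel))
        if self.pos == -1.2:          # inelastic left wall
            self.vel = 0.0
        done = self.pos >= 0.5        # goal: reach the right hilltop
        return (self.pos, self.vel), -1.0, done  # -1 reward per step

def discretize(state, bins=20):
    """Map the continuous (position, velocity) state to a grid cell."""
    pos, vel = state
    i = min(bins - 1, int((pos + 1.2) / 1.8 * bins))
    j = min(bins - 1, int((vel + 0.07) / 0.14 * bins))
    return i, j

def q_learning(episodes=200, alpha=0.1, gamma=0.99, eps=0.1, bins=20):
    """Tabular epsilon-greedy Q-learning on the discretized state space."""
    Q = [[[0.0] * 3 for _ in range(bins)] for _ in range(bins)]
    env = MountainCar()
    for _ in range(episodes):
        s = discretize(env.reset(), bins)
        for _ in range(2000):  # cap on episode length
            if random.random() < eps:
                a = random.randrange(3)
            else:
                a = max(range(3), key=lambda x: Q[s[0]][s[1]][x])
            nxt, r, done = env.step(a)
            s2 = discretize(nxt, bins)
            target = r + (0.0 if done else gamma * max(Q[s2[0]][s2[1]]))
            Q[s[0]][s[1]][a] += alpha * (target - Q[s[0]][s[1]][a])
            s = s2
            if done:
                break
    return Q
```

Such a tabular baseline could then be compared against a second approach (e.g. SARSA, or Q-learning with tile-coding function approximation), as the subject requires.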
Pointers
- Video: Richard Sutton, 2016, https://www.microsoft.com/en-us/research/video/tutorial-introduction-to-reinforcement-learning-with-function-approximation/
- Some videos from the Boston Dynamics group
Evaluation
- Written exam, February 12
- Graded lab sessions (TPs)
- Projects
13 Nov., Michele Sebag
- Generalities RL_2017_Cours1.pdf
- Value functions RL_2017_Cours2.pdf
20 Nov., MS + DK
- Value functions (continued); model-free settings RL_2017_Cours3.pdf
27 Nov., DK
4 Dec.: no class
11 Dec., MS
8 Jan., MS + DK
- Multi-armed bandits: revised course RL_2017_Cours4_revised.pdf
15 Jan.
- Lecture: Function Approximation Cours_RL_15_Jan_2018.pdf
24 Jan.
- Lecture by Mehdi Khamassi
- Lecture: Direct Policy Search RL_2017_Cours5.pdf
Date to be fixed: paper presentations
- Neural Optimizer Search with Reinforcement Learning ICML 2017
- Boosted Fitted Q-Iteration
- Constrained Policy Optimization
- Curiosity-driven Exploration by Self-supervised Prediction, Pauline Brunet and Quentin Bouchut
- The K-armed Dueling Bandits Problem, Zizhao Li and Xudong Zhang
- DARLA: Improving Zero-Shot Transfer in Reinforcement Learning
- Coordinated Multi-Agent Imitation Learning, Ghiles SIDI SAID and Amine BIAD
- Local Bayesian Optimization of Motor Skills
- Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning, Thomas Gauthier and Pereira Abou Rejaili Rodrigo
- Designing Neural Network Architectures using Reinforcement Learning, Mohamed Ali Dargouth & Walid Belrhalmia
- Robot gains Social Intelligence through Multimodal Deep Reinforcement Learning, Eden Belouadah
- Abstraction Selection in Model-based Reinforcement Learning
- Universal Value Function Approximators
- Deterministic Policy Gradient Algorithms
- Dynamic Programming Boosting for Discriminative Macro-Action Discovery
Deep RL: DQN, AlphaGo Zero, AlphaZero
- Playing Atari with Deep Reinforcement Learning, Vincent Boyer and Ludovic Kun
- [|], Zhengying Liu
- Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm, Thomas Foltete and Guillaume Collin
- Deep Reinforcement Learning from Self-Play in Imperfect-Information Games, Adrien Pavao and Eleonor Bartenlian