Books
- Reinforcement Learning: An Introduction, Richard S. Sutton & Andrew G. Barto, 2017 version
- Reinforcement Learning course by Rémi Munos
Videos
Nov. 25th
- Introduction
- Part I, Introduction Slides_Intro_RL.pdf
- Part II, Markov Decision Process Slides_MDP_RL.pdf
Dec. 2nd
- Multi-Armed Bandits and applications
- Slides Slides_MAB_RL.pdf
- Practical session Q-Learning, Value iteration, policy iteration
Dec. 9th
- Multi-Armed Bandits and applications
- Practical session Bandits
Dec. 16th
- Multi-Armed Bandits, Monte-Carlo Tree Search and applications
- slides: see Dec. 2nd.
- Practical session MCTS on Minesweeper
Jan. 6th
- Continuous approximation
- slides Cours_RL_2020_FA.pdf
- Direct policy search
- slides Slides_Continuous_RL.pdf
Jan. 13th
- Continuous approximation + Direct policy search
Oral seminar - Building 660, Amphi Shannon (Monday, January 20th, 2:00 pm - 5:00 pm)
2:00 Dhiaeddoine Youssfi & Wafa Bouzouita: Deep Reinforcement Learning with Double Q-learning
2:20 Nicolas DEVATINE & Alban PETIT: Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
2:40 Ziheng LI & Xinneng XU: The Predictron: End-To-End Learning and Planning
3:00 Clément Veyssière & Eric Wang: Recent Advances in Imitation Learning from Observation
3:20 break
3:40 Geremy Hutin: Skew-Fit: State-Covering Self-Supervised Reinforcement Learning
4:00 Matthieu Charreire, Adrien Lefebvre & Sarah Aamiri (note: 22-minute talk, as you are a group of three): Monte-Carlo Tree Search for Policy Optimization
4:30 Baptiste Merliot & Matthieu Nogatchewsky: Policy Improvement: Between Black-Box Optimization and Episodic Reinforcement Learning
4:50 Simon Monteiro: Watch the Unobserved: A Simple Approach to Parallelizing Monte Carlo Tree Search
5:10 Florian Bertelli & Ramine Hamidi: Neural Program Synthesis By Self-Learning
Deep RL: from Atari to Go and beyond
- Human-level control through deep reinforcement learning Corentin Leloup & Maxime CHOR
- Deep Reinforcement Learning with Double Q-learning
- Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm Nicolas DEVATINE & Alban PETIT
- The Predictron: End-To-End Learning and Planning Ziheng LI & Xinneng XU
Transfer RL
- Off-Policy Actor-Critic with Shared Experience Replay
- MULTIPOLAR: Multi-Source Policy Aggregation for Transfer Reinforcement Learning between Diverse Environmental Dynamics
Imitation
- State Alignment-based Imitation Learning
- Recent Advances in Imitation Learning from Observation Clément Veyssière & Eric Wang
- Imitation Learning from Observations by Minimizing Inverse Dynamics Disagreement
Rewards
- Skew-Fit: State-Covering Self-Supervised Reinforcement Learning Geremy Hutin
- Learning to solve the credit assignment problem
- Ranking Policy Gradient
Optimization for RL
- Monte-Carlo Tree Search for Policy Optimization Matthieu Charreire, Adrien Lefebvre & Sarah Aamiri
- Policy Improvement: Between Black-Box Optimization and Episodic Reinforcement Learning Baptiste Merliot & Matthieu Nogatchewsky
- Watch the Unobserved: A Simple Approach to Parallelizing Monte Carlo Tree Search
- Samples Are Useful? Not Always: denoising policy gradient updates using variance explained
Other
- Neural Program Synthesis By Self-Learning Florian Bertelli & Ramine Hamidi