Reinforcement Learning, Michele Sebag and Heri Rakotoarison

Books

Reinforcement Learning: An Introduction, Richard S. Sutton & Andrew G. Barto, 2017 version
Reinforcement Learning course by Rémi Munos

Videos

Nov. 25th

Introduction
- Part I, Introduction: Slides_Intro_RL.pdf
- Part II, Markov Decision Process: Slides_MDP_RL.pdf

Dec. 2nd

Multi-Armed Bandits and applications
- Slides: Slides_MAB_RL.pdf
Practical session: Q-learning, value iteration, policy iteration

Dec. 9th

Multi-Armed Bandits and applications
Practical session: bandits

Dec. 16th

Multi-Armed Bandits, Monte-Carlo Tree Search and applications
- Slides: see Dec. 2nd.
Practical session: MCTS on Minesweeper

Jan. 6th

Continuous approximation
- Slides: Cours_RL_2020_FA.pdf
Direct policy search
- Slides: Slides_Continuous_RL.pdf

Jan. 13th

Continuous approximation + Direct policy search
- Practical session

Oral seminar - Building 660, Amphi Shannon (Monday, January 20th, 2:00-5:00 pm)

2:00 Dhiaeddoine Youssfi & Wafa Bouzouita: Deep Reinforcement Learning with Double Q-learning

2:20 Nicolas DEVATINE & Alban PETIT: Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

2:40 Ziheng LI & Xinneng XU: The Predictron: End-To-End Learning and Planning

3:00 Clément Veyssière & Eric Wang: Recent Advances in Imitation Learning from Observation

3:20 Break

3:40 Geremy Hutin: Skew-Fit: State-Covering Self-Supervised Reinforcement Learning

4:00 Matthieu Charreire, Adrien Lefebvre & Sarah Aamiri (note: 22-minute talk, since you are a group of three): Monte-Carlo Tree Search for Policy Optimization

4:30 Baptiste Merliot & Matthieu Nogatchewsky: Policy Improvement: Between Black-Box Optimization and Episodic Reinforcement Learning

4:50 Simon Monteiro: Watch the Unobserved: A Simple Approach to Parallelizing Monte Carlo Tree Search

5:10 Florian Bertelli & Ramine Hamidi: Neural Program Synthesis By Self-Learning

Deep RL: from Atari to Go and beyond

Human-level control through deep reinforcement learning (Corentin Leloup & Maxime CHOR)
Deep Reinforcement Learning with Double Q-learning
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm (Nicolas DEVATINE & Alban PETIT)
The Predictron: End-To-End Learning and Planning (Ziheng LI & Xinneng XU)

Transfer RL

Imitation

State Alignment-based Imitation Learning
Recent Advances in Imitation Learning from Observation (Clément Veyssière & Eric Wang)
Imitation Learning from Observations by Minimizing Inverse Dynamics Disagreement

Rewards

Optimization for RL

Monte-Carlo Tree Search for Policy Optimization (Matthieu Charreire, Adrien Lefebvre & Sarah Aamiri)
Policy Improvement: Between Black-Box Optimization and Episodic Reinforcement Learning (Baptiste Merliot & Matthieu Nogatchewsky)
Watch the Unobserved: A Simple Approach to Parallelizing Monte Carlo Tree Search
Samples Are Useful? Not Always: denoising policy gradient updates using variance explained

Other

Neural Program Synthesis By Self-Learning (Florian Bertelli & Ramine Hamidi)
