Seminar02072019

Tuesday, 2nd of July

14h30 (room R2014, 660 building) (see location)

Reda Alami

(Orange / LRI)

Memory bandits for decision-making in dynamic environments. Application to 5G optimization.

In this talk, we build the next generation of multi-armed bandits for non-stationary environments. We call them Memory Bandits. They combine a MAB solver (Thompson Sampling, KL-UCB, Bayes-UCB, ...) with the Bayesian online change-point detector. We also present a modified version of this detector that is easier to analyze in terms of false alarms and detection delay. Then, we present two industrial applications of multi-armed bandits in the context of 5G optimization. Finally, we introduce the decentralized exploration problem in the multi-armed bandit paradigm, with a first generic solution called decentralized elimination.
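For readers who have not met the idea before, the following is a minimal sketch of how a Thompson Sampling solver might be coupled with a Bayesian online change-point detector on Bernoulli rewards. It is an illustration under stated assumptions, not the speaker's algorithm: the class name `ChangeAwareArm`, the constant `hazard` rate, the `max_experts` pruning rule and the two-arm simulation in `run` are all hypothetical choices made for this example.

```python
import random


class ChangeAwareArm:
    """One Bernoulli arm with a run-length posterior over Beta 'experts'.

    Illustrative only: hazard and max_experts are assumed values.
    """

    def __init__(self, hazard=0.02, max_experts=50):
        self.hazard = hazard            # assumed constant change probability per pull
        self.max_experts = max_experts  # assumed cap on stored run-length hypotheses
        self.weights = [1.0]            # posterior over run lengths (index 0 = fresh start)
        self.alphas = [1.0]             # Beta(alpha, beta) posterior for each run length
        self.betas = [1.0]

    def sample_mean(self, rng):
        # Draw a run-length hypothesis by its posterior weight,
        # then Thompson-sample a mean from that hypothesis.
        i = rng.choices(range(len(self.weights)), weights=self.weights)[0]
        return rng.betavariate(self.alphas[i], self.betas[i])

    def update(self, reward):
        # Predictive probability of the observed reward under each hypothesis.
        likes = [(a if reward else b) / (a + b)
                 for a, b in zip(self.alphas, self.betas)]
        joint = [w * l for w, l in zip(self.weights, likes)]
        change = self.hazard * sum(joint)             # mass moved to "a change just occurred"
        grown = [(1.0 - self.hazard) * j for j in joint]  # mass for "the run continues"
        total = change + sum(grown)
        self.weights = [change / total] + [g / total for g in grown]
        self.alphas = [1.0] + [a + reward for a in self.alphas]
        self.betas = [1.0] + [b + (1 - reward) for b in self.betas]
        if len(self.weights) > self.max_experts:
            # Drop the least likely hypothesis, never the fresh one at index 0.
            j = min(range(1, len(self.weights)), key=lambda k: self.weights[k])
            for seq in (self.weights, self.alphas, self.betas):
                del seq[j]
            s = sum(self.weights)
            self.weights = [w / s for w in self.weights]


def run(horizon=2000, switch=1000, seed=0):
    """Simulate two arms whose means swap at step `switch` (toy environment)."""
    rng = random.Random(seed)
    arms = [ChangeAwareArm(), ChangeAwareArm()]
    pulls = []
    for t in range(horizon):
        means = (0.9, 0.1) if t < switch else (0.1, 0.9)
        k = max(range(len(arms)), key=lambda i: arms[i].sample_mean(rng))
        reward = 1 if rng.random() < means[k] else 0
        arms[k].update(reward)
        pulls.append(k)
    return arms, pulls


if __name__ == "__main__":
    _, pulls = run()
    late = pulls[-500:]
    print("share of pulls on arm 1 in the last 500 steps:", sum(late) / len(late))
```

Because each arm keeps weight on a "fresh start" hypothesis, a run of surprising rewards shifts the posterior toward short run lengths and the sampler forgets stale statistics. A plain Thompson Sampling posterior would take far longer to move away from the formerly best arm.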



Contact: guillaume.charpiat at inria.fr
All TAU seminars: here


Contributors to this page: guillaume .
Page last modified on Thursday 27 of June, 2019 15:32:27 CEST by guillaume.