
Seminar24022015

February 24th

14:30, R2014 Digiteo Shannon (660) (see location):


Madalina Drugan (Vrije Universiteit Brussel, Belgium)




Title : Multi-objective multi-armed bandits


Abstract :


The multi-objective multi-armed bandit (MOMAB) paradigm extends
multi-armed bandits (MAB) from scalar rewards to reward vectors. MOMAB
differs from standard MAB in important ways, since several arms can be
optimal according to their reward tuples. Techniques from multi-objective
optimisation are used to create MOMAB algorithms with an efficient
exploration/exploitation trade-off for complex and large multi-objective
stochastic environments. Theoretical analysis is an important aspect of
MAB, which is a simplified theoretical framework of reinforcement learning
with a single state. We give an overview of the MOMAB algorithms, their
analysis and the corresponding experimental methodology.
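
To illustrate why several arms can be optimal at once, here is a minimal sketch that assumes Pareto dominance as the order relation on reward vectors (the abstract does not say which relation the talk uses). The function names and the mean reward vectors are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """True if u Pareto-dominates v: at least as good in every objective
    and strictly better in at least one (higher is better)."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def pareto_optimal_arms(means: List[Sequence[float]]) -> List[int]:
    """Indices of arms whose mean reward vector is not dominated by any other arm."""
    return [
        i
        for i, u in enumerate(means)
        if not any(dominates(v, u) for j, v in enumerate(means) if j != i)
    ]


# Hypothetical mean reward vectors for four arms with two objectives.
means = [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9), (0.4, 0.4)]
front = pareto_optimal_arms(means)
print(front)  # [0, 1, 2]
```

Under this assumption the first three arms are mutually incomparable and all optimal, while the last arm is dominated by (0.5, 0.5). A single-objective bandit, by contrast, would typically have one best arm.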


Contact: cyril.furtlehner at inria.fr


Contributors to this page: furtlehn .
Page last modified on Friday 06 of March, 2015 10:17:29 CET by furtlehn.