April, Wednesday 25th

14:30 (room 2014, 'Digiteo Shannon' 660 building) (see location )

Joon Kwon


Title: Mirror descent strategies for regret minimization and approachability


First, we present the online linear optimization model, which is a general framework for regret minimization. We construct a family of Mirror Descent strategies with time-varying parameters. We establish regret bounds for those strategies. The time-varying parameters allow to recover as special cases a large number of known strategies: Exponential Weights Algorithm, Smooth Fictitious Play, Vanishingly Smooth Fictious Play, as well as Follow the Perturbed Leader.

Second, we add the assumption that payoff vectors have at most $s$ nonzero components, and aim at determining the optimal regret bounds. We notice that a fundamental difference between gains and losses then appears.

Third, we present a general method for constructing strategies for Blackwell's approachability problem, based on regret minimizing strategies. The unifying character of this approach is illustrated by the construction of optimal strategies for online combinatorial optimization and internal/swap regret minimization. Besides, we establish that Blackwell's strategy is a special case of the family of strategies thus constructed.

Finally, we study the problem of approachability with partial monitoring (i.e. where the decision maker only observes random signals). We establish that the optimal convergence rate is $T^{-1/3}$.

Contact: guillaume.charpiat at inria.fr
All TAU seminars: here

Contributors to this page: guillaume .
Page last modified on Monday 09 of April, 2018 23:25:52 CEST by guillaume.