

Principal Seminar of the Department of Probability Theory, Moscow State University
October 19, 2011 16:45, Moscow, MSU, auditorium 1624






Robust Parallel Control in a Random Environment (the Two-Armed Bandit Problem)
A. V. Kolnogorov, Novgorod State University

Abstract:
The problem of expedient behavior in a stationary environment, also well known as the two-armed bandit problem, is considered in a robust (minimax) setting. The minimax strategy and minimax risk are found as the Bayes strategy and Bayes risk corresponding to the worst-case prior distribution. For environments whose incomes are normally distributed with unit variance and with expectations that depend only on the applied alternative, this prior distribution can be chosen to be symmetric and asymptotically uniform.
A parallel control strategy is proposed that provides control arbitrarily close to optimal. An invariant recurrence equation is obtained for finding the minimax strategy and minimax risk by the dynamic programming method. This makes it possible to improve W. Vogel's well-known estimates of the minimax risk. Numerical analysis shows that the strategy performs well in stationary environments whose distributions differ from normal, e.g. in binary Bernoulli environments.
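
For readers unfamiliar with the setting, the following is a minimal illustrative sketch of the minimax viewpoint, not the strategy from the talk. It assumes a simple explore-then-commit rule as a stand-in strategy, measures loss as the expected income forgone relative to always applying the better alternative, and approximates the worst case over a small hypothetical grid of symmetric expectations (+gap/2, -gap/2). All function names, the grid, and the parameter values are assumptions made for illustration.

```python
import random


def simulate_loss(m1, m2, horizon, explore, rng):
    """Expected-income loss of one run of an explore-then-commit rule.

    Incomes are normal with unit variance and expectations m1, m2.
    NOTE: explore-then-commit is an illustrative stand-in, not the
    parallel strategy described in the abstract.
    """
    if horizon < 2 * explore:
        raise ValueError("horizon must be at least 2 * explore")
    means = (m1, m2)
    totals = [0.0, 0.0]
    # Exploration: apply each alternative `explore` times.
    for arm in (0, 1):
        for _ in range(explore):
            totals[arm] += rng.gauss(means[arm], 1.0)
    # Commit to the alternative with the larger observed total income.
    chosen = 0 if totals[0] >= totals[1] else 1
    best = max(means)
    # Loss counts the expectation forgone at every step.
    loss = explore * ((best - m1) + (best - m2))
    loss += (horizon - 2 * explore) * (best - means[chosen])
    return loss


def estimated_risk(m1, m2, horizon, explore, runs, seed=0):
    """Monte Carlo estimate of the expected loss for fixed (m1, m2)."""
    rng = random.Random(seed)
    total = sum(simulate_loss(m1, m2, horizon, explore, rng)
                for _ in range(runs))
    return total / runs


def worst_case_risk(horizon, explore, gaps, runs, seed=0):
    """Largest estimated risk over a hypothetical grid of symmetric gaps."""
    return max(estimated_risk(g / 2, -g / 2, horizon, explore, runs, seed)
               for g in gaps)


if __name__ == "__main__":
    gaps = [0.1, 0.25, 0.5, 1.0, 2.0]  # illustrative grid only
    print(worst_case_risk(horizon=100, explore=10, gaps=gaps, runs=500))
```

The maximum over the grid is only a crude proxy for the supremum over all environments that defines the minimax risk; the abstract's approach instead obtains this quantity as the Bayes risk for the worst-case prior.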
