|
This article is cited in 9 scientific papers (total in 10 papers)
Automata Theory
Gaussian two-armed bandit: limiting description
A. V. Kolnogorov Department of Applied Mathematics and Information Science,
Yaroslav-the-Wise Novgorod State University, Novgorod, Russia
Abstract:
For a Gaussian two-armed bandit, which arises when batch data processing is
analyzed, the minimax risk limiting behavior is investigated as the control horizon $N$ grows
infinitely. The minimax risk is searched for as the Bayesian one computed with respect to
the worst-case prior distribution. We show that the highest requirements are imposed on the
control in the domain of “close” distributions where mathematical expectations of incomes
differ by a quantity of the order of $N^{-1/2}$. In the domain of “close” distributions, we obtain a
recursive integro-difference equation for finding the Bayesian risk with respect to the worst-case
prior distribution, in invariant form with control horizon one, and also a second-order partial
differential equation in the limiting case. The results allow us to estimate the performance of
batch processing. For example, the minimax risk corresponding to batch processing of data
partitioned into $50$ batches can be only $2\%$ greater than its limiting value when the number of
batches grows infinitely. In the case of a Bernoulli two-armed bandit, we show that optimal
one-by-one data processing is not more efficient than batch processing as $N$ grows infinitely.
Keywords:
Gaussian two-armed bandit, minimax and Bayesian approaches, batch processing,
asymptotic minimax theorem.
Received: 06.04.2020 Revised: 02.06.2020 Accepted: 02.06.2020
Citation:
A. V. Kolnogorov, “Gaussian two-armed bandit: limiting description”, Probl. Peredachi Inf., 56:3 (2020), 86–111; Problems Inform. Transmission, 56:3 (2020), 278–301
Linking options:
https://www.mathnet.ru/eng/ppi2323 https://www.mathnet.ru/eng/ppi/v56/i3/p86
|
|