|
Problemy Peredachi Informatsii, 2022, Volume 58, Issue 2, Pages 66–91 DOI: https://doi.org/10.31857/S0555292322020065
(Mi ppi2369)
|
|
|
|
Automata Theory
Poissonian two-armed bandit: a new approach
A. V. Kolnogorov Yaroslav-the-Wise Novgorod State University
DOI:
https://doi.org/10.31857/S0555292322020065
Abstract:
We consider a new approach to the continuous-time two-armed bandit problem in
which incomes are described by Poisson processes. For this purpose, first, the control horizon
is divided into equal consecutive half-intervals in which the strategy remains constant, and
the incomes arrive in batches corresponding to these half-intervals. For finding the optimal
piecewise constant Bayesian strategy and its corresponding Bayesian risk, a recursive difference
equation is derived. The existence of a limiting value of the Bayesian risk when the number of
half-intervals grows infinitely is established, and a partial differential equation for finding it is
derived. Second, unlike previously considered settings of this problem, we analyze the strategy
as a function of the current history of the controlled process rather than of the evolution of
the posterior distribution. This removes the requirement of finiteness of the set of admissible
parameters, which was imposed in previous settings. Simulation shows that in order to find the
Bayesian and minimax strategies and risks in practice, it is sufficient to partition the arriving
incomes into 30 batches. In the case of the minimax setting, it is shown that optimal processing
of arriving incomes one by one is not more efficient than optimal batch processing if the control
horizon grows infinitely.
Keywords:
Poissonian two-armed bandit, Bayesian and minimax approaches, asymptotic main
theorem of the game theory, batch processing.
Received: 31.05.2021 Revised: 09.04.2022 Accepted: 18.04.2022
Citation:
A. V. Kolnogorov, “Poissonian two-armed bandit: a new approach”, Probl. Peredachi Inf., 58:2 (2022), 66–91; Problems Inform. Transmission, 58:2 (2022), 160–183
Linking options:
https://www.mathnet.ru/eng/ppi2369 https://www.mathnet.ru/eng/ppi/v58/i2/p66
|
|