A. V. Kolnogorov, “Poissonian two-armed bandit: a new approach”, Probl. Peredachi Inf., 58:2 (2022), 66–91; Problems Inform. Transmission, 58:2 (2022), 160–183

Problemy Peredachi Informatsii


Problemy Peredachi Informatsii, 2022, Volume 58, Issue 2, Pages 66–91
DOI: https://doi.org/10.31857/S0555292322020065 (Mi ppi2369)

Automata Theory

Poissonian two-armed bandit: a new approach

A. V. Kolnogorov

Yaroslav-the-Wise Novgorod State University


Abstract: We consider a new approach to the continuous-time two-armed bandit problem in which incomes are described by Poisson processes. First, the control horizon is divided into equal consecutive half-intervals on which the strategy remains constant, and the incomes arrive in batches corresponding to these half-intervals. A recursive difference equation is derived for finding the optimal piecewise constant Bayesian strategy and the corresponding Bayesian risk. We establish that the Bayesian risk has a limiting value as the number of half-intervals grows without bound, and derive a partial differential equation for finding it. Second, unlike in previously considered settings of this problem, we analyze the strategy as a function of the current history of the controlled process rather than of the evolution of the posterior distribution. This removes the requirement, imposed in previous settings, that the set of admissible parameters be finite. Simulation shows that, to find the Bayesian and minimax strategies and risks in practice, it is sufficient to partition the arriving incomes into 30 batches. In the minimax setting, it is shown that optimal one-by-one processing of arriving incomes is not more efficient than optimal batch processing as the control horizon grows without bound.

Keywords: Poissonian two-armed bandit, Bayesian and minimax approaches, asymptotic main theorem of game theory, batch processing.
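
The batch scheme described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the strategy derived in the paper: the names `poisson_count` and `run_batched_bandit` are hypothetical, the arm intensities are arbitrary, and the decision rule (try each arm once, then play the arm with the higher observed income per unit time) is a simple heuristic standing in for the optimal Bayesian or minimax strategy. It only shows the structure: a horizon split into equal batches, a strategy held constant within each batch, and a choice that depends on the current history (cumulative incomes and times) rather than on a posterior distribution.

```python
import random


def poisson_count(rate, duration, rng):
    """Count events of a Poisson process with intensity `rate` on an
    interval of length `duration`, by summing exponential waiting times."""
    if rate <= 0 or duration <= 0:
        return 0
    count = 0
    t = rng.expovariate(rate)
    while t <= duration:
        count += 1
        t += rng.expovariate(rate)
    return count


def run_batched_bandit(rates, horizon, n_batches, rng):
    """Simulate a two-armed Poissonian bandit with a piecewise constant strategy.

    `rates` holds the two (unknown to the player) income intensities.
    The decision rule is an illustrative heuristic, NOT the paper's
    optimal strategy. Requires n_batches >= 2.
    Returns (choices, total_income, loss), where `loss` is the expected
    income forgone relative to always playing the better arm.
    """
    batch_len = horizon / n_batches
    income = [0, 0]          # cumulative income collected from each arm
    time_used = [0.0, 0.0]   # cumulative time spent on each arm
    choices = []
    for k in range(n_batches):
        if k < 2:
            arm = k  # initial exploration: one batch per arm
        else:
            # The history enters only through incomes and times per arm.
            est = [income[a] / time_used[a] for a in (0, 1)]
            arm = 0 if est[0] >= est[1] else 1
        # The chosen arm is held fixed for the whole batch.
        income[arm] += poisson_count(rates[arm], batch_len, rng)
        time_used[arm] += batch_len
        choices.append(arm)
    best = max(rates)
    loss = sum((best - rates[a]) * batch_len for a in choices)
    return choices, sum(income), loss


if __name__ == "__main__":
    rng = random.Random(2022)
    choices, total, loss = run_batched_bandit((1.0, 1.5), 100.0, 30, rng)
    print("arms played per batch:", choices)
    print("total income:", total, "loss:", loss)
```

The call above uses 30 batches only because that is the number the abstract reports as sufficient in practice; the heuristic rule gives no such guarantee, and the intensities `(1.0, 1.5)` and horizon `100.0` are arbitrary illustrative values.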

Funding: Supported in part by the Russian Foundation for Basic Research, project no. 20-01-00062.

Received: 31.05.2021
Revised: 09.04.2022
Accepted: 18.04.2022

English version:
Problems of Information Transmission, 2022, Volume 58, Issue 2, Pages 160–183
DOI: https://doi.org/10.1134/S0032946022020065

Document Type: Article

UDC: 621.391.1 : 519.713 : 517.977.5

Language: Russian

Citation: A. V. Kolnogorov, “Poissonian two-armed bandit: a new approach”, Probl. Peredachi Inf., 58:2 (2022), 66–91; Problems Inform. Transmission, 58:2 (2022), 160–183

Citation in format AMSBIB

\Bibitem{Kol22}
\by A.~V.~Kolnogorov
\paper Poissonian two-armed bandit: a new approach
\jour Probl. Peredachi Inf.
\yr 2022
\vol 58
\issue 2
\pages 66--91
\mathnet{http://mi.mathnet.ru/ppi2369}
\mathscinet{https://mathscinet.ams.org/mathscinet-getitem?mr=4460458}
\edn{https://elibrary.ru/DZKOCQ}
\transl
\jour Problems Inform. Transmission
\yr 2022
\vol 58
\issue 2
\pages 160--183
\crossref{https://doi.org/10.1134/S0032946022020065}

Linking options:

https://www.mathnet.ru/eng/ppi2369

https://www.mathnet.ru/eng/ppi/v58/i2/p66

This publication is cited in 1 article.
