Doklady Rossijskoj Akademii Nauk. Matematika, Informatika, Processy Upravlenia, 2024, Volume 520, Number 2, Pages 57–70
DOI: https://doi.org/10.31857/S2686954324700383
(Mi danma588)
SPECIAL ISSUE: ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING TECHNOLOGIES
Unraveling the Hessian: a key to smooth convergence in loss function landscapes
N. S. Kiselev, A. V. Grabovoy
Moscow Institute of Physics and Technology, Moscow, Russia
Abstract:
The loss landscape of neural networks is a critical aspect of their training, and understanding its properties is essential for improving their performance. In this paper, we investigate how the loss surface changes when the sample size increases, a previously unexplored issue. We theoretically analyze the convergence of the loss landscape in a fully connected neural network and derive upper bounds for the difference in loss function values when adding a new object to the sample. Our empirical study confirms these results on various datasets, demonstrating the convergence of the loss function surface for image classification tasks. Our findings provide insights into the local geometry of neural loss landscapes and have implications for the development of sample size determination techniques.
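The paper derives Hessian-based bounds; as a purely illustrative companion (not the authors' method or code), the following minimal Python sketch checks the elementary observation behind such convergence. For the empirical loss L_k(w) = (1/k) * sum_{i=1..k} loss(w; x_i) evaluated at a fixed parameter point w, adding one object gives L_{k+1}(w) - L_k(w) = (loss(w; x_{k+1}) - L_k(w)) / (k + 1), so the difference shrinks like O(1/k) whenever individual losses stay bounded. The network sizes, synthetic data, and all names below are assumptions made for this sketch only.

```python
# Illustrative sketch (not the paper's experiments): at a fixed weight point w,
# the empirical loss of a fully connected net changes by O(1/k) when one
# object is added to a sample of size k.
import numpy as np

rng = np.random.default_rng(0)

def mlp_loss(W1, W2, X, y):
    """Mean cross-entropy of a fixed 2-layer fully connected net on (X, y)."""
    h = np.tanh(X @ W1)                          # hidden layer
    logits = h @ W2                              # output layer
    logits -= logits.max(axis=1, keepdims=True)  # numerical stabilization
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    return -np.log(p[np.arange(len(y)), y] + 1e-12).mean()

# Synthetic classification data and a fixed (untrained) weight point w.
d, h, c, n = 20, 32, 3, 2049
X = rng.normal(size=(n, d))
y = rng.integers(0, c, size=n)
W1 = rng.normal(scale=0.5, size=(d, h))
W2 = rng.normal(scale=0.5, size=(h, c))

# The gap |L_(k+1)(w) - L_k(w)| should decay roughly as 1/k.
for k in (64, 256, 1024, 2048):
    L_k  = mlp_loss(W1, W2, X[:k],     y[:k])
    L_k1 = mlp_loss(W1, W2, X[:k + 1], y[:k + 1])
    print(f"k={k:5d}  |L_(k+1) - L_k| = {abs(L_k1 - L_k):.2e}")
```

The paper's actual bounds are finer than this 1/(k+1) algebra, using local geometry (the Hessian) of the loss surface; the sketch only shows the qualitative effect the abstract describes, namely that the loss landscape stabilizes as the sample grows.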
Keywords:
neural networks, loss function landscape, Hessian matrix, convergence analysis, image classification.
Received: 28.09.2024
Accepted: 02.10.2024
Citation:
N. S. Kiselev, A. V. Grabovoy, “Unraveling the Hessian: a key to smooth convergence in loss function landscapes”, Dokl. RAN. Math. Inf. Proc. Upr., 520:2 (2024), 57–70; Dokl. Math., 110:suppl. 1 (2024), S49–S61
Linking options:
https://www.mathnet.ru/eng/danma588
https://www.mathnet.ru/eng/danma/v520/i2/p57