Steklov Mathematical Institute Seminar
October 16, 2025 16:00, Moscow, Steklov Mathematical Institute of RAS, Conference Hall (8 Gubkina)
 


Physical Principles in Machine Learning: How to Explain Grokking

S. V. Kozyrev

Steklov Mathematical Institute of Russian Academy of Sciences, Moscow




Abstract: Physics-like models in learning theory will be discussed.
Grokking (delayed generalization) is a phenomenon observed when overparameterized systems (i.e., systems with a large number of parameters) are trained on algorithmic problems (e.g., learning multiplication). During grokking, the system quickly memorizes the training set (e.g., half of the multiplication table) but initially gives incorrect answers on the test set (the other half of the multiplication table). Then, as stochastic gradient descent continues, delayed generalization occurs: the system begins to answer questions from the test set correctly.
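The train/test setup described above can be illustrated with a minimal sketch. It is not taken from the preprint: the use of multiplication modulo a prime p, the value p = 11, the 50% split, and the function name split_multiplication_table are all assumptions made for illustration.

```python
import random

def split_multiplication_table(p=11, train_fraction=0.5, seed=0):
    """Split the table of a * b mod p into train and test examples.

    Hypothetical helper: the modulus p, the split fraction and the seed
    are illustrative assumptions, not values from the talk or preprint.
    """
    # Every cell of the table: input pair (a, b) and label a * b mod p.
    examples = [((a, b), (a * b) % p) for a in range(p) for b in range(p)]
    # Shuffle reproducibly so the split is random but repeatable.
    rng = random.Random(seed)
    rng.shuffle(examples)
    cut = int(len(examples) * train_fraction)
    # The model sees the first part; the rest is held out for testing.
    return examples[:cut], examples[cut:]

train, test = split_multiplication_table()
print(len(train), len(test))
```

In a grokking experiment, accuracy on train would reach 100% early, while accuracy on test would stay low for many more optimization steps before rising.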
In this talk, stochastic gradient descent will be considered as Brownian motion, and grokking will be explained as a manifestation of the second law of thermodynamics and Eyring's formula in kinetic theory.
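For orientation, the two ingredients named here have standard textbook forms, sketched below. These are not quoted from the preprint, and the notation is an assumption: θ denotes the model parameters, L the training loss, T an effective temperature standing in for the gradient noise, W_t a Wiener process, ΔG‡ the activation free energy, and κ a transmission coefficient.

```latex
% Overdamped Langevin (Brownian) dynamics, a common continuous-time
% idealization of stochastic gradient descent (assumed form):
d\theta_t = -\nabla L(\theta_t)\,dt + \sqrt{2T}\,dW_t

% Eyring's formula for the rate k of crossing an activation barrier:
k = \kappa\,\frac{k_B T}{h}\,\exp\!\left(-\frac{\Delta G^{\ddagger}}{k_B T}\right)
```

Under this reading, a long waiting time before generalization would correspond to a small transition rate k; how the preprint makes this identification precise is left to the talk itself.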
The presentation will follow the preprint S. V. Kozyrev, How to explain grokking, arXiv:2412.18624.
 