Physics Meets Machine Learning: Past, Present, and Future

Speaker: Yuhai Tu
Institution: IBM
Date: Thursday, February 27, 2025
Time: 3:30 pm
Location: ISEB 1010

Most modern machine learning algorithms are based on artificial neural network (ANN) models that originated from the marriage of two natural science disciplines: statistical physics and neuroscience. At their core, ANNs describe the collective behavior of a group of highly abstracted “neurons” that interact with each other adaptively in a network bearing a certain resemblance to real neural networks in the brain. The dynamics of ANNs effectively implement various computing algorithms, which allow them to compute and learn.

In the first part of this talk, we will give a brief historical account of the development of ANN-based machine learning paradigms. We will focus on explaining the foundational discoveries and inventions rooted in statistical physics, as exemplified by the Hopfield model and the Boltzmann machine, the works cited for the 2024 Nobel Prize in Physics awarded to Hopfield and Hinton.
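As a concrete illustration of the Hopfield model mentioned above, here is a minimal sketch in Python (NumPy) of Hebbian pattern storage and asynchronous recall. This is the generic textbook construction, not code from the talk, and all names and parameters are illustrative.

import numpy as np

rng = np.random.default_rng(0)

# Store P random binary (+/-1) patterns in N neurons via the Hebbian rule:
# W_ij = (1/N) * sum_mu xi_i^mu * xi_j^mu, with zero self-coupling.
N, P = 100, 5
patterns = rng.choice([-1, 1], size=(P, N))
W = (patterns.T @ patterns) / N
np.fill_diagonal(W, 0.0)

def energy(s):
    # Hopfield energy E = -1/2 s^T W s; asynchronous recall never increases it.
    return -0.5 * s @ W @ s

def recall(s, sweeps=10):
    # Asynchronous dynamics: update one neuron at a time toward its local field.
    s = s.copy()
    for _ in range(sweeps):
        for i in rng.permutation(len(s)):
            s[i] = 1 if W[i] @ s >= 0 else -1
    return s

# Corrupt a stored pattern and let the network relax back to it.
noisy = patterns[0].copy()
noisy[rng.choice(N, size=15, replace=False)] *= -1
retrieved = recall(noisy)
print("energy before/after recall:", energy(noisy), energy(retrieved))
print("overlap with stored pattern:", retrieved @ patterns[0] / N)

The stored memories sit at local minima of the energy function, which is precisely the statistical-physics (spin-glass) picture behind the foundational work discussed in the talk.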

Next, we will describe our recent work on developing a theoretical foundation, using a statistical physics approach, for the feedforward deep-learning neural networks underlying the current AI revolution. In particular, we will discuss our recent work [1-2] on the learning dynamics driven by stochastic gradient descent (SGD) and the key determinants of generalization based on an exact duality relation we discovered between neuron activities and network weights [3]. Finally, we will discuss a few future directions worth pursuing with physics-based approaches, e.g., the neural scaling laws observed in large language models and in-context learning in transformer-based ANN models [4].
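To make the SGD dynamics referenced above concrete, here is a minimal sketch of minibatch SGD on a toy least-squares problem. It shows only the generic algorithm (random minibatch sampling is the source of the noise whose structure references [1-2] analyze); all names and hyperparameters are illustrative.

import numpy as np

rng = np.random.default_rng(1)

# Toy regression data: y = X @ w_true + observation noise.
n, d = 512, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def minibatch_grad(w, idx):
    # Gradient of the mean-squared error on one random minibatch.
    Xb, yb = X[idx], y[idx]
    return 2.0 * Xb.T @ (Xb @ w - yb) / len(idx)

w = np.zeros(d)
lr, batch_size, steps = 0.05, 32, 2000
for _ in range(steps):
    idx = rng.choice(n, size=batch_size, replace=False)
    w -= lr * minibatch_grad(w, idx)  # noisy estimate of the full gradient

print("final training MSE:", np.mean((X @ w - y) ** 2))

Because each step uses a random minibatch, the trajectory is a stochastic process rather than a deterministic gradient flow; characterizing this noise and how it biases the dynamics toward flat minima is the statistical-physics question addressed in [1-2].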


[1] “The inverse variance-flatness relation in stochastic gradient descent is critical for finding flat minima”, Yu Feng and Yuhai Tu, PNAS, 118 (9), 2021.

[2] “Stochastic Gradient Descent Introduces an Effective Landscape-Dependent Regularization Favoring Flat Solutions”, Ning Yang, Chao Tang, and Yuhai Tu, Phys. Rev. Lett. 130, 237101, 2023.

[3] “Activity–weight duality in feed-forward neural networks reveals two co-determinants for generalization”, Yu Feng, Wei Zhang, and Yuhai Tu, Nature Machine Intelligence, 2023.

[4] “Physics Meets Machine Learning: A Two-Way Street”, Herbert Levine and Yuhai Tu, PNAS, 121 (27), e2403580121, 2024.

Host: Jin Yu