In this talk, we examine the relationships between heavy-tailed distributions, generalization error, and algorithmic stability in noisy stochastic gradient descent (SGD). Recent research has documented the emergence of heavy tails in stochastic optimization and linked them to generalization error, but these studies often relied on restrictive topological and statistical assumptions, and empirical evidence suggests that the relationship between heavy tails and generalization is not always monotonic. In response, we study the relationship between tail behavior and generalization properties through the lens of algorithmic stability. Our analysis reveals that the stability of SGD depends on how stability is measured, leading to notably different conclusions about its behavior.

Building on these findings, we extend the scope to a broader class of objective functions, including non-convex ones. Leveraging Wasserstein stability bounds for heavy-tailed stochastic processes, we shed light on the non-monotonic connection between generalization error and heavy tails, offering a more comprehensive perspective.

Finally, we introduce a unified approach for proving Wasserstein stability bounds in stochastic optimization, emphasizing time-uniform stability and its role in various settings, including convex and non-convex losses. The approach is versatile, applies to popular optimizers, and highlights the importance of ergodicity.
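To make the object of study concrete, the sketch below runs SGD with injected symmetric α-stable noise (α = 2 recovers Gaussian noise; α < 2 gives heavier tails) on a toy least-squares problem, and probes algorithmic stability by rerunning the same sample path on a neighboring dataset that differs in a single example. This is a minimal illustration under assumed choices (quadratic loss, SciPy's `levy_stable` sampler, arbitrary step size and noise scale), not the construction analyzed in the talk.

```python
import numpy as np
from scipy.stats import levy_stable

rng = np.random.default_rng(0)

# Toy dataset for the least-squares loss f(w; x, y) = 0.5 * (w @ x - y)^2.
n, d = 200, 5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

def noisy_sgd(X, y, alpha=1.8, lr=0.01, sigma=0.01, steps=1000, seed=1):
    """SGD with injected symmetric alpha-stable noise.

    alpha = 2 is (up to scaling) Gaussian noise; alpha < 2 is heavy-tailed.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        i = rng.integers(len(y))
        grad = (X[i] @ w - y[i]) * X[i]  # per-sample gradient
        noise = levy_stable.rvs(alpha, 0.0, size=len(w), random_state=rng)
        w = w - lr * grad - lr * sigma * noise
    return w

# Crude stability probe: a neighboring dataset differing in one example,
# run with the same seed (shared randomness), then measure output drift.
X2, y2 = X.copy(), y.copy()
X2[0], y2[0] = rng.normal(size=d), rng.normal()

for alpha in (2.0, 1.8, 1.5):
    w1 = noisy_sgd(X, y, alpha=alpha)
    w2 = noisy_sgd(X2, y2, alpha=alpha)
    print(f"alpha={alpha}: |w - w'| = {np.linalg.norm(w1 - w2):.4f}")
```

Comparing the drift `|w - w'|` across values of α is one crude, empirical analogue of the stability quantities the talk studies; the Wasserstein bounds in the talk control such discrepancies in distribution rather than per sample path.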
Post-Talk Link: Click Here
Passcode: 6&gMXvKm
I am a Marie Curie Fellow, jointly hosted by Inria Paris and the CSL at UIUC, under the guidance of Prof. Francis Bach and Prof. Maxim Raginsky. My research focuses on the intersection of optimization and machine learning theory, with a particular emphasis on understanding the interplay between optimization techniques and generalization. I earned my PhD from the Max Planck Institute for Intelligent Systems in Tübingen, under the supervision of Prof. Bernhard Schoelkopf. My research has been published in leading machine learning conferences such as ICML, NeurIPS, AISTATS, and COLT, among others. Before that, I completed my undergraduate and master's studies in electrical engineering at IIT Kanpur.