Yuanzhi Li

Affiliated Assistant Professor of Machine Learning

Research interests

Li’s primary research area is deep learning theory, focusing on (1) understanding the hierarchical feature learning process in neural networks and why it outperforms shallow learning methods; (2) how the choice of optimization algorithm affects the training speed of different types of neural networks and influences the generalization of the learned solution; and (3) how to use pre-trained neural networks more effectively in downstream applications.


Prior to joining MBZUAI, Li was a postdoctoral researcher at Stanford University. He is also an assistant professor in the Carnegie Mellon University (CMU) Department of Machine Learning.

In 2023, Li was selected for a prestigious Sloan Research Fellowship in computer science by the Alfred P. Sloan Foundation. The fellowship is awarded in honor of extraordinary researchers whose creativity, innovation, and research accomplishments mark them as the next generation of scientific leaders.

The fellowship recognizes creative early-career researchers in seven scientific and technical fields: chemistry, computer science, Earth system science, economics, mathematics, neuroscience, and physics. Li was one of only 22 scholars selected in computer science in 2023.

Education

  • Ph.D. in computer science from Princeton University, USA.
  • Bachelor’s degree in computer science and mathematics from Tsinghua University, China.

Publications

Li has authored or co-authored more than 50 research papers, which have received over 5,600 citations. Selected publications include:

  • A convergence theory for deep learning via over-parameterization. Z. Allen-Zhu, Y. Li, Z. Song. International Conference on Machine Learning, 242-252, 2019.
  • Learning and generalization in overparameterized neural networks, going beyond two layers. Z. Allen-Zhu, Y. Li, Y. Liang. Advances in Neural Information Processing Systems 32, 2019.
  • Convergence analysis of two-layer neural networks with ReLU activation. Y. Li, Y. Yuan. Advances in Neural Information Processing Systems 30, 2017.
  • Learning overparameterized neural networks via stochastic gradient descent on structured data. Y. Li, Y. Liang. Advances in Neural Information Processing Systems 31, 2018.
  • A theoretical analysis of NDCG ranking measures. Y. Wang, L. Wang, Y. Li, D. He, W. Chen, T.-Y. Liu. Proceedings of the 26th Annual Conference on Learning Theory (COLT 2013), 2013.
