In this talk, we introduce our recent research on noise initialization and sampling enhancement strategies for diffusion models, spanning from image generation to video generation. Diffusion models are a mainstream class of generative models for modern AIGC. A diffusion model gradually denoises random Gaussian noise into a clean generated result, and this randomness naturally leads to diversified outputs. What does this motivate us to ask? We try to understand why some noise initializations are “golden noises” that yield better results, and we leverage this “golden noise” insight to further enhance diffusion sampling quality. This talk includes four parts: 1) Golden Noise for Text-to-Image Diffusion Models; 2) Zigzag Diffusion Sampling; 3) Smooth Initializations for Video Diffusion Models; 4) Leveraging Image Diffusion Models for Enhanced Video Synthesis.
Post-Talk Link: Click Here
Passcode: v6.Sv#yE
Dr. Zeke Xie is an Assistant Professor at the Information Hub, Hong Kong University of Science and Technology (Guangzhou). He leads the Xie Machine Learning Foundations Lab (xLeaF Lab), which is broadly interested in understanding and solving fundamental issues of modern AI, particularly large models, through scientific principles and methodology. He currently focuses on the optimization and inference of large models and generative AI. Previously, he was a researcher at Baidu Research, responsible for large-model and AIGC research. He obtained his Ph.D. and M.E. from The University of Tokyo, where he was fortunate to be advised by Prof. Issei Sato and Prof. Masashi Sugiyama. He has received multiple faculty research awards from industry, including ByteDance and Baidu. He also regularly serves as a reviewer/PC member for ICML, NeurIPS, ICLR, CVPR, ICCV, ECCV, AAAI, TPAMI, Neural Computation, TNNLS, etc.