Lifelong learning with the metaverse

Tuesday, December 27, 2022

Imagine having access to the most talented teachers, irrespective of where you live or your financial circumstances, or collaborating with colleagues remotely via 3D avatars. Researchers at MBZUAI’s Metaverse Lab are developing next-generation AI algorithms to create photorealistic virtual humans, digitize dynamic environments, and generate 2D and spatial content.

But that’s just the start. Hao Li, Associate Professor of Computer Vision and Director of the MBZUAI Metaverse Lab, says that in the future we will learn about history and other cultures by experiencing certain past events as if we were there.

Li works at the intersection of computer vision, computer graphics, and machine learning, with a focus on virtual humans, reality capture, and AI synthesis. His goal is to enable new AI and immersive technologies that can make the concept of the metaverse possible and enhance our lives with digital experiences that are otherwise not possible in the physical world. He is also developing tools to prevent new forms of cyberthreats, such as deepfakes used for disinformation campaigns or harassment.

As well as being a faculty member at MBZUAI, Li is CEO and co-founder of Pinscreen, a startup that builds cutting-edge AI-driven virtual avatar technologies. He was previously a Distinguished Fellow of the Computer Vision Group at UC Berkeley, Associate Professor of Computer Science at the University of Southern California, a visiting professor at Weta Digital, a research lead at Industrial Light & Magic / Lucasfilm, and a postdoctoral fellow at Columbia and Princeton universities. He was a speaker at the World Economic Forum in Davos in 2020 and exhibited at SXSW in 2022.
