Do you remember what work was like before COVID-19? For many of us, the global pandemic acted as a catalyst that not only transformed how we work, communicate and collaborate but also temporarily reduced our carbon footprints. During the depths of the pandemic, as activity in aviation, surface transport, and power generation fell by 75%, 50%, and 15% respectively, emissions dropped to levels last observed in 2006.
Now that many of us are returning to the office, a recent study has confirmed what many of us suspected: working from home can have positive implications for the planet, although not in the ways we might imagine. Published in September 2023 in the journal Proceedings of the National Academy of Sciences, the paper found that “switching from working onsite to working from home can reduce up to 58% of work’s carbon footprint” in the USA, but only when climate-positive decisions are made about our use of information and communication technology, the energy we consume at work and at home, and the modes of travel we choose for commuting and leisure.
The findings point to a potential future that Dr. Hao Li has been working toward for many years, one in which the widespread public adoption of fully immersive, 3D telepresence transforms communication, entertainment, work, and education. “We are trying to improve the way that people communicate. One of the things we are working toward is to replace physical transportation with virtual teleportation. The other is using AI to achieve things that cannot be done in real life,” says the Associate Professor of Computer Vision at the Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), the world’s first academic institution dedicated to research in artificial intelligence (AI).
The CEO and co-founder of Pinscreen, a startup that builds cutting-edge, AI-driven virtual avatar technologies, Li has spent more than 17 years conducting academic research at the intersection of computer graphics, computer vision, and AI, with a particular focus on using neural networks, deep learning, and data-driven techniques for dynamic geometry processing, virtual avatar creation, facial performance capture, and AI-driven digitisation of 3D shapes.
Li’s career has been defined by a close connection between academic research and product development. His commercial clients include Industrial Light & Magic, Lucasfilm Ltd., Disney Research Zurich, Oculus VR/Facebook and Netflix, and his research has also helped to improve radiation treatment for cancer patients globally and contributed to the Animoji technology used in Apple’s iPhone X.
“How do we develop technologies for digitising humans? How can we capture whole environments, so that people can be in one place but visit another to see what it looks like?” he asks. “And how can we develop these technologies in ways that are accessible, not just to experts but to everyone?”
Li points to the recent release of Google’s Universal Translator, which can re-dub film footage with a new translation that echoes the style and tone of the original, complete with synchronised lip matching, as an example of an AI-driven technology that has the potential to transform not only how we watch and enjoy movies but also the way we learn.
“It’s not just about achieving the natural language processing (NLP) part, which allows us to translate one language into another. We’re also creating visuals so that we can create online courses where the teacher is able to speak any language at the press of a button. How can we make that happen, not just in real time but also in 3D?” the academic explains.
“Instead of having historians or archaeologists trying to describe how things looked in the past, we can now use data to reconstruct environments so that you can go back into the past and experience and interact with it instead of just seeing it,” Li says. “The keyword here is virtual teleportation. It’s not only about people interacting, but also being in another place.”
On a tour of the MBZUAI Metaverse Lab, where Li is the Director and supervises an international team of nine Ph.D. students, the academic demonstrates the Lab’s latest research, which uses deep neural networks to visualise 3D environments in real time from nothing more than a video captured on a cellphone.
The technique combines structure from motion, a well-established process that enables the AI to work out where the camera is relative to the environment it is reconstructing, with deep neural networks that can infer what an environment is likely to look like when source imagery is incomplete or unavailable, effectively blurring the distinction between the real and the imagined.
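The article does not name the exact method, but the pipeline Li describes, camera poses recovered via structure from motion plus a neural network that fills in unseen viewpoints, is the core idea behind neural radiance field (NeRF)-style view synthesis. The sketch below is a minimal, illustrative PyTorch example written under that assumption (all class and function names are hypothetical): a small network maps a 3D point to a colour and a density, and samples taken along a camera ray are composited with standard volume-rendering weights.

```python
# Minimal NeRF-style sketch (illustrative only): camera poses are assumed to be
# already known, e.g. recovered from the phone video via structure from motion.
import torch
import torch.nn as nn

class TinyRadianceField(nn.Module):
    """Maps a 3D point to an RGB colour and a volume density (sigma)."""
    def __init__(self, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # 3 colour channels + 1 density
        )

    def forward(self, points):
        out = self.net(points)
        rgb = torch.sigmoid(out[..., :3])   # colours constrained to [0, 1]
        sigma = torch.relu(out[..., 3])     # non-negative density
        return rgb, sigma

def render_ray(model, origin, direction, near=0.1, far=4.0, n_samples=64):
    """Composite a pixel colour along one camera ray using volume rendering."""
    t = torch.linspace(near, far, n_samples)        # sample depths along the ray
    points = origin + t[:, None] * direction        # (n_samples, 3) sample positions
    rgb, sigma = model(points)
    delta = torch.full((n_samples,), (far - near) / n_samples)
    alpha = 1.0 - torch.exp(-sigma * delta)         # opacity of each sample
    trans = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha + 1e-10]), dim=0)[:-1]
    weights = alpha * trans                         # contribution of each sample
    return (weights[:, None] * rgb).sum(dim=0)      # final pixel colour

# Example: render one pixel from a hypothetical camera at the origin looking down +z.
model = TinyRadianceField()
pixel = render_ray(model, origin=torch.zeros(3), direction=torch.tensor([0.0, 0.0, 1.0]))
print(pixel)
```

In a full system, the network would be trained so that pixels rendered this way match the frames of the captured video at their known camera poses; once trained, it can render the scene from viewpoints the phone never visited, which is what makes the free-viewpoint telepresence Li describes possible.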
“We’re building a telepresence technology that allows people to communicate, immersively, from any viewpoint in a moving space in 3D and that has a lot of potential applications because it allows me to go anywhere I want,” says Li, looking ahead to developments in education, training, testing and prototyping, all of which would be transformed by the deployment of fully immersive 3D communication.
“Obviously the technology is immediately beneficial to all sorts of computer games, but I could also visit a museum, for example, or a rental apartment and see what it’s like. This is a technology that has the potential to democratize the way immersive and interactive content is created.”