Today’s artificial intelligence (AI) can solve problems, shift between languages, and produce responses that feel increasingly human. But for all its fluency, a core question remains largely unanswered: how does it actually work?
Over the past two years, that question became the defining focus for MBZUAI master’s graduate Chenxi Wang. Rather than push models to do more, the Natural Language Processing student sought to understand what is already happening inside them – tracing the internal mechanisms that shape how large language models (LLMs) process language, reason across contexts, and generate emotionally-aligned responses.
“We are living in an era where AI capabilities are expanding at a pace that our understanding cannot keep up with,” she says. “Mechanistic interpretability is foundational to closing that gap, and I feel fortunate to be doing this work at this particular moment.”
Wang’s thesis, ‘Minds in Machines: Understanding LLM Internal Mechanisms through the Lens of Human Cognition’, asks whether the internal workings of these systems resemble anything like human thought. In many cases, her results suggest that they do.
“We know that AI has already demonstrated remarkable capabilities in mathematics, writing, and even emotional companionship,” she explains. “Yet, we barely understand how it does any of this. My thesis tries to open up these black boxes by borrowing lenses from human cognition.”
She approached this through three lines of investigation. The first looks at how models process disrupted language – the kind of scrambled text that humans can often read effortlessly. “Humans can effortlessly read words with scrambled letters, so how does AI process them?” she says. “What we found is that models rely primarily on word form information rather than broader sentence context to reconstruct meaning, and we were able to identify the specific attention heads responsible for that behavior.”
The second explores reasoning across languages. “When humans reason across language, there are underlying structures in the brain that support that process,” she says. “We wanted to know if AI has something similar.” Her work shows that language models do internalize logical structures tied to different languages – and that gaps in those structures can lead directly to drops in reasoning performance.
The third investigates the internal computational dynamics that give rise to emotionally-toned outputs in language models. “When humans experience emotions, there are corresponding neural mechanisms,” she says. “The question we asked was: are there analogous computational structures inside LLMs that underlie affective outputs?”
Her findings suggest there are. “We identified emotion-related circuits in LLMs – specific internal structures that are causally linked to how models produce emotionally-toned responses,” she says. “Being able to locate these circuits is a step toward understanding, from the inside, why models behave the way they do.”
The potential significance of this work continues to motivate Wang, who is exhilarated by the breakthroughs her research has produced.
“The greatest satisfaction for me comes from discovering emergent structures inside LLMs where others see only opacity,” she says. “Locating a neuron, tracing a circuit, and then realizing the model is doing something that mirrors human cognition in ways nobody has previously mapped – that feeling is difficult to replace with anything else.”
But the motivation is not just intellectual. “What makes this research truly meaningful is the importance of the work itself,” she continues.
“We are building systems that are increasingly integrated into daily life, yet we do not fully understand their internal mechanisms. If we want to move toward more advanced forms of intelligence, including AGI [artificial general intelligence], we need a deep understanding of how these systems work internally, in order to build AI that is safe, accountable, and aligned with human values.”
That connection between human cognition and machine intelligence is central to her thinking about the future of AI and, as she has learned, is an area of great interest for the general public.
“When we posted our emotion circuits paper on arXiv, it drew an enormous amount of attention on [social media platform] X – almost 300,000 impressions,” she says. “Seeing so many people – researchers and general audiences alike – captivated by the question of how emotional expression emerges from the internal mechanics of these systems made me more certain than ever that this direction is worth pursuing.
“Understanding these mechanisms is fundamental to building AI systems that are more transparent, more interpretable, and ultimately more trustworthy.”
Wang’s ability to pursue this line of research was shaped in large part by the positive and open environment that she found at MBZUAI.
“I would describe my journey here as my golden years – a chapter I will carry with me for the rest of my life,” she says. “Academically, what I valued most was the freedom. MBZUAI and my supervisor encouraged me to follow my own curiosity, and that freedom led to research that I am genuinely proud of.”
That independence translated quickly into output. Over two years, she produced multiple first-author papers at leading conferences, building what she describes as a clear and coherent research identity.
“My supervisor gave me enormous autonomy to pursue the questions that I truly cared about, rather than assigning me tasks,” she says. “That is what allowed me to develop a direction of my own, rather than just contributing to someone else’s.”
Beyond research, Wang describes her time in Abu Dhabi as transformative on a personal level.
“The friendships that I formed here are something I never expected,” she says. “My closest friends came from places as different as the UAE, Syria, and Russia, and our conversations about culture, about life, and about ideas gave me a much fuller picture of the world than I could have gotten anywhere else.”
That exposure helped to reshape how she thinks about AI itself.
“I stopped thinking about research within the frame of a single language or culture,” she explains. “Being surrounded by people from so many different backgrounds made me see AI as a truly global endeavor, and that directly influenced the kinds of questions that I chose to ask.”
Her connection to the region also deepened in more unexpected ways. “I fell genuinely in love with Arab culture here,” she says. “There is a richness and beauty to it that I had only glimpsed from a distance before, and being immersed in it every day was a gift that I did not take for granted.”
One of her favorite memories captures that sense of immersion. “Spending an afternoon with friends exploring abaya shops around the city – trying on dozens of styles, going from store to store, and ending up going home with five abayas in different colors – it was one of those spontaneous days that somehow captured everything I love about living here.”
Alongside her academic work, Wang’s trajectory has already extended beyond the University.
Her research has been presented at leading venues including NeurIPS, ACL, and EMNLP, with additional work under review at ICML. She has also secured a research internship with Alibaba’s Qwen base model team, which she will be joining over the summer.
“These opportunities came as a direct result of the work that I was able to do here,” she says. “The geography of MBZUAI – sitting between East and West – also made it easier to connect academic resources across different regions in ways that would have been difficult anywhere else.”
Capitalizing on this momentum, Wang will stay in Abu Dhabi to continue her research as a Ph.D. candidate – turning down an offer from Peking University in favor of MBZUAI. “That is perhaps the clearest sign of what this place means to me,” she says.
And in a world where understanding the inner workings of AI becomes more and more important, she will surely continue to make the kind of breakthroughs that MBZUAI is becoming increasingly known for – discovering how systems think, reason, and relate to the world around them.