A new approach to identify LLM hallucinations: Uncertainty quantification presented at ACL

Friday, August 23, 2024

One of the limitations of large language models (LLMs) is that they tend to mix facts with fiction. This phenomenon, known as hallucination, makes it difficult for users to determine what is true and what is fabricated.

Although scientists are continually working to improve language models, eliminating hallucinations entirely may not be realistic, said Artem Shelmanov, a senior research scientist at the Mohamed bin Zayed University of Artificial Intelligence (MBZUAI). “According to theoretical analyses of LLMs, it’s mathematically impossible to prevent hallucinations in all cases because of the way the models are designed and trained,” he said.

Shelmanov, along with colleagues at MBZUAI and other institutions, is coauthor of a study that proposes a new approach to identify hallucinations in LLMs with a method called uncertainty quantification. The research was presented at the 62nd Annual Meeting of the Association for Computational Linguistics (ACL) in Bangkok.

Maxim Panov, a coauthor of the study and assistant professor of machine learning at MBZUAI, said: “While developers may want to create models that never make mistakes, even the best models will inevitably make errors, and our approach is capable of identifying them.”

A measure of confidence

LLMs generate text based on principles of probability. When a user asks an LLM a question, the model replies with an answer it deems likely, given the specifics of the question and the way the model was trained and tuned. LLMs generate text at the level of the token, which can be a word, part of a word, or a punctuation mark. The model sequentially predicts the next token in a string, calculating a probability for each candidate token along the way. Some tokens have a high probability of appearing, while others have a low probability. These probabilities offer insight into the model's confidence in the text it generates.
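To make this concrete, the short sketch below (illustrative only, not code from the study) uses an off-the-shelf causal language model to read back the probability it assigned to each token of a sentence. The model name "gpt2" is just a stand-in for any causal LLM.

```python
# Illustrative sketch: inspecting the per-token probabilities a causal
# language model assigns to a piece of text. Not code from the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # any causal LM; chosen here only for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

text = "Ada Lovelace was born in London in 1815."
input_ids = tokenizer(text, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits  # shape: (1, seq_len, vocab_size)

# Probability the model assigned to each token that actually follows.
probs = torch.softmax(logits[0, :-1], dim=-1)
token_probs = probs.gather(1, input_ids[0, 1:].unsqueeze(1)).squeeze(1)

for tok_id, p in zip(input_ids[0, 1:], token_probs):
    print(f"{tokenizer.decode(tok_id.item())!r:>12}  p = {p.item():.3f}")
```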

The researchers’ approach draws on this internal quantification of uncertainty at the level of the token to identify claims made by a model that may in fact be hallucinations. Claims with low confidence can be highlighted in the LLM user interface. The researchers call their approach claim conditioned probability, or CCP, and it is the first to use uncertainty quantification methods based only on the LLM that is generating the text.
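The sketch below illustrates the general idea of turning token-level probabilities into a claim-level confidence score and highlighting low-confidence claims. It is a deliberate simplification: the actual CCP method conditions these probabilities on the claim and filters out uncertainty unrelated to factuality, which this toy example does not do. The claims, probabilities, and threshold are hypothetical.

```python
# Simplified illustration only: aggregating per-token probabilities into a
# crude claim-level confidence score and flagging claims that fall below a
# threshold. The real CCP method is more involved.
import math

def claim_confidence(token_probs):
    """Geometric mean of token probabilities as a crude claim score."""
    log_p = sum(math.log(max(p, 1e-12)) for p in token_probs)
    return math.exp(log_p / len(token_probs))

# Hypothetical claims, each paired with the probabilities of its tokens.
claims = {
    "born in London": [0.91, 0.88, 0.95],
    "in 1815": [0.42, 0.18],
}

THRESHOLD = 0.5  # assumed cutoff for highlighting in a user interface
for claim, probs in claims.items():
    score = claim_confidence(probs)
    flag = "CHECK" if score < THRESHOLD else "ok"
    print(f"{claim:>15}  confidence = {score:.2f}  [{flag}]")
```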

There are other approaches to identifying hallucinations, such as automated fact-checking that queries an external dataset or another LLM. But fact-checking in that way is computationally intensive, explained Preslav Nakov, a coauthor on the study and professor and department chair of natural language processing at MBZUAI. CCP is efficient, since LLMs by necessity compute the probability of each token. “We only use internal probabilities generated by the model, and the model does this kind of computation anyway,” Nakov said.

While CCP doesn’t explicitly indicate whether a claim is factual, it can highlight passages where the model lacks confidence, significantly improving usability compared to today’s LLMs.

“We want to highlight parts of text which require additional checking, indicating to a user that there might be a problem with a passage,” Panov said. “People are using LLMs in many ways, and the amount of inaccurate information they produce is immense.”

Uncertain in more ways than one

The researchers’ approach is more complex than simply identifying cases in which an LLM is uncertain, because there are different kinds of uncertainty in LLMs.

LLMs can be uncertain about which synonym to use. A model can also be uncertain about the order in which passages should be arranged. For example, when an LLM is asked to generate a biography, it could begin a response with the person’s place of birth, their date of birth or some other piece of information. These two types of uncertainty have no impact on factuality, however, and can be ignored.

To distinguish the different kinds of uncertainty in a model, the researchers used a technique called natural language inference, which makes it possible to identify uncertainty that relates to factuality. By filtering out irrelevant uncertainties, such as those arising from synonyms or ordering, the researchers can more accurately quantify the likelihood of a claim being incorrect. “We tried to purify the probability distribution and remove this redundant uncertainty from it,” Shelmanov said.
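The sketch below shows one simplified interpretation of this idea: an off-the-shelf natural language inference model decides whether an alternative generation merely rephrases a claim (entailment) or changes its meaning, so that probability mass on meaning-preserving alternatives is not counted as factual uncertainty. The NLI model name, the example claims, and their probabilities are assumptions for illustration; the exact CCP formulation differs from this simplification.

```python
# Sketch under assumptions: using an NLI model to separate meaning-preserving
# alternatives (synonyms, rephrasings) from meaning-changing ones, so only the
# latter contribute to factual uncertainty. Simplified relative to CCP.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

nli_name = "roberta-large-mnli"  # any NLI model works for this sketch
tok = AutoTokenizer.from_pretrained(nli_name)
nli = AutoModelForSequenceClassification.from_pretrained(nli_name)
nli.eval()

def relation(premise: str, hypothesis: str) -> str:
    """Return the NLI label (entailment / neutral / contradiction)."""
    inputs = tok(premise, hypothesis, return_tensors="pt")
    with torch.no_grad():
        logits = nli(**inputs).logits
    return nli.config.id2label[logits.argmax(dim=-1).item()].lower()

original = "She was born in London."
alternatives = {
    "She was born in the city of London.": 0.20,  # paraphrase: ignore
    "She was born in Paris.": 0.15,               # conflicting fact: count
}

# Only probability mass on non-entailed alternatives is treated as
# factual uncertainty in this simplified view.
conflicting_mass = sum(
    p for alt, p in alternatives.items() if relation(original, alt) != "entailment"
)
print(f"probability mass on meaning-changing alternatives: {conflicting_mass:.2f}")
```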

The researchers created a benchmark to compare CCP to other uncertainty quantification methods across different LLMs. The benchmark comprised machine-generated biographies that contained hallucinations. CCP performed better than the other methods on seven LLMs, including OpenAI’s GPT-3.5-turbo and GPT-4, and it did so in four languages: Arabic, Chinese, English and Russian.

CCP draws on LM-Polygraph, previous research by Shelmanov, Panov and others that provides a framework for estimating uncertainty in language models. LM-Polygraph is a library for developers and researchers that includes code for uncertainty quantification methods that can be integrated into LLMs.

Nakov noted one limitation of their method. Because it relies on the model’s internal knowledge rather than external facts, and the real world is always changing, it is always possible that it could miss an inaccurate claim. “But that is a different kind of problem,” he said.

Nonetheless, the researchers are continuing to advance CCP and determine ways that it can be added to LLMs and benefit users. “In terms of computational efficiency, we’re not at the point yet where it can be implemented on a large scale, but we are getting there,” Shelmanov said.
