The search for an antidote to Byzantine attacks

Thursday, February 26, 2026

Say you are a chief information officer for a hospital, and you have been tasked with providing your physicians with a machine learning model that can help them interpret clinical data. You soon realize that this project faces some challenges. Machine learning models require huge amounts of data to train, and it will take years for your small, regional hospital to generate enough data to train a system that could benefit physicians. And while you could explore pooling your data with other hospitals, patient privacy restrictions forbid you from doing so.

Federated learning, an approach that distributes model training across a network of machines, offers a solution. With federated learning, data is stored locally at the hospital; updates to the local model — and not the patient data — are shared with a central server; and these learnings can then be shared with nodes across the network.
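The round-trip described above can be sketched in a few lines. This is a generic FedAvg-style round on a toy least-squares model, not the hospital system itself; the function names and the plain averaging step are illustrative assumptions:

```python
import numpy as np

def local_update(weights, X, y, lr=0.1):
    """One local training step: a least-squares gradient step on private data."""
    grad = X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

def federated_round(global_weights, node_data, lr=0.1):
    """Each node trains locally and shares only its updated weights;
    the server averages them without ever seeing the raw data."""
    updates = [local_update(global_weights.copy(), X, y, lr) for X, y in node_data]
    return np.mean(updates, axis=0)

rng = np.random.default_rng(0)
w = np.zeros(3)
data = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(4)]
w = federated_round(w, data)
```

The key property is that only `updates` crosses the network; the `(X, y)` pairs never leave their nodes.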

It’s a solution that institutions in medicine, finance, and other privacy-sensitive fields have been pursuing. But decentralization is both a strength and a weakness.

Federated learning typically assumes that nodes in the network provide quality information. This isn’t always the case. In a hospital setting, some doctors might be improperly trained and label data incorrectly, hurting the performance of the system across the network.

“There are some attacks that occur in neural networks due to unconscious mistakes,” such as poor labeling, explains Gleb Molodtsov, a research assistant at MBZUAI. “But there are other attacks where nodes in the network consciously disrupt the training process.”

Known as Byzantine attacks, these are named after a classic problem in distributed computing where some actors in a network send deliberately misleading signals to sabotage collective decision-making. Even a few malicious nodes can be enough to hinder training.

Molodtsov and colleagues from MBZUAI and other institutions propose a solution to Byzantine attacks, which they call Byzantine antidote, or Bant. What sets it apart from other defenses isn’t just that it prevents Byzantine attacks, but that it’s designed to work even when the majority of nodes in the network have been compromised.

Aleksandr Beznosikov, a co-author of the study, says: “We can find attacks in these scenarios that are due to poor annotations, and our method provides the opportunity to find poorly labeled data while still training the models.”

The researchers shared their study in an oral presentation at the 40th Annual AAAI Conference on Artificial Intelligence, which was held in Singapore. Daniil Medyakov, Sergey Skorik, Nikolas Khachaturov, Shahane Tigranyan, Vladimir Aletov, Aram Avetisyan, Martin Takáč, and Beznosikov are co-authors of the study.

Verify and trust

With Bant, the researchers built on previous approaches, combining two concepts — trust scores and a trial function — into a dynamic filtering system that can identify and neutralize corrupted updates.

A trial function is essentially a small, trusted dataset kept on the server that acts as a ground truth reference. When devices send their updates, the server tests each one to see if the update improves performance by reducing loss on the trusted dataset. Updates that help are weighted more heavily, while updates that seem to be pulling the model in the wrong direction are weighted less or ignored.

The trust score adds a memory dimension to the process. Instead of making a fresh judgement about each node every time it sends an update, the system tracks the behavior of nodes over time. A node that has repeatedly sent gradients that reduce the loss on the trusted dataset builds a positive trust score, and that history feeds into the weighting.

That said, even honest updates might increase loss at certain steps. Without accounting for this, the system might wrongly penalize a trustworthy device for a bad update. To address this, the researchers include what they call a momentum parameter that smooths things out across epochs so that short-term variations don’t override the bigger picture of whether a node is trustworthy or not.
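One common way to realize this kind of smoothing is an exponential moving average; the sketch below uses a binary "improved" signal and a momentum value of 0.9 as assumptions, and Bant's exact mechanism may differ:

```python
def update_trust(trust, improved, momentum=0.9):
    """Exponential moving average of per-node behavior: a single bad round
    only slightly dents a long positive history."""
    signal = 1.0 if improved else 0.0
    return momentum * trust + (1 - momentum) * signal

# An honest node that slips once keeps most of its accumulated trust.
trust = 0.0
for improved in [True, True, True, True, False, True]:
    trust = update_trust(trust, improved)
```

With momentum near 1, the score reflects long-run behavior; with momentum near 0, it reacts sharply to each round, which is exactly the over-penalizing the researchers want to avoid.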

Most methods that are designed to be robust against Byzantine attacks are built on the assumption that the majority of nodes in the network are honest. But this is a significant limitation because in a real-world scenario this can’t be guaranteed. Bant only requires a single reliable node to work.

“Other methods are driven by clustering or finding some mean of all workers, but when there are more bad workers than honest workers these approaches don’t work,” Molodtsov says.

Previous approaches have used the trial function concept but assumed that data across nodes is homogeneous in its statistical properties. In a hospital setting, this would be like assuming that every hospital sees the same kinds of patients with the same kinds of conditions. This isn’t the case in practice.

Bant, by contrast, operates under a data similarity assumption, which is weaker and more forgiving than previous approaches. It acknowledges that different nodes will have different data distributions but assumes there is still some meaningful relationship between them. Data from different hospitals aren’t identical, but they aren’t completely unrelated either.

Versions of Bant and results

The researchers developed three variants of their approach, each representing a different strategy for assigning trust scores to devices.

The first, Bant, evaluates each device’s gradient against the trial function and assigns weights based on how much that gradient reduces the trial loss, while the momentum parameter carries forward the trust history from previous rounds to smooth out noise. Molodtsov says that while “Bant is really good in practice, its imperfectness on the theoretical side made us want to develop the second method.”

The second version, called AutoBant, treats the weight assessment as an optimization problem. It’s less stable in practice but has stronger theoretical guarantees than Bant, he explains.

The third variation, SimBant, doesn’t consider how much a device’s model reduces loss but how similar its predictions are to the server’s predictions on trial data.
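A minimal sketch of this similarity-based weighting follows, using cosine similarity between prediction vectors as an assumed measure; the paper's exact similarity function and clipping rule may differ:

```python
import numpy as np

def simbant_weights(server_preds, client_preds_list):
    """Weight each client by how closely its predictions on the trial data
    match the server's. Negative similarity is clipped to zero (assumed rule)."""
    scores = []
    for p in client_preds_list:
        cos = np.dot(server_preds, p) / (
            np.linalg.norm(server_preds) * np.linalg.norm(p) + 1e-12
        )
        scores.append(max(cos, 0.0))
    total = sum(scores)
    return np.array(scores) / total if total > 0 else np.zeros(len(scores))
```

A client whose predictions roughly agree with the server's keeps its influence; one whose predictions point the opposite way contributes nothing.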

The researchers tested their methods on benchmark datasets that included an image classification task, abnormality detection in ECG data, and a recommender system task. They compared their algorithms to others under different kinds of attacks, including label flipping, where attackers send gradients based on the loss calculated with randomly flipped labels, and inner product manipulation (IPM), where attackers send the average gradient of all honest clients multiplied by a negative factor.
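For intuition, the two attacks can be sketched as follows. The regression-style loss in the label-flipping sketch and the particular scaling factor are illustrative assumptions:

```python
import numpy as np

def label_flip_gradient(X, y, weights, rng=None):
    """Label flipping (regression-style sketch): compute a gradient against
    randomly shuffled labels instead of the true ones."""
    rng = rng or np.random.default_rng()
    y_flipped = rng.permutation(y)
    return X.T @ (X @ weights - y_flipped) / len(y)

def ipm_gradient(honest_grads, epsilon=0.5):
    """Inner product manipulation: send the mean honest gradient scaled by a
    negative factor, pulling aggregation opposite the true descent direction."""
    return -epsilon * np.mean(honest_grads, axis=0)
```

The IPM gradient is, by construction, anti-correlated with the honest average, which is what makes naive mean-based aggregation fail.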

On the ECG data, Bant performed the best of all methods on label flipping, random gradients, and “a little is enough,” where attackers average their gradients and scale the standard deviation to mimic the majority.

Under a 60% label flipping attack, Bant achieved a G-mean of 0.956 compared to ZENO (0.014) and ADAM (0.262), two other approaches that were tested. Under an 80% IPM attack, SimBant performed best (0.955) while most competitors fell below 0.20.

Molodtsov says that this research boosted his interest in federated learning and how to address some practical challenges that researchers face in pushing the field forward. “In our work we address one of the greatest problems in federated learning,” he says. “Our work can give rise to future directions and be a foundation for various tasks related to federated learning.”
