A new strategy for complex optimization problems in machine learning presented at ICLR

Thursday, May 23, 2024

Machine learning models learn by analyzing data with optimization algorithms, such as gradient descent, which iteratively minimize the errors the models make on a given training set. It's an important procedure, as machine learning models are being employed more and more across the spectrum of human activity, from drug design to self-driving cars.

A recent study by researchers at Mohamed Bin Zayed University of Artificial Intelligence (MBZUAI) and other institutions proposes new methods for handling complex optimization problems in machine learning, particularly problems with constraints, which arise in many different settings.

The study was presented at the Twelfth International Conference on Learning Representations (ICLR 2024), which was held recently in Vienna. The research was a collaboration between Bin Gu and Huan Xiong, both Assistant Professors of Machine Learning at MBZUAI, and Xinzhe Yuan of ISAM. The work builds on earlier research by William de Vazelhes, a Ph.D. student in machine learning at MBZUAI and co-author of the study.

Learning from experience

De Vazelhes is interested in studying and improving the optimization of machine learning models in a variety of different settings. “I think it’s fascinating that a machine can learn from experience instead of hard-coded rules,” de Vazelhes said.

He initially became interested in machine learning through an initiative that is now famous in the artificial intelligence community: DeepMind, an artificial intelligence company that was later acquired by Google, developed models that could devise effective approaches to playing games on the Atari video game system. In some cases, the models surpassed human performance.

An important concept in machine learning is convergence, which plays a role in optimizing these systems, as it helps to assure that an algorithm can reach a quality solution. But there are a lot of unknowns when embarking on building a machine learning model to address a task.

For a given machine learning model and dataset, using the wrong setup for the optimization algorithm may cause the model to learn little from data, or to learn but only after an enormous number of steps. “With all machine learning models, we have to know how a model will converge, if it will take a week, a month or a million years,” de Vazelhes said.

In the study presented at ICLR, the researchers fuse two techniques, zeroth-order optimization and hard-thresholding, into a combined approach they call zeroth-order hard-thresholding, with the goal of tackling specific settings that arise in machine learning.

The approach can be used to address optimization problems where the mathematical formula describing the model's objective function, which is a way of quantifying what a model is trying to achieve, is not available, and where the solution needs to be sparse, meaning that a user would want to force many values in the model to be zero. Sparsity helps simplify models and can make them more interpretable by people.
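The two ingredients can be illustrated with a minimal sketch. This is not the authors' exact algorithm, only a generic illustration of the idea: a zeroth-order gradient estimate built from function evaluations along random directions (no analytic gradient needed), alternated with a hard-thresholding step that keeps only the k largest-magnitude entries. The function names, step sizes, and the toy objective are all illustrative choices, not details from the paper.

```python
import numpy as np

def hard_threshold(x, k):
    # Keep the k largest-magnitude entries of x; zero out the rest.
    # This is the step that enforces sparsity.
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]
    out[idx] = x[idx]
    return out

def zo_gradient(f, x, mu=1e-4, n_dirs=20, rng=None):
    # Zeroth-order gradient estimate: average finite differences of f
    # along random Gaussian directions, using only function values.
    rng = np.random.default_rng() if rng is None else rng
    g = np.zeros_like(x)
    for _ in range(n_dirs):
        u = rng.standard_normal(len(x))
        g += (f(x + mu * u) - f(x)) / mu * u
    return g / n_dirs

def zo_hard_thresholding(f, x0, k, lr=0.05, steps=300, rng=None):
    # Alternate a zeroth-order gradient step with hard-thresholding,
    # so every iterate stays k-sparse.
    x = hard_threshold(np.asarray(x0, dtype=float), k)
    for _ in range(steps):
        x = hard_threshold(x - lr * zo_gradient(f, x, rng=rng), k)
    return x

# Toy example: recover a 2-sparse target by minimizing a squared error
# we can only query as a black box (no gradient formula exposed).
target = np.array([3.0, 0.0, 0.0, -2.0, 0.0])
f = lambda x: float(np.sum((x - target) ** 2))
x_star = zo_hard_thresholding(f, np.zeros(5), k=2,
                              rng=np.random.default_rng(0))
```

Here the returned `x_star` has at most two nonzero entries, matching the sparsity budget, even though the optimizer never saw an explicit gradient of `f`.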

One aspect of the work that has drawn de Vazelhes to optimization is that it sits at the convergence of math and coding. “Often in machine learning, some methods work, but it is not perfectly clear why,” he said. “Our approach is rigorous because under some assumptions we can provide actionable predictions on how the changes we make affect the performance, and if the data doesn’t match our theory, we can come up with more realistic assumptions.”

The proposed algorithm offers improved convergence rates, which measure how quickly a method reaches a solution, and broad applicability, overcoming challenges faced by zeroth-order methods or hard-thresholding alone.

The study contributes both theoretical insights, through convergence analysis, and practical value, by demonstrating effectiveness in real-world tasks such as adversarial attacks on images designed to confuse object detection systems.
