Two weak assumptions, one strong result presented at ICLR

Thursday, May 01, 2025

A team of researchers from MBZUAI and other institutions has authored a study that proposes a new machine-learning method for uncovering hidden variables from observed data.

The lead author of the study is Zijian Li, a postdoctoral researcher at MBZUAI. Li works in an area of machine learning called disentangled representation learning, which seeks to identify hidden, or latent, variables that shape processes happening all around us but cannot be observed directly.

Li and his colleagues presented their findings at the 13th International Conference on Learning Representations (ICLR), held in Singapore. Shunxing Fan, Guangyi Chen, and Kun Zhang of MBZUAI are coauthors of the study.

Newton, Mendel, machine learning

Disentangled representation learning grew out of a related field called causal discovery, which focuses on determining causal relations among observed variables. Li explains, however, that it is necessary to go a step further and identify latent variables, because our understanding of the world has often advanced by inferring hidden causal relationships from observations.

Newton inferred the unseen force of gravity from an apple falling from a tree. Mendel inferred genetic rules from observable traits passed from one plant to another. Similarly, disentangled representation learning aims to identify hidden factors that shape the world we can see, Li says.

Why identifiability matters

Li and other researchers in disentangled representation learning build models that provide what are known as identifiability guarantees. Such a guarantee describes a model’s ability to recover the true latent variables, rather than arbitrary variables that play no role in the causal process of interest.

These guarantees are important because they ensure a model learns meaningful latent variables that reflect real-world processes, not just arbitrary patterns in the data. The researchers’ goal, therefore, is to develop systems that come with identifiability guarantees while still solving problems efficiently.

To do this, however, researchers must decide how their model will interpret data and what assumptions to build into it. This often means choosing between what are known as strong and weak assumptions.

Strong assumptions can make the mathematical foundations of a model cleaner and can guarantee identifiability. But models that use strong assumptions can struggle with real-world datasets due to their variability. If data violate strong assumptions even slightly, the models can break down or produce misleading results. Weak assumptions are more practical for real-world data but make it harder to guarantee identifiability.

A major focus for researchers in the field, therefore, is to find the minimal assumptions that are necessary for identifiability. In their study, Li and colleagues do this by combining the benefits of two weak assumptions.

Two are better than one

The researchers call their approach “complementary gains,” and it combines an assumption known as sufficient changes with another known as the sparse mixing procedure.

These assumptions have been used before but never combined in this way. “We wanted to know if it was possible to leverage both types of constraints in a complementary, principled way to learn disentangled representations with identifiability guarantees,” Li says.

The sufficient changes assumption says that if certain conditions, like time of day, change across a dataset, those changes can help reveal latent variables. Sparse mixing assumes that each observed variable is influenced by a small subset of latent variables, rather than by all latent variables.
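
To make the two assumptions concrete, here is a minimal, hypothetical sketch in Python: a few latent variables whose distribution shifts between domains (sufficient changes), mixed into observations through a sparse mask so that each observed variable depends on only a handful of latents (sparse mixing). The mask, the domain scales, and the function name are illustrative choices, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_latent, n_observed = 4, 6

# Sparse mixing: a binary mask marks which latent variables influence each
# observed variable (a hypothetical structure chosen for illustration).
mask = np.array([
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
])
weights = mask * rng.normal(size=(n_observed, n_latent))

def sample_domain(domain_scale, n_samples=1000):
    """Sufficient changes: each domain rescales the latent variances,
    so the latent distribution differs across domains."""
    z = rng.normal(scale=domain_scale, size=(n_samples, n_latent))
    x = np.tanh(z @ weights.T)  # nonlinear, sparse mixing of latents into observations
    return z, x

# Two domains with different latent variances supply the distributional
# changes that, together with the sparse mixing structure, aid identifiability.
z1, x1 = sample_domain(domain_scale=np.array([1.0, 0.5, 2.0, 1.0]))
z2, x2 = sample_domain(domain_scale=np.array([0.3, 1.5, 1.0, 2.0]))
print(x1.shape, x2.shape)
```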

Other methods often require at least 2n+1 domains to guarantee the identifiability of n latent variables; recovering 10 latent variables, for example, would call for data from at least 21 distinct domains. The sufficient changes assumption reduces the number of domains that are necessary.

“The sparse mixing procedure and sufficient changes assumption complement each other,” Li says. “When one assumption is partially violated, the other can compensate.”

The team provides a theoretical explanation of their model in the study. They also implemented the model using two neural network architectures: one with a variational autoencoder (CG-VAE) and another with a generative adversarial network (CG-GAN). The researchers compared their systems to others on synthetic and real-world datasets and found that CG-GAN performed better than the other systems, including CG-VAE.
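
The article does not describe the architectures in detail, so the following is only a rough sketch, under stated assumptions, of how a VAE-style implementation with a sparsity penalty on the decoder might look. The class name, layer sizes, and loss weights are hypothetical; this is not the authors’ CG-VAE or CG-GAN.

```python
import torch
import torch.nn as nn

class SparseMixVAE(nn.Module):
    """Minimal VAE sketch: an encoder infers latent variables from observations,
    and an L1 penalty on the decoder encourages a sparse latent-to-observed
    mapping. An illustrative stand-in, not the paper's CG-VAE architecture."""

    def __init__(self, n_observed=6, n_latent=4, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_observed, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * n_latent),  # outputs mean and log-variance
        )
        self.decoder = nn.Sequential(
            nn.Linear(n_latent, hidden), nn.ReLU(),
            nn.Linear(hidden, n_observed),
        )

    def forward(self, x):
        mu, logvar = self.encoder(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.decoder(z), mu, logvar

def loss_fn(model, x, sparsity_weight=1e-3):
    x_hat, mu, logvar = model(x)
    recon = ((x - x_hat) ** 2).sum(dim=-1).mean()
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=-1).mean()
    # L1 penalty on the first decoder layer as a rough proxy for sparse mixing.
    sparsity = model.decoder[0].weight.abs().sum()
    return recon + kl + sparsity_weight * sparsity

model = SparseMixVAE()
x = torch.randn(128, 6)  # placeholder batch of observations
loss = loss_fn(model, x)
loss.backward()
print(float(loss))
```

In this sketch the L1 term only nudges the decoder toward a sparse latent-to-observed mapping; how the actual CG-VAE and CG-GAN models enforce the two assumptions is laid out in the paper itself.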

Implications for future discoveries

Li says that the hidden world that Newton and Mendel uncovered is what he wants to explore with disentangled representation learning. “Advances in this field have the potential to help us in the future with automatic scientific discovery and that is one of the reasons why I find this meaningful,” he noted.
