09:00 AM
Registration and Coffee & Tea!
09:30 AM
From Diffusion Models to Schrödinger Bridges
Valentin De Bortoli (Google DeepMind, on leave from CNRS)
Diffusion models have revolutionized generative modeling. Conceptually, these methods define a transport mechanism from a noise distribution to a data distribution. Recent advancements have extended this framework to define transport maps between arbitrary distributions, significantly expanding the potential for unpaired data translation. However, existing methods often fail to approximate optimal transport maps, which are theoretically known to possess advantageous properties. In this talk, we will show how one can modify current methodologies to compute Schrödinger bridges—an entropy-regularized variant of dynamic optimal transport. We will demonstrate this methodology on a variety of unpaired data translation tasks.
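The static counterpart of the Schrödinger bridge is entropy-regularized optimal transport, which can be solved with Sinkhorn (iterative proportional fitting) updates. The following minimal NumPy sketch, on made-up one-dimensional toy histograms, illustrates that static problem only; it is not the dynamic bridge methodology presented in the talk.

```python
import numpy as np

def sinkhorn(mu, nu, C, eps=0.1, iters=500):
    """Entropy-regularized OT between histograms mu and nu with cost C.

    The resulting coupling is the static analogue of a Schrodinger bridge:
    it minimizes <P, C> - eps * H(P) subject to the prescribed marginals."""
    K = np.exp(-C / eps)                  # Gibbs kernel
    u = np.ones_like(mu)
    v = np.ones_like(nu)
    for _ in range(iters):                # iterative proportional fitting
        u = mu / (K @ v)
        v = nu / (K.T @ u)
    return u[:, None] * K * v[None, :]    # coupling P = diag(u) K diag(v)

# Toy example: two 1-D histograms on a shared grid.
x = np.linspace(0, 1, 50)
mu = np.exp(-((x - 0.2) ** 2) / 0.01); mu /= mu.sum()
nu = np.exp(-((x - 0.7) ** 2) / 0.02); nu /= nu.sum()
C = (x[:, None] - x[None, :]) ** 2        # squared-distance cost
P = sinkhorn(mu, nu, C)
print(P.sum(), np.allclose(P.sum(axis=0), nu))
```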
10:10 AM
Multi-modal Foundation Models for Biology
Thomas Pierrot (InstaDeep)
The human genome sequence provides the underlying code for human biology. Since the sequencing of the human genome 20 years ago, a main challenge in genomics has been the prediction of molecular phenotypes from DNA sequences alone. Models that can “read” the genome of each individual and predict the different regulatory layers and cellular processes hold the promise to better understand, prevent, and treat diseases. Here, we introduce the Nucleotide Transformer (NT), an initiative to build robust and general DNA foundation models that learn the languages of genomic sequences and molecular phenotypes. NT models, ranging from 100M to 2.5B parameters, learn transferable, context-specific representations of nucleotide sequences, and can be fine-tuned at low cost to solve a variety of genomics applications. In this talk, we will share insights into how to construct robust foundation models that encode genomic sequences and how to validate them. We will also present recent advances from our group, including a study of the performance of such models on protein tasks as well as our ongoing progress towards more general genomics AI agents that integrate different modalities and have improved transfer capabilities. The training and application of such foundation models in genomics provide a widely applicable stepping stone towards accurate prediction from DNA sequence alone and a step towards building a virtual cell.
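As a rough illustration of the "fine-tuned at low cost" recipe, the sketch below trains a small logistic-regression head on top of frozen sequence embeddings. The `embed` function, the example sequences, and the promoter labels are all placeholders invented for the example; they do not correspond to the Nucleotide Transformer API or data.

```python
import numpy as np

# Hypothetical setup: embed(seq) stands in for mean-pooled representations
# from a frozen DNA foundation model; here it returns deterministic random
# vectors purely to make the example self-contained.
def embed(seq):
    rng_local = np.random.default_rng(sum(ord(c) for c in seq))
    return rng_local.normal(size=512)

seqs = ["ACGTACGTAC", "TTGACAGGCT", "GGGCGCGTTA", "ATATATATAT"]
labels = np.array([1, 0, 1, 0])       # e.g., promoter vs. non-promoter (made up)
X = np.stack([embed(s) for s in seqs])

# Cheap fine-tuning: a logistic-regression head trained by gradient descent.
# Only these 512 + 1 parameters are updated; the encoder stays frozen.
w, b = np.zeros(X.shape[1]), 0.0
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.1 * (X.T @ (p - labels)) / len(labels)
    b -= 0.1 * np.mean(p - labels)
pred = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(int)
print("train accuracy:", np.mean(pred == labels))
```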
10:50 AM
Coffee & Tea Break
11:00 AM
Towards the Alignment of Geometric and Text Latent Spaces
Maks Ovsjanikov (Google DeepMind & École Polytechnique)
Recent works have shown that, when trained at scale, uni-modal 2D vision and text encoders converge to learned features that share remarkable structural properties, despite arising from different representations. However, the role of 3D encoders with respect to other modalities remains unexplored. Furthermore, existing 3D foundation models that leverage large datasets are typically trained with explicit alignment objectives with respect to frozen encoders from other representations. In this talk, I will discuss some results on the alignment of representations obtained from uni-modal 3D encoders compared to text-based feature spaces. Specifically, I will show that it is possible to extract subspaces of the learned feature spaces that have common structure between geometry and text. This alignment also leads to improvements in downstream tasks, such as zero-shot retrieval. Overall, this work helps to highlight both the shared and unique properties of 3D data compared to other representations.
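One standard way to quantify how much structure two feature spaces share is linear centered kernel alignment (CKA). The sketch below is a generic illustration with randomly generated stand-in features, not the analysis from the talk.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two feature matrices.

    X: (n, d1) features of n objects from a 3D shape encoder.
    Y: (n, d2) features of the same n objects from a text encoder.
    Returns a value in [0, 1]; higher means more shared structure."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    return hsic / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))

rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 16))                     # shared latent structure
X_3d = Z @ rng.normal(size=(16, 384))              # stand-in 3D features
X_txt = Z @ rng.normal(size=(16, 768)) + 0.1 * rng.normal(size=(200, 768))
X_rnd = rng.normal(size=(200, 768))                # unrelated control
print("CKA(3D, text): ", round(linear_cka(X_3d, X_txt), 3))
print("CKA(3D, noise):", round(linear_cka(X_3d, X_rnd), 3))
```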
11:40 AM
A Primer on Physics-informed Machine Learning
Gérard Biau (Sorbonne University)
Physics-informed machine learning typically integrates physical priors into the learning process by minimizing a loss function that includes both a data-driven term and a partial differential equation (PDE) regularization. Building on the formulation of the problem as a kernel regression task, we use Fourier methods to approximate the associated kernel, and propose a tractable estimator that minimizes the physics-informed risk function. We refer to this approach as physics-informed kernel learning (PIKL). This framework provides theoretical guarantees, enabling the quantification of the physical prior’s impact on convergence speed. We demonstrate the numerical performance of the PIKL estimator through simulations, both in the context of hybrid modeling and in solving PDEs. Additionally, we identify cases where PIKL surpasses traditional PDE solvers, particularly in scenarios with noisy boundary conditions. Joint work with Francis Bach (Inria, ENS), Claire Boyer (Université Paris-Saclay), and Nathan Doumèche (Sorbonne Université, EDF R&D).
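To make the hybrid risk concrete, the sketch below fits a toy 1D Poisson problem with a truncated Fourier basis and a loss combining a noisy data term with a PDE-residual penalty; since both terms are linear in the coefficients, the minimizer is a ridge-type linear system. This is only an illustration of the general physics-informed recipe, not the PIKL estimator itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy PDE: u''(x) = f(x) on [0, 1], with f(x) = -pi^2 sin(pi x),
# so the ground-truth solution is u(x) = sin(pi x).
f = lambda x: -np.pi**2 * np.sin(np.pi * x)
u_true = lambda x: np.sin(np.pi * x)

K = 20                                                  # truncated Fourier basis
phi = lambda x: np.sin(np.outer(x, np.arange(1, K + 1)) * np.pi)
phi2 = lambda x: -(np.arange(1, K + 1) * np.pi)**2 * phi(x)   # second derivative

# Noisy data term ...
x_d = rng.uniform(0, 1, 30)
y_d = u_true(x_d) + 0.05 * rng.normal(size=30)
# ... plus a PDE-residual term on collocation points.
x_c = np.linspace(0, 1, 200)

lam = 1e-3                                              # weight of the physics prior
A_d, A_p = phi(x_d), phi2(x_c)
lhs = A_d.T @ A_d + lam * A_p.T @ A_p
rhs = A_d.T @ y_d + lam * A_p.T @ f(x_c)
c = np.linalg.solve(lhs, rhs)

x_test = np.linspace(0, 1, 50)
print("RMSE:", np.sqrt(np.mean((phi(x_test) @ c - u_true(x_test))**2)))
```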
12:20 PM
GFlowNets: A Novel Framework for Diverse Generation in Combinatorial and Continuous Spaces
Salem Lahlou (MBZUAI)
Generative Flow Networks (GFlowNets) offer a framework for sampling from reward-proportional distributions in combinatorial and continuous spaces. They provide an alternative to established methods such as MCMC, which suffer from slow mixing in high-dimensional spaces. By leveraging flow conservation principles, GFlowNets enable exploration in scenarios where the diversity of solutions is crucial, differing from traditional reinforcement learning and generative models. The framework has shown practical utility in molecular design, protein structure prediction, and Bayesian network discovery, particularly when dealing with noisy reward landscapes where maintaining sample diversity is essential. Recent works have also explored GFlowNets as a mechanism for improving the systematic exploration capabilities of large language models. This talk will present the theoretical foundations of GFlowNets and discuss current research directions in expanding their applications.
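As a concrete example of a flow-conservation-based objective, the sketch below computes the trajectory balance loss for a single sampled trajectory, using toy numbers in place of a learned policy network.

```python
import numpy as np

def trajectory_balance_loss(log_Z, log_pf, log_pb, log_reward):
    """Trajectory balance loss for one trajectory tau = (s_0 -> ... -> x).

    log_pf: forward log-probs log P_F(s_{t+1} | s_t) along tau
    log_pb: backward log-probs log P_B(s_t | s_{t+1}) along tau
    At the optimum, log_Z + sum(log_pf) = log R(x) + sum(log_pb) for every tau,
    which makes terminal states x sampled with probability proportional to R(x)."""
    return (log_Z + np.sum(log_pf) - log_reward - np.sum(log_pb)) ** 2

# Toy numbers standing in for the outputs of a learned policy network.
log_pf = np.log([0.5, 0.4, 0.9])     # three forward steps
log_pb = np.log([1.0, 0.5, 0.5])     # corresponding backward steps
print(trajectory_balance_loss(log_Z=0.3, log_pf=log_pf, log_pb=log_pb,
                              log_reward=np.log(2.0)))
```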
01:00 PM
Lunch
02:00 PM
What's not an Autoregressive LLM?
Lingpeng Kong (University of Hong Kong)
This talk explores alternatives to autoregressive Large Language Models (LLMs), with a particular focus on discrete diffusion models. The presentation covers recent advances in non-autoregressive approaches to text generation, reasoning, and planning tasks. Key developments discussed include Reparameterized Discrete Diffusion Models (RDMs), which show promising results in machine translation and error correction, and applications of discrete diffusion to complex reasoning tasks like countdown games, Sudoku, and chess. The talk also examines sequence-to-sequence text diffusion models, as well as the novel Diffusion of Thoughts (DoTs) framework for chain-of-thought reasoning. These non-autoregressive approaches demonstrate competitive performance while offering potential advantages in terms of parallel processing and flexible generation patterns compared to traditional autoregressive models.
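A common discrete diffusion design uses an absorbing (masking) forward process and an iterative unmasking reverse process. The sketch below illustrates that loop with an oracle standing in for the learned denoiser; it is a generic illustration, not the RDM parameterization discussed in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
MASK = -1

def forward_mask(tokens, t):
    """Absorbing-state forward process: each token is replaced by MASK
    independently with probability t in [0, 1]."""
    noisy = tokens.copy()
    noisy[rng.random(len(tokens)) < t] = MASK
    return noisy

def reverse_unmask(noisy, denoiser, steps=4):
    """Iteratively ask a denoiser to fill a fraction of the masked positions.
    A real model would predict token distributions for all masked positions
    in parallel; here `denoiser` is a placeholder callable."""
    x = noisy.copy()
    for _ in range(steps):
        masked = np.flatnonzero(x == MASK)
        if masked.size == 0:
            break
        chosen = rng.choice(masked, size=max(1, masked.size // 2), replace=False)
        x[chosen] = denoiser(x, chosen)
    return x

tokens = np.array([3, 7, 7, 2, 9, 4])
noisy = forward_mask(tokens, t=0.6)
oracle = lambda x, idx: tokens[idx]          # stand-in "perfect" denoiser
print(noisy, "->", reverse_unmask(noisy, oracle))
```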
02:40 PM
Causal Representation Learning and Generative AI
Kun Zhang (MBZUAI)
Causality is a fundamental notion in science, engineering, and even in machine learning. Uncovering the causal process behind observed data can naturally help answer 'why' and 'how' questions, inform optimal decisions, and achieve adaptive prediction. In many scenarios, observed variables (such as image pixels and questionnaire results) are often reflections of the underlying causal variables rather than being causal variables themselves. Causal representation learning aims to reveal the underlying hidden causal variables and their relations. In this talk, we show how the modularity property of causal systems makes it possible to recover the underlying causal representations from observational data with identifiability guarantees: under appropriate assumptions, the learned representations are consistent with the underlying causal process. We demonstrate how identifiable causal representation learning can naturally benefit generative AI, with image generation, image editing, and text generation as particular examples.
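The setting can be summarized as follows: hidden variables follow a causal model, and the observations are an unknown mixture of them. The sketch below only generates data from such a process (with an invented chain z1 -> z2 -> z3); recovering the latents and their graph from X alone is the identification problem the talk addresses.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hidden causal variables with a simple causal chain z1 -> z2 -> z3.
z1 = rng.normal(size=n)
z2 = 0.8 * z1 + 0.6 * rng.normal(size=n)
z3 = np.tanh(z2) + 0.5 * rng.normal(size=n)
Z = np.stack([z1, z2, z3], axis=1)

# Observed variables (e.g., pixels, questionnaire answers) are reflections
# of the latent causal variables: X = g(Z) for an unknown nonlinear mixing g.
A = rng.normal(size=(3, 10))
X = np.tanh(Z @ A) + 0.05 * rng.normal(size=(n, 10))

# Causal representation learning asks: given only X, recover Z and the graph
# z1 -> z2 -> z3 up to tolerable ambiguities, under suitable assumptions.
print(X.shape)
```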
03:20 PM
Coffee & Tea Break
03:30 PM
Factuality Challenges in the Era of Large Language Models: Can We Keep LLMs Safe and Factual?
Preslav Nakov (MBZUAI)
We will discuss the risks, the challenges, and the opportunities that Large Language Models (LLMs) bring regarding factuality. We will then delve into our recent work on using LLMs for fact-checking, on detecting machine-generated text, and on fighting the ongoing misinformation pollution with LLMs. We will also discuss work on safeguarding LLMs, and the safety mechanisms we incorporated in Jais-chat, the world's best open Arabic-centric foundation and instruction-tuned LLM, based on our Do-Not-Answer dataset. Finally, we will present a number of LLM fact-checking tools recently developed at MBZUAI: (i) LM-Polygraph, a tool to predict an LLM's uncertainty in its output using cheap and fast uncertainty quantification techniques, (ii) Factcheck-Bench, a fine-grained evaluation benchmark and framework for fact-checking the output of LLMs, (iii) Loki, an open-source tool for fact-checking the output of LLMs, developed based on Factcheck-Bench and optimized for speed and quality, (iv) OpenFactCheck, a framework for fact-checking LLM output, for building customized fact-checking systems, and for benchmarking LLMs for factuality, and (v) LLM-DetectAIve, a tool for machine-generated text detection.
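As a flavor of the cheap uncertainty quantification techniques mentioned above, the sketch below computes two simple sequence-level scores from per-token log-probabilities of a generated answer; it is a generic illustration, not the LM-Polygraph implementation.

```python
import numpy as np

def sequence_uncertainty(token_logprobs):
    """Two cheap sequence-level uncertainty scores computed from the
    per-token log-probabilities of a generated answer."""
    lp = np.asarray(token_logprobs)
    mean_nll = -lp.mean()                     # average negative log-likelihood
    perplexity = float(np.exp(mean_nll))      # higher = less confident
    return mean_nll, perplexity

confident = np.log([0.9, 0.95, 0.85, 0.9])    # made-up token probabilities
hesitant = np.log([0.4, 0.3, 0.55, 0.2])
print(sequence_uncertainty(confident))
print(sequence_uncertainty(hesitant))
```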
03:50 PM
Variational Diffusion Posterior Sampling with Midpoint Guidance
Yazid Janati (FX Conseil)
Diffusion models have recently shown considerable potential in solving Bayesian inverse problems when used as priors. However, sampling from the resulting denoising posterior distributions remains a challenge as it involves intractable terms. To tackle this issue, state-of-the-art approaches formulate the problem as that of sampling from a surrogate diffusion model targeting the posterior and decompose its scores into two terms: the prior score and an intractable guidance term. While the former is replaced by the pre-trained score of the considered diffusion model, the guidance term has to be estimated. In this paper, we propose a novel approach that utilises a decomposition of the transitions which, in contrast to previous methods, allows a trade-off between the complexity of the intractable guidance term and that of the prior transitions. We also show how the proposed algorithm can be extended to handle the sampling of arbitrary unnormalised densities. We validate the proposed approach through extensive experiments on linear and nonlinear inverse problems, including challenging cases with latent diffusion models as priors.
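The score decomposition described above can be sketched as follows for a linear-Gaussian observation model: the intractable guidance term is approximated through a denoised estimate of x_0 (a DPS-style surrogate, ignoring the denoiser Jacobian for brevity). This is a generic illustration with placeholder prior and denoiser, not the midpoint guidance scheme of the talk.

```python
import numpy as np

def guided_score(x_t, t, prior_score, denoise, A, y, sigma_y):
    """Approximate posterior score for an inverse problem y = A x + noise.

    Decomposes into the pre-trained prior score plus a guidance term; the
    intractable likelihood p(y | x_t) is approximated through a denoised
    estimate x0_hat of x_t."""
    x0_hat = denoise(x_t, t)                              # estimate of E[x_0 | x_t]
    guidance = A.T @ (y - A @ x0_hat) / sigma_y**2        # grad log N(y; A x0_hat, sigma_y^2 I)
    return prior_score(x_t, t) + guidance

# Toy standard-normal prior, so the score and the denoiser are simple closed forms.
prior_score = lambda x, t: -x
denoise = lambda x, t: x / (1.0 + t)          # placeholder denoiser
A = np.array([[1.0, 0.0]])                    # observe only the first coordinate
y = np.array([2.0])
x_t = np.array([0.0, 0.0])
print(guided_score(x_t, t=0.5, prior_score=prior_score, denoise=denoise,
                   A=A, y=y, sigma_y=0.5))
```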
04:10 PM
Demonstration-Regularized RL and RLHF
Daniil Tiapkin (École Polytechnique)
Incorporating expert demonstrations has empirically helped to improve the sample efficiency of reinforcement learning (RL). This paper quantifies theoretically to what extent this extra information, such as the supervised fine-tuning data used in the reinforcement learning from human feedback (RLHF) pipeline, reduces RL's sample complexity. In particular, we study demonstration-regularized reinforcement learning, which leverages the expert demonstrations by KL-regularization towards a policy learned by behavior cloning. Our findings reveal that using N expert demonstrations enables the identification of an optimal policy at a sample complexity of order O(Poly(dim)/(ε^2 N)) in finite and linear MDPs, where ε is the target precision and dim is the problem dimension. Finally, we establish that demonstration-regularized methods are provably efficient for reinforcement learning from human feedback (RLHF). In this respect, we provide theoretical evidence showing the benefits of KL-regularization for RLHF in tabular and linear MDPs. Interestingly, we avoid pessimism injection by employing computationally feasible regularization to handle reward estimation uncertainty, thus setting our approach apart from prior works.
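To illustrate the role of KL-regularization towards a behavior-cloned policy, the sketch below (a single-state toy example with invented numbers) first estimates the expert policy from demonstrations and then applies a KL-regularized improvement step whose closed form keeps the new policy close to the demonstrations.

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions = 4

# (1) Behavior cloning: estimate the expert policy from N demonstrations
#     (here, simple action counts in a single state).
demos = rng.choice(n_actions, size=50, p=[0.6, 0.3, 0.05, 0.05])
pi_bc = np.bincount(demos, minlength=n_actions) / len(demos)
pi_bc = np.clip(pi_bc, 1e-3, None); pi_bc /= pi_bc.sum()   # keep full support

# (2) KL-regularized improvement: maximize E_pi[Q] - lam * KL(pi || pi_bc),
#     whose closed-form solution is pi(a) proportional to pi_bc(a) * exp(Q(a) / lam).
Q = np.array([1.0, 0.5, 2.0, 0.0])          # toy action values
lam = 1.0                                   # regularization strength
pi = pi_bc * np.exp(Q / lam)
pi /= pi.sum()
print("behavior-cloned:", np.round(pi_bc, 3))
print("KL-regularized :", np.round(pi, 3))
```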