Imagine there is a room filled with people and you want to describe the distribution of their ages. One way to do this would be to say that 10% of people in the room are younger than 20 years old, or that half are younger than 38. These are what are known as quantiles, one of the most useful tools in statistics.
Quantiles work well when measuring a single variable. But add another variable, such as salary, and the task becomes more difficult. If one person is older but makes less money than someone who is younger, who ranks higher? There’s no obvious answer. “There is no natural ordering in high-dimensional spaces,” explains Vladimir Kondratyev, a first-year doctoral student in Machine Learning at MBZUAI.
Kondratyev is co-author of a study that proposes a solution to address this problem. He and his co-authors use a mathematical framework known as optimal transport to make multivariate quantiles computable with neural networks. They then apply the quantile maps they generate to a technique called conformal prediction to produce coverage guarantees in multivariate settings.
The research will be presented at the 14th International Conference on Learning Representations in Rio de Janeiro. His co-authors are Alexander Fishkov, Nikita Kotelevskii, Mahmoud Hegazy, Rémi Flamary, Maxim Panov, and Eric Moulines.
The key insight driving Kondratyev and his co-authors’ research is that multivariate quantiles can be determined by using optimal transport. Rather than trying to impose an arbitrary ordering on high-dimensional data, optimal transport treats the quantile function as a map between a reference distribution and the actual data. This provides a geometrically meaningful way to rank points in high-dimensional space, from typical to extreme, that behaves like ordinary ranking in one dimension.
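In one dimension this transport-map view reduces to the familiar inverse CDF: the quantile function carries a uniform reference variable on [0, 1] onto the data distribution. The numpy sketch below illustrates that idea on synthetic ages; it is an illustration of the concept, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)
ages = rng.normal(38, 12, size=10_000)  # synthetic "room of people"

# The empirical quantile function maps a reference uniform value
# u in [0, 1] to a point in the data distribution: Q(0.5) is the median.
def empirical_quantile(data, u):
    return np.quantile(data, u)

median = empirical_quantile(ages, 0.5)

# Pushing uniform samples through Q reproduces the data distribution.
# This is exactly the "transport map" reading of a quantile function,
# and it is the picture that optimal transport generalizes to many dimensions.
u = rng.uniform(0, 1, size=10_000)
transported = empirical_quantile(ages, u)
```

In higher dimensions there is no inverse CDF, which is where the optimal transport machinery comes in: it finds the map from a reference distribution (say, a uniform ball) to the data that plays the same role.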
The theoretical foundations of optimal transport were developed by other researchers, but translating the theory into an operation a computer could perform on data hadn’t been fully explored. “Modeling multidimensional quantiles is hard because it’s mathematically complex,” Kondratyev says. “But it’s extremely useful for all different scenarios, whether that’s for financial markets, demand modeling, or statistics generally.”
In the study, the researchers describe a “neural optimal transport framework” that uses input-convex neural networks to estimate continuous vector quantile maps and multidimensional ranks, and combines them with conformal prediction. To make training efficient, they use a technique called amortized optimization. The result is a framework that scales to dozens of dimensions while preserving theoretical guarantees that make quantiles so useful in the first place.
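Input-convex neural networks are the ingredient that makes this work: by Brenier's theorem, the gradient of a convex function is a valid optimal transport map, so constraining the network to be convex in its input guarantees the learned quantile map is well behaved. The sketch below is a generic ICNN forward pass in numpy, not the authors' implementation; the standard recipe (due to Amos et al.) keeps the hidden-to-hidden weights nonnegative and uses a convex, nondecreasing activation.

```python
import numpy as np

def icnn_forward(x, Wx, Wz, b):
    """Toy forward pass of an input-convex neural network.

    x  : input vector
    Wx : input-to-hidden weight matrices (unconstrained)
    Wz : hidden-to-hidden weight matrices (forced nonnegative below)
    b  : bias vectors

    With nonnegative hidden-to-hidden weights and a convex,
    nondecreasing activation (softplus), the output is convex in x.
    """
    softplus = lambda t: np.logaddexp(0.0, t)  # convex, nondecreasing
    z = softplus(Wx[0] @ x + b[0])
    for Wxi, Wzi, bi in zip(Wx[1:], Wz, b[1:]):
        z = softplus(Wxi @ x + np.abs(Wzi) @ z + bi)  # abs() enforces nonnegativity
    return z

# Randomly initialized toy network: 2-D input, 8 hidden units, scalar output.
rng = np.random.default_rng(1)
d, h = 2, 8
Wx = [rng.normal(size=(h, d)), rng.normal(size=(1, d))]
Wz = [rng.normal(size=(1, h))]
b = [rng.normal(size=h), rng.normal(size=1)]
f = lambda x: icnn_forward(x, Wx, Wz, b)[0]
```

Because `f` is convex by construction, `f` evaluated at the midpoint of any two inputs never exceeds the average of `f` at the endpoints, and its gradient can serve as a transport map.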
“Building on previous theoretical work, we have developed a numerical way to realize optimal transport with a neural network that we use to universally create a quantile function in high-dimensional spaces,” Kondratyev says.
To determine if their neural optimal transport framework could learn the geometry of complex distributions, the researchers tested it on several synthetic benchmarks, each designed to challenge it in different ways.
The “banana” dataset presents a parabola-shaped distribution that shifts and curves depending on a random, latent variable. The “star” dataset is a three-pointed star that rotates as a latent variable changes. The “glasses” dataset is bimodal, with two separate clusters that shift position. Each of these distributions has a structure that other approaches would struggle to capture because the shape of the distribution changes depending on the context.
Measured against competing methods using the Wasserstein distance, a metric that quantifies how far apart two distributions are, the framework consistently matched or outperformed baselines, including earlier vector quantile regression methods that don’t use neural networks. When the team tested the framework on Neal’s funnel, a distribution designed to become more difficult as dimensionality increases, performance held up as the number of dimensions grew from two to 16.
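Wasserstein distance itself has a simple closed form in one dimension: optimal transport between two samples just matches them in sorted order. The sketch below is an illustration of the metric, not the paper's benchmark code.

```python
import numpy as np

def wasserstein_1d(a, b):
    """1-Wasserstein distance between two equal-size 1-D samples.

    In one dimension the optimal transport plan pairs sorted samples,
    so W1 is the mean absolute difference after sorting.
    """
    return np.mean(np.abs(np.sort(a) - np.sort(b)))

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=5000)
y = rng.normal(2.0, 1.0, size=5000)  # same shape, shifted by 2

dist = wasserstein_1d(x, y)  # ≈ 2, the size of the shift
```

In higher dimensions no such sorting trick exists, which is why evaluating (and computing) transport maps there is the hard part the paper addresses.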
Central to conformal prediction is what’s known as a conformity score, a measure of how extreme or outlying a data point is in relation to the full data distribution. The challenge in multivariate settings has been defining a meaningful conformity score when the data lives in multiple dimensions. Multivariate quantile rank maps solve this problem by providing a score that conformal prediction can then use to construct prediction sets.
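The mechanics of split conformal prediction are simple once a conformity score exists. The sketch below uses a plain distance-to-mean score on 2-D data as a stand-in for the paper's learned multivariate rank map; the coverage guarantee comes from thresholding at a quantile of calibration scores, whatever the score is, though a better score yields tighter sets.

```python
import numpy as np

rng = np.random.default_rng(0)

# Calibration and test data drawn from the same 2-D distribution.
calib = rng.normal(size=(1000, 2))
test = rng.normal(size=(5000, 2))

center = calib.mean(axis=0)

# Conformity score: how "extreme" a point is. Distance to the calibration
# mean here, standing in for a learned multivariate rank.
def score(points):
    return np.linalg.norm(points - center, axis=1)

# Split conformal: the adjusted (1 - alpha) quantile of calibration scores
# defines the prediction set {x : score(x) <= q}.
alpha = 0.1
n = len(calib)
q = np.quantile(score(calib), np.ceil((n + 1) * (1 - alpha)) / n)

coverage = np.mean(score(test) <= q)  # ≈ 0.9 by the conformal guarantee
```

The set is guaranteed to contain at least a 1 − alpha fraction of new points regardless of the data distribution; what the multivariate rank map buys is a score that reflects the data's actual geometry, so the resulting set is as small as possible.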
Conformal prediction in multidimensional settings has been done before, but in ways that tend to produce prediction sets larger than they need to be. By grounding their approach in multivariate quantile ranks, the researchers’ framework produces prediction sets that are tighter than those of competing approaches.
Kondratyev says that one potential application of their approach is in medical image segmentation. Some current approaches to segmentation apply one-dimensional conformal methods pixel by pixel. A multivariate approach could instead produce a prediction mask that is guaranteed to cover a certain percentage of a cancerous region. Kondratyev says that this idea would of course need to be tested to see if it improved performance, but it’s one path forward.
Another relates to large-language models. When an LLM generates text, it samples from a high-dimensional distribution of possible next tokens. In principle, a multivariate quantile function could offer a new way of characterizing and sampling from that distribution. “To be feasible, we would have to scale our approach to an infinite number of dimensions,” he says, “but that’s the direction we are headed and it’s an interesting area to explore.”