When a Saharan dust wall barrels across the Arabian Peninsula, children stay inside only if officials know it’s coming. Trouble is, the historic data behind most AI models contains very few examples of these extreme weather events, so the algorithms learn to ignore them. “The spikes are rare but deadly,” says Salman Khan, Associate Professor of Computer Vision at MBZUAI. “Traditional loss functions let the model smooth over them.”
Particulate matter such as PM2.5 and PM10, as well as sand grains finer than a human hair, already kills an estimated 6.7 million people a year, according to the World Health Organization. Yet real‑time alerts still rely heavily on physics simulators that run for hours on high‑performance computing (HPC) clusters, or on statistical baselines such as ‘whatever you measured yesterday, assume it again tomorrow.’ Neither handles sudden dust storms well.
Khan’s group at the university, together with former MBZUAI research associate Vishal Nedungadi (now a Ph.D. student at Wageningen University & Research in the Netherlands), set out to fuse two data universes, weather and atmospheric chemistry, inside a single neural network.
The result, AirCast, treats a global map of 20‑plus variables (temperature, wind, ozone, carbon monoxide and more) as if it were a giant image stack. It recently won the best paper award at the TerraBytes workshop held during the International Conference on Machine Learning (ICML) in Vancouver. Both the paper and the code are publicly available.
AirCast includes a Vision Transformer that encodes those ‘pixels,’ while a dual‑head decoder spits out tomorrow’s weather and pollutant levels in one shot. Because all those channels would normally swamp GPU memory, the team added a variable‑aggregation layer that compresses each grid cell into a compact vector before feeding it into the transformer; think JPEG for climate tensors. “That cut our training bill to four A100 GPUs for four hours,” Nedungadi says. “A city server can handle inference daily.”
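The article does not spell out how the variable‑aggregation layer works, so the following is only a minimal sketch of one common way to collapse many per‑variable embeddings into a single vector per grid cell: attention pooling with one query vector. The function names, the toy numbers and the fixed `query` (which would be a learned parameter in a real model) are assumptions for illustration, not AirCast's actual code.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate_variables(var_embeddings, query):
    """Collapse V per-variable embeddings (each of length D) into one length-D vector.

    Each variable gets a score from a scaled dot product with the query;
    the output is the softmax-weighted average of the embeddings.
    """
    dim = len(query)
    scale = math.sqrt(dim)
    scores = [sum(q * x for q, x in zip(query, emb)) / scale
              for emb in var_embeddings]
    weights = softmax(scores)
    pooled = [sum(w * emb[d] for w, emb in zip(weights, var_embeddings))
              for d in range(dim)]
    return pooled, weights

# Toy grid cell: three variables, each already embedded in 4 dimensions.
cell = [
    [0.2, 0.1, 0.0, 0.5],
    [0.9, 0.3, 0.4, 0.1],
    [0.0, 0.7, 0.2, 0.3],
]
query = [1.0, 0.0, 0.0, 0.0]  # hypothetical; learned during training in practice

pooled, weights = aggregate_variables(cell, query)
print(len(cell), "variable embeddings ->", len(pooled), "numbers per grid cell")
```

Whatever the exact mechanism, the payoff is the one described above: the transformer then sees one token per grid cell instead of one per variable per cell, which is what keeps memory use in check.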
The secret sauce is something the research team calls frequency‑weighted MAE, a variant of the mean absolute error. Instead of treating every error equally, the model penalizes mistakes on rare high‑pollution days far more than on common clean ones, borrowing an idea from class‑balanced losses in computer vision. In tests, that tweak alone shaved 4% off the root‑mean‑square error for PM2.5.
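To make the idea concrete, here is a rough sketch of a frequency‑weighted error, assuming the weights come from binning the observed values and weighting each sample by the inverse of how often its bin occurs. The `bin_width` parameter, the binning scheme and the sample values are assumptions for illustration; the paper's exact formulation may differ.

```python
from collections import Counter

def plain_mae(y_true, y_pred):
    """Ordinary mean absolute error: every sample counts equally."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def frequency_weighted_mae(y_true, y_pred, bin_width=10.0):
    """MAE in which each sample is weighted by 1 / (frequency of its target bin).

    Rare values (for example pollution spikes) fall into sparsely populated
    bins, so errors on them carry more weight than errors on common values.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    bins = [int(t // bin_width) for t in y_true]
    counts = Counter(bins)
    weights = [1.0 / counts[b] for b in bins]
    total = sum(weights)
    return sum(w * abs(t - p)
               for w, t, p in zip(weights, y_true, y_pred)) / total

# Made-up readings: four clean days and one spike that the forecast smooths over.
observed = [5.0, 6.0, 7.0, 8.0, 95.0]
forecast = [5.0, 6.0, 7.0, 8.0, 55.0]

print(plain_mae(observed, forecast))               # 8.0
print(frequency_weighted_mae(observed, forecast))  # 20.0
```

In this toy case the missed spike accounts for the same 40‑unit error either way, but the weighted version reports a loss two and a half times larger, which gives a model trained on it a stronger incentive not to smooth the spike away.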
So does AirCast actually work? On a benchmark that includes 24‑hour forecasts across the Middle East and North Africa, a region that experiences high levels of dust, AirCast cut PM2.5 error by 33% versus a ‘persistence’ baseline and trounced the gold‑standard CAMS physics model, which was optimized for global, not regional, accuracy.
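For readers unfamiliar with the comparison, a persistence baseline simply repeats the last observation as the forecast, and an "error cut" is the relative drop in an error metric such as root‑mean‑square error. The sketch below shows that arithmetic on made‑up numbers; the helper names and values are illustrative assumptions and are unrelated to the paper's data.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )

def persistence_forecast(series, lead=1):
    """Return (forecasts, targets) where each forecast is the value `lead` steps earlier."""
    return series[:-lead], series[lead:]

def relative_improvement(baseline_error, model_error):
    """Fraction by which the model's error is lower than the baseline's."""
    return (baseline_error - model_error) / baseline_error

# Made-up daily PM2.5 readings with a one-day spike.
daily = [20.0, 22.0, 80.0, 30.0]
forecasts, targets = persistence_forecast(daily)

baseline_error = rmse(targets, forecasts)
print(round(baseline_error, 2))

# With a hypothetical model error of 8.0 against a baseline error of 10.0,
# the model would be described as cutting error by 20%.
print(relative_improvement(10.0, 8.0))
```

Persistence does well when conditions change slowly and badly around sudden events, because it always reports the spike a day late. That is why it is a useful yardstick for a model meant to catch dust storms.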
For example, on October 29, 2017, a massive storm blanketed Saudi Arabia. CAMS under‑predicted the particulate spike by up to 30 µg/m³; AirCast missed by less than 10 µg/m³, which means it would’ve given health agencies a cleaner picture of what was coming.
Surprisingly, the biggest accuracy gains came from the inputs measured close to the ground. Such variables include the 2‑meter temperature, the 10‑meter wind speed and surface‑level humidity. “Higher‑altitude readings are great for aviation, but ground truth lives near people’s lungs,” Nedungadi explains.
From Abu Dhabi dashboards to global health, there are several applications of AirCast:
Khan envisions a REST API that apps like Uber or Apple Weather could ping for hyper‑local air scores. “If cities open their sensor feeds, we can further fine‑tune AirCast on the in-situ measurements of localities within days,” he says.
AirCast shines at 24 hours; beyond that, shifting emissions inventories and chaotic weather blow up uncertainty. Extending to 72 hours means bigger context windows and real‑time updates of industrial output. Satellite aerosol optical‑depth data and crowdsourced IoT monitors could help, but only if the model learns to filter noise.
Geographically, the system still stumbles in East Asia and North America, where pollution sources differ. Fine‑tuning with local data closed part of the gap, but truly global generalization remains “an open research frontier,” Khan notes.
“By ICML 2030, air‑quality forecasts will finally be good enough to let cities prevent extreme events, not just react,” Nedungadi says. Imagine automated traffic rules that reroute diesel trucks when AirCast flashes red, or smart buildings that seal their vents minutes before a dust plume hits.
If those visions materialize, the next generation might breathe a little easier, not just in Abu Dhabi, but anywhere the wind carries particulate matter.