SynWeather: Weather Observation Data Synthesis Across Multiple Regions and Variables via a General Diffusion Transformer

Authors

  • Kaiyi Xu, University of Science and Technology of China; Shanghai Artificial Intelligence Laboratory
  • Junchao Gong, Shanghai Jiaotong University; Shanghai Artificial Intelligence Laboratory
  • Zhiwang Zhou, Tongji University; Shanghai Artificial Intelligence Laboratory
  • Zhangrui Li, Nanjing University; Shanghai Artificial Intelligence Laboratory
  • Yuandong Pu, Shanghai Jiaotong University; Shanghai Artificial Intelligence Laboratory
  • Yihao Liu, Shanghai Artificial Intelligence Laboratory
  • Ben Fei, The Chinese University of Hong Kong; Shanghai Artificial Intelligence Laboratory
  • Fenghua Ling, Shanghai Artificial Intelligence Laboratory
  • Wenlong Zhang, Shanghai Artificial Intelligence Laboratory
  • Lei Bai, Shanghai Artificial Intelligence Laboratory

DOI:

https://doi.org/10.1609/aaai.v40i2.37108

Abstract

With the advancement of meteorological instruments, abundant observation data have become available. However, owing to instruments’ intrinsic limitations, such as environmental sensitivity and orbital constraints, raw data often suffer from temporal or spatial gaps, which makes data synthesis techniques for filling in the missing information an urgent need. Current approaches typically focus on single-variable, single-region tasks and rely primarily on deterministic modeling. This prevents unified synthesis across variables and regions, overlooks cross-variable complementarity, and often leads to over-smoothed results. To address these challenges, we introduce SynWeather, the first dataset designed for Unified Multi-region and Multi-variable Weather Observation Data Synthesis. SynWeather covers four representative regions (the Continental United States, Europe, East Asia, and Tropical Cyclone regions) and provides high-resolution observations of key weather variables, including Composite Radar Reflectivity, Hourly Precipitation, Visible Light, and Microwave Brightness Temperature. In addition, we introduce SynWeatherDiff, a general, probabilistic weather synthesis model built upon the Diffusion Transformer framework to address the over-smoothing problem. Experiments on the SynWeather dataset demonstrate the effectiveness of our network compared with both task-specific and general models. Moreover, SynWeatherDiff generates results that are both fine-grained and accurate in high-value regions. With the dataset and baseline model, we aim to advance downstream meteorological tasks and promote the development of general models for weather variable synthesis.

Published

2026-03-14

How to Cite

Xu, K., Gong, J., Zhou, Z., Li, Z., Pu, Y., Liu, Y., … Bai, L. (2026). SynWeather: Weather Observation Data Synthesis Across Multiple Regions and Variables via a General Diffusion Transformer. Proceedings of the AAAI Conference on Artificial Intelligence, 40(2), 1346–1354. https://doi.org/10.1609/aaai.v40i2.37108

Issue

Vol. 40 No. 2 (2026)

Section

AAAI Technical Track on Application Domains II