Probabilistic Data Assimilation

Research Theme

Probabilistic Data Assimilation for Climate Extremes

We design distribution-aware, physics-grounded data assimilation frameworks that learn how full probability distributions, not just single trajectories, evolve in a changing climate.

Generative ML, Bayesian Inference, Extremes & Tails

Why probabilistic data assimilation?

Traditional data assimilation in weather and climate science focuses on improving short-term state estimation, for example by updating a model trajectory with daily temperature observations.
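As a minimal sketch of such a state update, the snippet below applies a scalar Kalman analysis step to a single temperature forecast and a single observation. The function name `kalman_update` and every number are illustrative assumptions; operational systems work with high-dimensional states and far more elaborate error models.

```python
def kalman_update(forecast_mean, forecast_var, obs, obs_var):
    """Scalar Kalman analysis step: blend one forecast with one observation."""
    # The gain weights the observation by the relative forecast uncertainty.
    gain = forecast_var / (forecast_var + obs_var)
    analysis_mean = forecast_mean + gain * (obs - forecast_mean)
    analysis_var = (1.0 - gain) * forecast_var
    return analysis_mean, analysis_var


# Hypothetical values: a 15.0 degC forecast (variance 4.0) and a 17.0 degC
# observation (variance 1.0).
mean, var = kalman_update(forecast_mean=15.0, forecast_var=4.0, obs=17.0, obs_var=1.0)
print(f"analysis mean = {mean:.2f} degC, analysis variance = {var:.2f}")
```

The update returns one best-estimate state and its variance, which is exactly the single-trajectory view described above.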

But long-term climate questions demand something deeper:

We are rarely concerned with the exact state on a single future day; we care about how full probability distributions evolve, especially the tails that govern extreme events.

Our work introduces a distribution-aware assimilation framework built on Bayesian generative modeling to learn, update, and propagate the probability distributions of climate variables.


A framework for next-generation climate inference

📈 Parameter inference

Uncertainty-aware inference of climate parameters consistent with both physics and observational constraints.

🌡️ Extreme-event behavior

Tail-resolving characterization of extremes beyond Gaussian or linear assumptions.
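To illustrate why Gaussian assumptions can misrepresent extremes, the sketch below compares the 99th percentile of a skewed synthetic sample with the 99th percentile implied by a Gaussian fit to the same sample. The lognormal distribution, the sample size, and the seed are assumptions chosen only for illustration; they stand in for a skewed climate variable and are not drawn from any dataset.

```python
import random
import statistics

rng = random.Random(42)

# Synthetic, right-skewed sample standing in for a heavy-tailed variable.
sample = [rng.lognormvariate(0.0, 1.0) for _ in range(100_000)]

# Gaussian approximation using only the sample mean and standard deviation.
gaussian_fit = statistics.NormalDist(statistics.fmean(sample), statistics.stdev(sample))
gaussian_q99 = gaussian_fit.inv_cdf(0.99)

# Empirical 99th percentile: the last of the 99 percentile cut points.
empirical_q99 = statistics.quantiles(sample, n=100)[98]

print(f"Gaussian-fit 99th percentile: {gaussian_q99:.2f}")
print(f"Empirical 99th percentile:    {empirical_q99:.2f}")
```

For this skewed sample the Gaussian fit places the 99th percentile well below the empirical value, so it would understate the magnitude of rare events.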

🌀 Stable long-range projections

More stable climate projections under noisy, sparse, or biased observations.


What this framework enables

  • Probabilistic long-range climate estimation
  • Quantile-based risk and extreme-value analysis
  • Physics-informed generative surrogates for ensembles
  • Flexible coupling of ML models with dynamical systems
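As a rough illustration of quantile-based ensemble analysis, the sketch below integrates a small Lorenz '96 ensemble from perturbed initial conditions and summarizes one variable by its ensemble quantiles. This is a plain numerical toy, not the generative framework described on this page; the function names, ensemble size, forcing, step size, and perturbation amplitude are all assumptions made for the example.

```python
import random
import statistics


def lorenz96_tendency(x, forcing=8.0):
    """Lorenz '96 tendency with cyclic indices (negative indices wrap around)."""
    n = len(x)
    return [(x[(i + 1) % n] - x[i - 2]) * x[i - 1] - x[i] + forcing for i in range(n)]


def rk4_step(x, dt, forcing=8.0):
    """Advance the state by one fourth-order Runge-Kutta step."""
    k1 = lorenz96_tendency(x, forcing)
    k2 = lorenz96_tendency([xi + 0.5 * dt * k for xi, k in zip(x, k1)], forcing)
    k3 = lorenz96_tendency([xi + 0.5 * dt * k for xi, k in zip(x, k2)], forcing)
    k4 = lorenz96_tendency([xi + dt * k for xi, k in zip(x, k3)], forcing)
    return [
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]


def run_ensemble(n_members=50, n_vars=8, n_steps=500, dt=0.01, seed=0):
    """Integrate each perturbed member and return the final value of variable 0."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_members):
        # Perturb around the unstable fixed point x_i = forcing.
        x = [8.0 + rng.gauss(0.0, 0.5) for _ in range(n_vars)]
        for _ in range(n_steps):
            x = rk4_step(x, dt)
        finals.append(x[0])
    return finals


ensemble = run_ensemble()

# With n=20 there are 19 cut points: index 0 is 5%, 9 is 50%, 18 is 95%.
cuts = statistics.quantiles(ensemble, n=20)
q05, q50, q95 = cuts[0], cuts[9], cuts[18]
print(f"5% / 50% / 95% quantiles of x[0]: {q05:.2f} / {q50:.2f} / {q95:.2f}")
```

Reporting the 5% and 95% quantiles alongside the median describes the spread of the ensemble rather than a single trajectory, which is the kind of summary a distribution-aware approach targets.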

Representative Publications

  1. Li, S., Zheng, T., Farchi, A., Bocquet, M., & Gentine, P. (2025). Probabilistic data assimilation for ensemble distribution projections with generative machine learning: A Lorenz’96 proof-of-concept. Geophysical Research Letters.
  2. Qu, Y., Nathaniel, J., Li, S., & Gentine, P. (2024). Deep generative data assimilation in multimodal setting. CVPR 2024. PDF on CVF OpenAccess.