Machine Learning and sampling methods for climate and physics
The study day "Machine Learning and sampling methods for climate and physics" is organized by IXXI and the Laboratoire de Physique of ENS de Lyon.
4 April 2022, 13:30 to 18:30, Salle des thèses, ENS Lyon, Campus Monod
The aim is to bring together in one place research teams interested in applying
advanced machine learning methods to physics and climate.
The study day is built around talks given by:
- Manon Michel, "Recent developments in sampling algorithms", Université Clermont-Auvergne (http://manon-michel.perso.math.cnrs.fr)
- Ronan Fablet, "End-to-end and physics-informed learning for dynamical systems", IMT Atlantique, Lab-STICC, AI Chair OceaniX (https://rfablet.github.io)
- Tom Beucler, "Atmospheric Physics-Guided Machine Learning: Towards Physically-Consistent, Data-Driven, and Interpretable Models of Convection", Université de Lausanne (CH) (https://www.unil.ch/gse/fr/home/menuinst/vie-facultaire/promotions--nominations/beucler-tom.html)
- Davide Faranda, "When machine learning deciphers the 'language' of atmospheric air masses", IPSL (https://www.lsce.ipsl.fr/Pisp/davide.faranda/)
- George Miloshevich, "Probabilistic forecasting of heat waves with deep learning", ENS de Lyon, CNRS, Physics Laboratory
followed by question and discussion sessions.
We look forward to seeing you from 13:00 for coffee.
- 13:30 - 14:10 Tom Beucler: "Atmospheric Physics-Guided Machine Learning: Towards Physically-Consistent, Data-Driven, and Interpretable Models of Convection", Université de Lausanne (CH)
- 14:10 - 14:50 Manon Michel: "Recent developments in sampling algorithms", Université Clermont-Auvergne
- 14:50 - 15:10 Break
- 15:10 - 15:50 Davide Faranda: "When machine learning deciphers the 'language' of atmospheric air masses", IPSL
- 15:50 - 16:30 Ronan Fablet: "End-to-end and physics-informed learning for dynamical systems", IMT Atlantique, Lab-STICC, AI Chair OceaniX
- 16:30 - 16:50 Break
- 16:50 - 17:30 George Miloshevich: "Probabilistic forecasting of heat waves with deep learning", ENS de Lyon, CNRS, Physics Laboratory
Atmospheric Physics-Guided Machine Learning: Towards Physically-Consistent, Data-Driven, and Interpretable Models of Convection
Tom Beucler, Université de Lausanne (CH) (https://www.unil.ch/gse/fr/home/menuinst/vie-facultaire/promotions--nominations/beucler-tom.html)
Data-driven algorithms, in particular neural networks, can emulate the effect of unresolved processes in coarse-resolution climate models if trained on high-resolution simulation data. However, they may violate key physical constraints and make large errors when evaluated outside of their training set. I will share progress towards overcoming these two challenges in the case of machine learning the effect of subgrid-scale convection and clouds on the large-scale climate. First, physical constraints can be enforced in neural networks, either approximately by adapting the loss function or to within machine precision by adapting the architecture. Second, as these physical constraints are insufficient to guarantee generalizability, I additionally propose to physically rescale the inputs and outputs of machine learning algorithms to help them generalize to unseen climates. Overall, these results suggest that explicitly incorporating physical knowledge into data-driven models of climate processes may improve their consistency, stability, and ability to generalize across climate regimes.
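As a rough illustration of the two enforcement strategies mentioned in this abstract, the Python sketch below uses a toy linear constraint: the outputs must sum to a known budget. The constraint itself, the function names and the penalty weight are assumptions made for illustration and are not taken from the talk.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def soft_constraint_loss(pred, target, budget, weight=1.0):
    """Soft enforcement: add a penalty on the constraint residual to the loss.

    The toy constraint sum(pred) == budget is only encouraged, so a trained
    model would satisfy it approximately.
    """
    residual = sum(pred) - budget
    return mse(pred, target) + weight * residual ** 2


def hard_constraint_layer(free_outputs, budget):
    """Hard enforcement: compute the last output from the constraint.

    The model predicts all outputs but one; the final output is set so that
    the outputs sum to the budget, up to floating-point rounding.
    """
    return list(free_outputs) + [budget - sum(free_outputs)]


if __name__ == "__main__":
    outputs = hard_constraint_layer([0.2, 0.5], budget=1.0)
    print("constraint residual:", sum(outputs) - 1.0)
    print("penalized loss:", soft_constraint_loss([0.2, 0.5, 0.4], [0.2, 0.5, 0.3], budget=1.0))
```

In the soft variant the penalty weight trades accuracy against constraint violation; in the hard variant the constraint holds by construction, whatever values the free outputs take.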
Tom Beucler is an assistant professor of environmental data science at the University of Lausanne in Switzerland. He recently started a lab specifically dedicated to the intersection of atmospheric physics and machine learning, with the goal of improving our understanding of atmospheric dynamics and assisting weather and climate predictions. For that purpose, his research group combines physical theory, computational science, statistics, numerical simulations, and observational analyses. Before that, Tom studied the interaction of tropical storms, radiation, and atmospheric water as part of his PhD at MIT. As a postdoc and project scientist at Columbia and UC Irvine, he investigated how to best integrate physical knowledge into neural-network representations of convection for climate modeling, which will be the theme of today’s presentation.
Recent developments in sampling algorithms
Manon Michel, Université Clermont-Auvergne (http://manon-michel.perso.math.cnrs.fr)
When dealing with a statistical inference problem, the Bayesian approach replaces standard optimized point estimates of the parameters with a full probability distribution. This makes it possible to take a priori information on the problem into account and to quantify uncertainty, but at the cost of having to deal with high-dimensional integrals. Such integrals are then estimated by a sum over a collection of samples obtained by Monte Carlo (MC) methods. In this talk, I will present recent advances in MC methods based on non-reversibility and factorization of the underlying stochastic processes, as well as on the possibilities now offered by normalizing flows.
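To make the idea of estimating an integral by a sum over samples concrete, here is a minimal Python sketch. It uses the classical reversible random-walk Metropolis algorithm as a baseline, not the non-reversible or factorized schemes discussed in the talk. The target density, the step size and the function names are assumptions chosen for illustration.

```python
import math
import random


def metropolis_samples(log_density, n_samples, step=2.0, x0=0.0, seed=0):
    """Draw correlated samples from an unnormalized 1D density.

    Random-walk Metropolis: propose x' = x + u with u uniform in
    [-step, step], and accept with probability min(1, p(x') / p(x)).
    """
    rng = random.Random(seed)
    x = x0
    log_p = log_density(x)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.uniform(-step, step)
        log_p_new = log_density(proposal)
        # Accept or reject; on rejection the current state is repeated.
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = proposal, log_p_new
        samples.append(x)
    return samples


def mc_expectation(f, samples):
    """Estimate the integral of f against the target as a sample average."""
    return sum(f(x) for x in samples) / len(samples)


def log_standard_normal(x):
    """Log of the standard normal density, up to an additive constant."""
    return -0.5 * x * x


if __name__ == "__main__":
    samples = metropolis_samples(log_standard_normal, n_samples=20000)
    # The exact value of E[x^2] under a standard normal is 1.
    print(mc_expectation(lambda x: x * x, samples))
```

Only ratios of densities enter the acceptance step, so the normalization constant of the target, which is usually the intractable part of a Bayesian posterior, is never needed.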
Manon Michel has been a CNRS researcher based at Clermont-Auvergne University since 2018. Her research mainly focuses on the design and implementation of stochastic algorithms for statistical physics and Bayesian inference. This algorithmic work is fueled by the study of complex systems and the analysis of the convergence and coupling of Markov chains.
"When machine learning deciphers the 'language' of atmospheric air masses"
Davide Faranda, IPSL (https://www.lsce.ipsl.fr/Pisp/davide.faranda/)
Latent Dirichlet Allocation (LDA) is capable of analyzing thousands of documents in a short time and highlighting important elements, recurrences and anomalies. It is generally used in linguistics to study natural language: its word analysis reveals the theme(s) of a document, each theme being identified by a specific vocabulary or, more precisely, by a particular statistical distribution of word frequency.
In the climatologists' use of LDA, the document is a daily weather map and the word is a pixel of the map. The theme, with its corpus of words, can become a cyclone or an anticyclone and, more generally, a pattern that the scientists call a "motif". Artificial intelligence, acting as a sort of incredibly fast robot meteorologist, looks for correlations both between different places on the same map and between successive maps over time. In a sense, it "notices" that a particular location is often correlated with another location, recurrently throughout the database, and this set of correlated locations constitutes a specific pattern.
The algorithm performs statistical analyses at two distinct levels. At the word level, that is the pixel level of the map, LDA defines a motif by assigning a certain weight to each pixel, which sets the shape and position of the motif. At the document level, that is the level of the map, LDA breaks down a daily weather map into these motifs, each of which is assigned a certain weight.
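The two levels can be pictured with a small numerical sketch in Python. Each motif is a probability distribution over pixels, each map is a probability distribution over motifs, and the pixel distribution of a map is the motif-weighted mixture. The four-pixel grid, the two motifs and all weights below are made-up values for illustration, not results from the study. Fitting these weights from data is what LDA does, and that step is not shown here.

```python
def mix_motifs(map_weights, motifs):
    """Pixel distribution of one map as a weighted mixture of motifs."""
    n_pixels = len(motifs[0])
    return [
        sum(w * motif[i] for w, motif in zip(map_weights, motifs))
        for i in range(n_pixels)
    ]


# Pixel level: each motif assigns a weight to every pixel (weights sum to 1).
motifs = [
    [0.7, 0.3, 0.0, 0.0],  # hypothetical motif concentrated on pixels 0 and 1
    [0.0, 0.0, 0.4, 0.6],  # hypothetical motif concentrated on pixels 2 and 3
]

# Map level: one daily map assigns a weight to every motif (weights sum to 1).
map_weights = [0.25, 0.75]

pixel_distribution = mix_motifs(map_weights, motifs)

if __name__ == "__main__":
    print(pixel_distribution)
```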
In concrete terms, the basic data are the daily weather maps from 1948 to the present over the North Atlantic basin and Europe. LDA identifies a dozen or so spatially defined motifs, many of which are familiar meteorological patterns such as the Azores High, the Genoa Low or even the Scandinavian Blocking. A small combination of those motifs can then be used to describe all the maps. These motifs and the statistical analyses associated with them allow researchers to study weather phenomena such as extreme events, as well as longer-term climate trends, and possibly to understand their mechanisms in order to better predict them in the future.
The preprint of the study is available as:
Lucas Fery, Berengere Dubrulle, Berengere Podvin, Flavio Pons, Davide Faranda. Learning a weather dictionary of atmospheric patterns using Latent Dirichlet Allocation. 2021. ⟨hal-03258523⟩ https://hal-enpc.archives-ouvertes.fr/X-DEP-MECA/hal-03258523v1
End-to-end and physics-informed learning for dynamical systems
R. Fablet, IMT Atlantique, Lab-STICC, AI Chair OceaniX
Whereas model-driven approaches represent the state of the art for the analysis, simulation and reconstruction of (geo)physical dynamical systems, learning-based and data-driven frameworks are becoming relevant in numerous scientific domains, including the study of phenomena governed by physical laws. They offer new means to fully benefit from available observation and/or simulation data. In this context, making the most of both the model-driven and data-driven paradigms naturally arises as a key challenge, especially when dealing with partially observed systems.
In this talk, I will explore these research avenues with a focus on end-to-end and physics-informed learning approaches with illustrations on ocean-related case-studies (e.g., space oceanography, movement ecology, maritime surveillance).
R. Fablet, B. Chapron, L. Drumetz, E. Memin, O. Pannekoucke, F. Rousseau. Learning Variational Data Assimilation Models and Solvers. JAMES, 2021.
R. Fablet et al. End-to-end physics-informed representation learning for satellite ocean remote sensing data: applications to satellite altimetry and sea surface currents. Proc. XXIV ISPRS Congress, 2021.
A. Roy, S. Lanco Bertrand, R. Fablet. Generative Adversarial Networks (GAN) for the simulation of central-place foraging trajectories. MEE, 2022.
Probabilistic forecasting of heat waves with deep learning
George Miloshevich, ENS de Lyon, CNRS, Physics Laboratory
Deep neural networks are rapidly gaining a foothold in Earth sciences and elsewhere, e.g. in surrogate modeling. These developments serve the interests of both weather prediction and climate modeling. Among the various difficulties is reducing uncertainties in future climatologies and in the meteorological forecasting of extreme events such as heat waves and droughts. By nature, data is scarce for rare events, and so their study is a major challenge.
A convolutional neural network is trained on data from a climate model to predict heat waves. It is constructed with the goal of combining global information coming from teleconnections of the geopotential with the much more localized signal of soil moisture, which generally correlates with heat waves. The main question is how much predictive information can be extracted from geophysical fields using this set-up and how this depends on the data size. Furthermore, the issue of the smoothness of the prediction along a given trajectory arises and is addressed with transfer learning.