Frugality and machine learning
When?
11 September 2023, from 09:00 to 18:30

Where? ENS, Descartes site, room D8 001 / D8 BUISSON
Contact: VAYER Titouan
Artificial intelligence, or machine learning, methods now achieve spectacular performance in many domains. However, training the underlying models relies on a substantial consumption of computing resources, and therefore of energy. For the best-performing models, these resources are far from negligible.
It is therefore becoming important to move towards greater "frugality" and to be able to build learning models under resource constraints. This day aims to discuss the feasibility of this goal, from both a technical and a societal point of view: how can we build models with few resources while maintaining good performance? And is it possible to overcome certain limits, such as those imposed by the rebound effect?
Programme

9h - 9h30: Welcome coffee and opening remarks

Pablo Jensen (9h30 - 10h15)
Deep Earnings: the surprising link between neural networks and neoliberalism
Frank Rosenblatt, who invented the Perceptron in 1958, cited, as a major source of his inspiration, an obscure book by economist Friedrich von Hayek ("The sensory order"). Since Hayek is widely known as the leading ideologue of neoliberalism, one wonders what the connection could be between neural networks and free markets. Do they share a common vision of society and if so, what is it? My presentation will show why "AI" is the latest tool in the modernist approach, at the root of the sociotechnical networks that are depleting the planet.
Angela Bonifati (10h15 - 11h)
Towards Quality-driven and AI-assisted Data Science
One of the key processes of data science pipelines is data preparation, which aims at cleaning and curating the data for the subsequent analytical and inference steps. Data curation deals with the errors and conflicts introduced into the input datasets during data collection and acquisition. These errors, such as violations of business rules, typos, missing values, replicated entries and abnormal features, are of different kinds depending on the nature of the data, ranging from structured data to graph-shaped data and time series. If these errors are kept in the data, they can propagate to the results of the data science pipelines and also hamper the efficiency and the trustworthiness of the underlying processes.
I will present our latest results on enhancing the quality of querying and inference processes on different kinds of heterogeneous data. Among others, we operate on real-life healthcare data from several hospitals in France and the EU and provide the domain experts with useful AI-assisted data management techniques that can help them with their diagnoses and analyses. First, inconsistency-aware annotations can quantify the quality of structured data fed into analytical processes. These annotations are further exploited during query processing in order to enhance the output of queries with inconsistency degrees. Second, feature-based similarities among time series corresponding to patients' signals help to better identify groups of patients and to assess their risks for a particular disease. Third, violations of graph constraints can be addressed by human-guided feedback and lead to better accuracy of the repairing algorithms for graph-shaped data.
Break (15 min)

Claire Boyer (11h15 - 12h00)
Some statistical insights into PINNs
Physics-informed neural networks (PINNs) combine the expressiveness of neural networks with the interpretability of physical modeling. Their good practical performance has been demonstrated both in the context of solving partial differential equations and in the context of hybrid modeling, which consists of combining an imperfect physical model with noisy observations. However, most of their theoretical properties remain to be established. We offer some statistical guidelines for the proper use of PINNs.
Lunch (12h - 14h)

Victor Court (14h - 14h45)
Energy efficiency, rebound effect, and frugality
Abstract: Many institutions (IEA, OECD, IPCC, etc.) consider energy efficiency to be the primary lever for mitigating climate change. In theory, energy efficiency measures reduce energy consumption levels, and therefore greenhouse gas emissions, without forgoing the services provided by energy. In practice, however, numerous rebound effects cancel out some of the potential reductions in energy consumption. The presentation will detail some of these mechanisms, particularly in the case of digital uses. It will also discuss the difficulty of representing these phenomena in integrated assessment models (IAMs) used to produce economic and energy projections. 
Benjamin Guedj (14h45 - 15h30)
On generalisation and learning: towards principled frugal AI
I will present a short and self-contained overview of generalisation theory (with a particular focus on PAC-Bayes theory) and highlight how this can lead to principled AI systems which are more frugal in terms of data and compute resources. If time allows, I will illustrate the above with a few recent works from my group in London.
References: https://bguedj.github.io/publications/ 
Break (15 min)

Laurent Jacques (15h45 - 16h30)
Rank-one projection models in optics: from lensless interferometry to optical sketching
Many fields of science and technology face an ever-growing accumulation of "data", such as signals, images, video, or biomedical data volumes. As a result, many techniques and algorithms, such as principal component analysis, clustering or random projection methods, have been devised to summarize these objects in reduced representations while preserving key information in this compression. In other contexts, it is the physics of the considered application that forces us to observe an object of interest (such as biological cells, human brains or black holes) indirectly, through such distorted representations, while still being able to correctly image this object.
In this presentation, we will review two representative applications of these two facets of data reduction models, namely interferometric lensless imaging (LI) with a multicore fiber (MCF), and quadratic random sketching with an optical processing unit (OPU). Their common feature lies in their mathematical models; they both rely on a specific data reduction called rank-one projection (ROP). Mathematically, a ROP model appears as soon as a structured data matrix (carrying information about a given object or data) is summarized by a series of joint left and right multiplications with a set of sketching vectors. A decade ago, several researchers showed that, under certain conditions, one can recover a structured matrix from ROP observations if their number exceeds the intrinsic complexity of this matrix.
We will see the specific features of each of these two applications. In the case of MCF-LI, we will explain how the physics of wave propagation defines a two-step sensing model: the combination of an interferometric matrix, encoding the spectral content of the object image, and a ROP observation model (induced by the complex amplitudes of each MCF core). Regarding the second application, we will first discover the working principle of an OPU and in particular its correspondence with quadratic random projection of a data stream, that is, a ROP model applied to a specific lifting of the data in a higher-dimensional domain. We will then show how to perform simple signal processing tasks (such as deducing local variations in an image) directly on the OPU sketches, with proof-of-concept experiments defined over naive data classification methods.
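As a rough illustration of the ROP idea described above, the following minimal sketch checks numerically that a quadratic projection of a vector equals a rank-one projection of its lifting. It assumes a simplified real-valued setting with Gaussian sketching vectors, and takes the lifting to be the outer product x x^T; the function names (`rop_measurements`, `lift`, `quadratic_sketch`) are hypothetical and are not taken from the speaker's work or from any OPU software.

```python
import random


def rop_measurements(X, left_vecs, right_vecs):
    """Return y_i = a_i^T X b_i for each pair of sketching vectors (a_i, b_i)."""
    ys = []
    for a, b in zip(left_vecs, right_vecs):
        # Right multiplication: X b
        Xb = [sum(row[j] * b[j] for j in range(len(b))) for row in X]
        # Left multiplication: a^T (X b)
        ys.append(sum(a[i] * Xb[i] for i in range(len(a))))
    return ys


def lift(x):
    """Lift a vector x to the rank-one matrix x x^T (assumed form of the lifting)."""
    return [[xi * xj for xj in x] for xi in x]


def quadratic_sketch(x, vecs):
    """Return (a_i^T x)^2 for each sketching vector a_i."""
    return [sum(ai * xi for ai, xi in zip(a, x)) ** 2 for a in vecs]


# Illustrative sizes and data only: n-dimensional signal, m measurements.
rng = random.Random(0)
n, m = 5, 8
x = [rng.gauss(0, 1) for _ in range(n)]
A = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(m)]

# Quadratic projection of x versus ROP of the lifted matrix x x^T,
# using the same vector on the left and on the right.
y_quadratic = quadratic_sketch(x, A)
y_rop = rop_measurements(lift(x), A, A)

# The two should agree up to floating-point rounding.
max_gap = max(abs(u - v) for u, v in zip(y_quadratic, y_rop))
print(max_gap)
```

The sketch compresses an n x n matrix into m numbers; recovering the matrix from those numbers, and the conditions under which that is possible, is outside the scope of this example.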
This is joint work with Olivier Leblanc (UCLouvain, Belgium), Mathias Hofer (Institut Fresnel, France), Siddharth Sivankutty (U. Lille, France), Hervé Rigneault (Institut Fresnel, France), Rémi Delogne (UCLouvain, Belgium), Vincent Schellekens (imec, Belgium), and Laurent Daudet (LightOn, France).