Erwan Fagnou

I am currently a PhD student at LAMSADE (Université Paris Dauphine-PSL, Paris, France), under the supervision of Alexandre Allauzen. I am part of the MILES team, which is dedicated to machine learning.

My research is about frugal machine learning: I aim to make deep learning models more resource-efficient while maintaining their performance — where "resource" covers, for instance, time, memory, money, and carbon emissions. The PhD focuses mainly on transformers applied to NLP, as they are by far the most resource-hungry models in the field (hello ChatGPT). This is still a very broad topic, and multiple angles of attack are considered — from the architecture of the model to the optimization algorithm used to train it.

erwan.fagnou/AT/dauphine.psl.eu
LAMSADE, Université Paris Dauphine-PSL
Paris, France

News
Publications
Career & Education

News

Sept. 2024
Our paper "Chain and Causal Attention for Efficient Entity Tracking" has been accepted to EMNLP 2024 (main) in Miami! 🌴
Dec. 2023
Started my PhD at LAMSADE (Université Paris Dauphine-PSL) under the supervision of Alexandre Allauzen.

Publications

Nov. 2024
Chain and Causal Attention for Efficient Entity Tracking
Erwan Fagnou, Paul Caillon, Blaise Delattre, Alexandre Allauzen
EMNLP 2024 [pdf]
This paper investigates the limitations of transformers for entity-tracking tasks in large language models. We identify a theoretical constraint, showing that transformers require at least $\log_2 (n+1)$ layers to handle entity tracking with $n$ state changes. To address this issue, we propose an efficient and frugal enhancement to the standard attention mechanism, enabling it to manage long-term dependencies more efficiently. By considering attention as an adjacency matrix, our model can track entity states with a single layer. Empirical results demonstrate significant improvements on entity-tracking datasets while maintaining competitive performance on standard natural language modeling. Our modified attention allows us to achieve the same performance with drastically fewer layers. Additionally, our enhanced mechanism reveals structured internal representations of attention. Extensive experiments on both toy and complex datasets validate our approach. Our contributions include theoretical insights, an improved attention mechanism, and empirical validation.
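The intuition behind the $\log_2(n+1)$ bound can be sketched numerically. Below is a minimal illustration (not the paper's actual model or code): if causal attention is viewed as an adjacency matrix over token positions, then with residual connections each layer can at most double the length of the dependency paths it composes, so reachability grows by repeated squaring. Tracking a chain of $n$ sequential state changes therefore needs on the order of $\log_2$ many layers; the setup below (a simple one-step dependency chain) is my own toy construction for illustration.

```python
import numpy as np

n = 7  # number of sequential state changes (a chain of n + 1 tokens)

# Toy causal-attention pattern: token i depends only on token i - 1.
A = np.eye(n + 1, k=-1, dtype=int)

# Reachability with paths of length <= 1 (identity = residual connection).
R = ((np.eye(n + 1, dtype=int) + A) > 0).astype(int)

layers = 0
while R[n, 0] == 0:                  # last token cannot yet "see" the first
    R = ((R @ R) > 0).astype(int)    # one more layer: path lengths double
    layers += 1

print(layers)  # 3, i.e. ceil(log2(n + 1)) layers for n = 7
```

With the paper's modified attention, the adjacency matrix itself encodes transitive dependencies, which is why a single layer suffices for this kind of chain.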

PhD

Dec. 2023 - Today

PhD student in the MILES team of the LAMSADE lab, at Université Paris Dauphine-PSL, under the supervision of Alexandre Allauzen. The PhD is funded by the PEPR project SHARP (Sharp Theoretical and Algorithmic Principles for frugal Machine Learning), which brings together several partners collaborating on frugal machine learning.

Education

2022 - 2023

ENS Paris-Saclay
Master 2: Mathematics, Vision and Learning (MVA)
Selective Master's degree in machine learning, preparing students for research. See the official website (in French), or here (in English).

2020 - 2023

Télécom Paris
Master of Science
- GPA: 4.0
- 3rd year: MVA Master at ENS Paris-Saclay
- 2nd year: SD (Data Science) and MITRO (Mathematics, Theoretical Computer Science and Operational Research) tracks