Intro
Hello, and welcome to my website. I am a second-year PhD student working on interpretability for Natural Language Processing models. My PhD is a joint project between the IRT Saint Exupéry and IRIT, both in Toulouse, France, under the supervision of Prof. Nicholas Asher, Prof. Philippe Muller, and Dr. Fanny Jourdan.
I am a core maintainer of the Interpreto and Xplique open-source libraries. My goal is to provide easy access to useful explanations.
Research
The goal of my PhD is to build an interpretability agent (conversational explainability) for language models. However, I find that the quality and consistency of current explanations are lacking, particularly for concept-based explanations.
This is why I currently work on improving such explanations. Notably, the first publication of my PhD, ConSim (Poché et al., ACL 2025), evaluates the usefulness of concept-based explanations in NLP.
The future directions I would like to explore are:
- Improving the interpretation of concept-based explanations.
- Building contrastive concept-based explanations.
- Example-based explanations for language models.
If these subjects are of interest to you, feel free to contact me; I would be happy to collaborate.
News
- 2026 January ✔️ Interpreto accepted at Explain'AI workshop at EGC Anglet 2026
- 2025 November 🎖️ Audience Choice Award for Interpreto poster at Mobilit.AI 2025
- 2025 October 🔊 Presented Interpreto at the XAI4U workshop at IHM 2025
- 2025 July 📰 ConSim accepted at ACL 2025
- 2025 June 🔊 Gave a tutorial on Explainability for NLP at PFIA Dijon 2025
- 2025 April 🔊 Talk on Explainability for energy applications at ESIA seasonal school 2025
- 2024 October 🚀 Started PhD
- 2024 September 🔊 Talk on Explainability at IA Pau
- 2024 January 🔊 Presented Xplique at the Explain'AI workshop at EGC Dijon 2024
- 2023 August 📰 Example-based survey accepted at xAI 2023 Lisbon
- 2023 📰 Published guidelines for explainable AI with the DEEL project
- 2021 June 🚀 IRT Saint Exupéry - Start as Research Engineer in Explainable AI
- 2021 🔊 Defended my Master's thesis on Explainable Reinforcement Learning
- 2021 🎓 ENSIMAG Master's Degree in Applied Mathematics and Computer Science
- 2021 🎓 Politecnico di Torino Master's Degree in Computer Engineering and Data Science
Software
🪄 Interpreto: An Explainability Library for Transformers
Interpreto is a Python library for post-hoc explainability of HuggingFace text models, from early BERT variants to LLMs. It provides two complementary families of methods: attributions and concept-based explanations. The library connects recent research to practical tooling for data scientists, aiming to make explanations accessible to end users. It includes documentation, examples, and tutorials. Interpreto supports both classification and generation models through a unified API. A key differentiator is its concept-based functionality, which goes beyond feature-level attributions and is uncommon in existing libraries.
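As a rough sketch of the intended workflow (the Interpreto-specific names below, `Occlusion` and `explain`, are hypothetical placeholders rather than the library's documented API), explaining a HuggingFace classifier could look like this:

```python
# Hypothetical sketch: `Occlusion` and `.explain` are illustrative
# placeholders, not Interpreto's confirmed API.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from interpreto import Occlusion  # hypothetical import path

# Load any HuggingFace text classifier.
name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

# Wrap the model and attribute the prediction to input tokens.
explainer = Occlusion(model, tokenizer)
explanations = explainer.explain(["This movie was great!"])
```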
Xplique: Explainability Toolbox for Neural Networks
Xplique (pronounced \ɛks.plik\) is a Python toolkit dedicated to explainability. Its goal is to gather the state of the art in Explainable AI to help you understand your complex neural network models. Originally built for TensorFlow models, it also partially supports PyTorch models. The library is composed of several modules:
- The Attributions Methods module implements various methods (e.g., Saliency, Grad-CAM, Integrated Gradients), with explanations, examples, and links to the official papers.
- The Feature Visualization module lets you see how neural networks build their understanding of images by finding inputs that maximize neurons, channels, layers, or compositions of these elements.
- The Concepts module lets you extract human concepts from a model and test their usefulness with respect to a class.
- The Metrics module covers the metrics currently used in explainability; used in conjunction with the Attributions Methods module, it lets you compare methods or evaluate a model's explanations.
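All attribution methods follow the same explainer pattern. The snippet below mirrors the usage shown in the library's documentation; the MobileNetV2 model and random batch are my own stand-ins, so substitute your own classifier and data:

```python
import tensorflow as tf
from xplique.attributions import GradCAM

# Stand-in model and data; replace with your own classifier and batch.
model = tf.keras.applications.MobileNetV2()
inputs = tf.random.uniform((4, 224, 224, 3))
targets = tf.one_hot([0, 1, 2, 3], depth=1000)  # one-hot target classes

# Every attribution method is instantiated and called the same way.
explainer = GradCAM(model)
explanations = explainer.explain(inputs, targets)  # one map per input
```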
Publications
ConSim: Measuring Concept-Based Explanations’ Effectiveness with Automated Simulatability
ConSim is a metric for concept-based explanations based on simulatability, with LLMs standing in for human users. It yields consistent method rankings across datasets, models, and user LLMs. Furthermore, it correlates with faithfulness and complexity.
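In spirit, automated simulatability asks whether an explanation helps a user LLM predict the explained model's outputs. The sketch below is my own simplification of that idea, not the paper's exact protocol; `user_llm` stands for any callable that prompts an LLM to simulate the model:

```python
# Simplified view of automated simulatability (not ConSim's exact
# protocol): score = accuracy of a user LLM at predicting the model's
# outputs with the explanation, minus its accuracy without it.

def simulatability_gain(user_llm, inputs, model_outputs, explanation):
    """`user_llm(text, explanation)` returns a predicted label."""
    def accuracy(expl):
        predictions = [user_llm(text, expl) for text in inputs]
        return sum(p == y for p, y in zip(predictions, model_outputs)) / len(inputs)

    # A useful explanation raises the user LLM's simulation accuracy.
    return accuracy(explanation) - accuracy(None)
```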
Posters
Teaching
Placeholder teaching section.