I work at Mila, supervised by Yoshua Bengio. I'm currently the Scientific Lead of the first International AI Safety Report, a project backed by 33 nations and intergovernmental organizations.

My research in machine learning covers risk management, LLM honesty, health applications, and data selection for large-scale deep learning. Across these areas, my lead-author publications have been covered on TV and in newspapers such as The Guardian and Time, while others have been discussed by ministers or incorporated into national legislation.

Before joining Mila, I did my PhD in ML at the University of Oxford under Yarin Gal, funded by Google DeepMind. I also worked on learning human preferences and game-theoretical machine learning with David Duvenaud and Roger Grosse at Toronto’s Vector Institute and UC Berkeley, and with the Centre for the Governance of AI at Oxford. I studied machine learning (UCL), maths (Amsterdam) and Future Planet Studies (Amsterdam).

Contact me

soeren.mindermann ατ gmail.com

Publications as first author

*equal contribution to first authorship

International AI Safety Report

Yoshua Bengio (Chair), Sören Mindermann (Scientific Lead), Daniel Privitera (Lead Writer), Tamay Besiroglu, Rishi Bommasani, Stephen Casper, Yejin Choi, Philip Fox, Ben Garfinkel, Danielle Goldfarb, Hoda Heidari, Anson Ho, Sayash Kapoor, Leila Khalatbari, Shayne Longpre, Sam Manning, Vasilios Mavroudis, Mantas Mazeika, Julian Michael, Jessica Newman, Kwan Yee Ng, Chinasa T. Okolo, Deborah Raji, Girish Sastry, Elizabeth Seger, Theodora Skeadas, Tobin South, Emma Strubell, Florian Tramèr, Lucia Velasco, Nicole Wheeler, Daron Acemoglu, Olubayo Adekanmbi, David Dalrymple, Thomas G. Dietterich, Edward W. Felten, Pascale Fung, Pierre-Olivier Gourinchas, Fredrik Heintz, Geoffrey Hinton, Nick Jennings, Andreas Krause, Susan Leavy, Percy Liang, Teresa Ludermir, Vidushi Marda, Helen Margetts, John McDermid, Jane Munga, Arvind Narayanan, Alondra Nelson, Clara Neppel, Alice Oh, Gopal Ramchurn, Stuart Russell, Marietje Schaake, Bernhard Schölkopf, Dawn Song, Alvaro Soto, Lee Tiedrich, Gaël Varoquaux, Andrew Yao, Ya-Qin Zhang, Fahad Albalawi, Marwan Alserkal, Olubunmi Ajala, Guillaume Avrin, Christian Busch, André Carlos Ponce de Leon Ferreira de Carvalho, Bronwyn Fox, Amandeep Singh Gill, Ahmet Halit Hatip, Juha Heikkilä, Gill Jolly, Ziv Katzir, Hiroaki Kitano, Antonio Krüger, Chris Johnson, Saif M. Khan, Kyoung Mu Lee, Dominic Vincent Ligot, Oleksii Molchanovskyi, Andrea Monti, Nusu Mwamanzi, Mona Nemer, Nuria Oliver, José Ramón López Portillo, Balaraman Ravindran, Raquel Pezoa Rivera, Hammam Riza, Crystal Rugege, Ciarán Seoighe, Jerry Sheehan, Haroon Sheikh, Denise Wong, Yi Zeng
2025

International Scientific Report on the Safety of Advanced AI (interim report)

Yoshua Bengio (Chair), Sören Mindermann (Scientific Lead), Daniel Privitera (Lead Writer), Tamay Besiroglu, Rishi Bommasani, Stephen Casper, Yejin Choi, Philip Fox, Ben Garfinkel, Danielle Goldfarb, Hoda Heidari, Leila Khalatbari, Shayne Longpre, Vasilios Mavroudis, Mantas Mazeika, Kwan Yee Ng, Chinasa T. Okolo, Deborah Raji, Theodora Skeadas, Florian Tramèr, Olubayo Adekanmbi, David Dalrymple, Thomas G. Dietterich, Edward W. Felten, Pascale Fung, Pierre-Olivier Gourinchas, Fredrik Heintz, Nick Jennings, Andreas Krause, Percy Liang, Teresa Ludermir, Vidushi Marda, Helen Margetts, John McDermid, Arvind Narayanan, Alondra Nelson, Alice Oh, Gopal Ramchurn, Stuart Russell, Marietje Schaake, Dawn Song, Alvaro Soto, Lee Tiedrich, Gaël Varoquaux, Andrew Yao, Ya-Qin Zhang, Fahad Albalawi, Marwan Alserkal, Olubunmi Ajala, Guillaume Avrin, Christian Busch, André Carlos Ponce de Leon Ferreira de Carvalho, Bronwyn Fox, Amandeep Singh Gill, Ahmet Halit Hatip, Juha Heikkilä, Gill Jolly, Ziv Katzir, Hiroaki Kitano, Antonio Krüger, Chris Johnson, Saif M. Khan, Kyoung Mu Lee, Dominic Vincent Ligot, Oleksii Molchanovskyi, Andrea Monti, Nusu Mwamanzi, Mona Nemer, Nuria Oliver, José Ramón López Portillo, Balaraman Ravindran, Raquel Pezoa Rivera, Hammam Riza, Crystal Rugege, Ciarán Seoighe, Jerry Sheehan, Haroon Sheikh, Denise Wong, Yi Zeng
2024

Inferring the effectiveness of government interventions against COVID-19

JM Brauner*, S Mindermann*, M Sharma*, D Johnston, J Salvatier, ...
Science, 2021

Prioritized training on points that are learnable, worth learning, and not yet learned

Sören Mindermann*, Muhammed Razzak*, Winnie Xu*, Andreas Kirsch, Mrinank Sharma, Aidan Gomez, Sebastian Farquhar, Jan Brauner, Yarin Gal
ICML, 2022

Occam's razor is insufficient to infer the preferences of irrational agents

S Mindermann*, S Armstrong*
NeurIPS, 2018

Understanding the effectiveness of government interventions in Europe’s second wave of COVID-19

M Sharma*, S Mindermann*, C Rogers-Smith, G Leech, B Snodin, J Ahuja, ...
Nature Communications, 2021

Active Inverse Reward Design

S Mindermann*, R Shah*, A Gleave, D Hadfield-Menell
ICML workshop Goals in Reinforcement Learning, 2018

Identifying Causal-Effect Inference Failure with Uncertainty-Aware Models

A Jesson*, S Mindermann*, U Shalit, Y Gal
NeurIPS, 2020

Changing composition of SARS-CoV-2 lineages and rise of Delta variant in England

S Mishra*, S Mindermann*, M Sharma*, C Whittaker*, T Mellan, T Wilton, ...
The Lancet: EClinicalMedicine, 2021

How Robust are the Estimated Effects of Nonpharmaceutical Interventions against COVID-19?

M Sharma*, S Mindermann*, J Brauner*, G Leech, A Stephenson, ...
NeurIPS (Spotlight talk), 2020

Publications as senior author

*equal contribution to senior authorship

Managing extreme AI risks amid rapid progress

Yoshua Bengio, Geoffrey Hinton, Andrew Yao, Dawn Song, Pieter Abbeel, Yuval Noah Harari, Ya-Qin Zhang, Lan Xue, Shai Shalev-Shwartz, Gillian Hadfield, Jeff Clune, Tegan Maharaj, Frank Hutter, Atılım Güneş Baydin, Sheila McIlraith, Qiqi Gao, Ashwin Acharya, David Krueger, Anca Dragan, Philip Torr, Stuart Russell, Daniel Kahneman, Jan Brauner*, Sören Mindermann*
(I'm not as 'senior' as the other authors here ;) but this is based on leading the project)
Science, 2024

Mask wearing in community settings reduces SARS-CoV-2 transmission

G Leech, C Rogers-Smith, J Sandbrink, B Snodin, R Zinkov, B Rader, J Brownstein, Y Gal, S Bhatt*, M Sharma*, S Mindermann*, J Brauner*, L Aitchison*
Proceedings of the National Academy of Sciences (PNAS), 2022

A dataset of non-pharmaceutical interventions on SARS-CoV-2 in Europe

G Altman, J Ahuja, JT Monrad, G Dhaliwal, C Rogers-Smith, G Leech, B Snodin, JB Sandbrink, L Finnveden, AJ Norman, SB Oehm, JF Sandkühler, J Kulveit, S Flaxman, Y Gal, S Mishra, S Bhatt, M Sharma*, S Mindermann*, J Brauner*
Nature Scientific Data, 2022

Publications as co-author

Agentic Misalignment: How LLMs Could be an Insider Threat

A Lynch, B Wright, C Larson, KK Troy, SJ Ritchie, S Mindermann, E Perez, E Hubinger
Anthropic research, 2025

The Alignment Problem from a Deep Learning Perspective

R Ngo, L Chan, S Mindermann
International Conference on Learning Representations, 2024

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

Evan Hubinger, Carson Denison, (many others) ... Sören Mindermann, Ryan Greenblatt, Buck Shlegeris, Nicholas Schiefer, Ethan Perez
arXiv, 2024

How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions

L Pacchiardi, AJ Chan, S Mindermann, I Moscovitz, AY Pan, Y Gal, O Evans, J Brauner
International Conference on Learning Representations, 2024

Specific versus general principles for constitutional AI

S Kundu, Y Bai, (many others) ... S Mindermann, N Joseph, S McCandlish, J Kaplan
arXiv, 2024

Quantifying Ignorance in Individual-Level Causal-Effect Estimates under Hidden Confounding

A Jesson, S Mindermann, Y Gal, U Shalit
International Conference on Machine Learning, 2021

Is the cure really worse than the disease? The health impacts of lockdowns during COVID-19

G Meyerowitz-Katz, S Bhatt, O Ratmann, JM Brauner, S Flaxman, S Mishra, M Sharma, S Mindermann, V Bradley, M Vollmer, L Merone, G Yamey
BMJ Global Health, 2021

Seasonal variation in SARS-CoV-2 transmission in temperate climates: A Bayesian modelling study in 143 European regions

Tomáš Gavenčiak, Joshua Teperowski Monrad, Gavin Leech, Mrinank Sharma, Sören Mindermann, Jan Markus Brauner, Samir Bhatt, Jan Kulveit
PLOS Computational Biology, 2022

Policy impact

The International AI Safety Report was part of the official program of the 2025 AI Action Summit hosted by the French government. Numerous policymakers read the report or were briefed on it.
The International AI Safety Report formed the main foundation for the California Report on Frontier AI Policy, commissioned by Governor Newsom.
The UK's new minister of Science, Innovation and Technology (Peter Kyle) cited the International AI Safety Report (2025) in his speech at the Munich Security Conference.
The UK's minister of Science, Innovation and Technology (Michelle Donelan) presented the interim International AI Safety Report (2024) to ministers of other countries and to other high-level representatives at the Seoul AI Summit. I act as the scientific lead of this report.
Lead-author paper on AI risk management was presented to Germany's minister of health, who shared it with Germany's chancellor Olaf Scholz.
Preprint cited in Germany's major federal bill that decided the national lockdown in force as of May 2021 (along with two other cited papers).
My papers on COVID-19 have been presented at the House of Representatives of the Netherlands, the WHO, the modeling groups of the Africa CDC (where I presented) and the UK’s Scientific Advisory Group for Emergencies (SAGE).
I presented work on mask-wearing at the UK Cabinet Office to support the UK's plan for fall 2021.

TV and newspaper interviews

Transformer News interview about new evidence for misalignment, 2025.
Analytics India Magazine interview about this AI safety paper, 2023.
Monitor TV magazine on ARD (German equiv. of BBC, ~3m viewers per episode), 2021. Talked about paper covering COVID’s 2nd wave. Interview also covered on Tagesschau.de and others.
Süddeutsche Zeitung, 2021. Interview about government interventions in COVID’s second wave.
NRC Handelsblad, 2021. Interview about government interventions in COVID’s first wave.
Turing Institute Podcast, 2021. Talked about this paper.
ITV Peston, the flagship political discussion programme of ITV, 2021 (paywalled). Talked about this paper.
DW News, 2021. Talked about this paper.

Invited talks

“International AI Safety Report”, London Initiative for Safe AI, 2025
“International AI Safety Report”, Vector Institute, 2025
“The Alignment Problem from a Deep Learning Perspective”, IBM Research, 2024
“The Alignment Problem from a Deep Learning Perspective”, UC Berkeley (CHAI group), 2023
“The Alignment Problem from a Deep Learning Perspective”, Torr Vision Group, 2023
“The Alignment Problem from a Deep Learning Perspective”, University of Amsterdam, 2023
“The Alignment Problem from a Deep Learning Perspective”, Future of Life Institute, 2023
“The Alignment Problem from a Deep Learning Perspective”, University of Edinburgh, 2023
“Prioritized training on points that are learnable, worth learning, and not yet learned”, Meta AI Research, 2022
“Prioritized training on points that are learnable, worth learning, and not yet learned”, USC (Cutelab), 20
“Prioritized training on points that are learnable, worth learning, and not yet learned”, Cohere.ai, 2021
“Identifying Causal-Effect Inference Failure with Uncertainty-Aware Models”, Vilnius Machine Learning Workshop, 2021
“Effectiveness of government interventions against COVID-19”, University of Oxford—AI for Agent-Based Modeling seminar, 2022
“Effectiveness of government interventions against COVID-19”, Africa CDC modelling consortium, 2020
“Effectiveness of government interventions against COVID-19”, German Centre for Infection Research
“Government interventions in the second wave”, ETH Zürich, 2021
“Government interventions in the second wave”, MRC Centre for Global Infectious Disease Analysis, Imperial College, 2021
“How Robust are the Estimated Effects of Nonpharmaceutical Interventions against COVID-19?”, NeurIPS Spotlight, 2020
“How Robust are the Estimated Effects of Nonpharmaceutical Interventions against COVID-19?”, NeurIPS COVID Symposium Spotlight, 2020