SRI Seminar Series: David Duvenaud, “The big picture of LLM dangerous capability evals”

Our weekly SRI Seminar Series welcomes David Duvenaud, an associate professor in the Department of Computer Science, for a special in-person presentation held jointly with the Department of Computer Science’s C.C. “Kelly” Gotlieb Distinguished Lecture Series and supported in part by a gift from the Webster Family Charitable Giving Foundation. A founding faculty member and Canada CIFAR AI Chair at the Vector Institute, Duvenaud is widely recognized for his contributions to AI safety, probabilistic deep learning, and generative modeling.
Duvenaud’s current research focuses on assessing dangerous capabilities in frontier AI models, mitigating catastrophic risks, and developing institutional frameworks for post-AGI futures. In this talk, he will give an overview of the research he conducted as part of Anthropic’s Alignment Science team on evaluating risks from advanced models, and discuss how to develop more robust methods for aligning AI with human institutions and interests.
Moderator: Sheila McIlraith, Department of Computer Science
Talk title
“The big picture of LLM dangerous capability evals”
Abstract
How can we avoid AI disasters? The plan so far is mostly to check the extent to which AIs could cause catastrophic harms based on tests in controlled conditions. However, there are obvious problems with this approach, both technical and due to their limited scope. I’ll give an overview of the work my team at Anthropic did to evaluate risks due to models feigning incompetence, colluding, or sabotaging human decision-making. I’ll also discuss the idea of “control” techniques, which use AIs to monitor and set traps to look for bad behavior in other AIs. Finally, I’ll outline the main problems beyond the scope of these approaches, in particular that of robustly aligning our institutions to human interests.
Registration
The in-person session of this event is now sold out; virtual registration remains open. If you have not registered and would like to attend in person, please join the waitlist via Eventbrite; unclaimed seats will be released on a first-come, first-served basis shortly before the talk begins.
Venue:
Schwartz Reisman Innovation Campus, University of Toronto, Room W240 (second floor)
108 College Street, Toronto, ON M5G 0C6
The seminar will be broadcast live via Zoom (register for link).
This special in-person session is co-presented with the C.C. “Kelly” Gotlieb Distinguished Lecture Series of the University of Toronto’s Department of Computer Science, which has welcomed top minds to U of T to talk about key issues in computer science for more than a decade. The session is supported in part by a gift from the Webster Family Charitable Giving Foundation.
About David Duvenaud
David Duvenaud is an associate professor in the Departments of Computer Science and Statistical Sciences at the University of Toronto, where he holds a Schwartz Reisman Chair in Technology and Society. A leading voice in AI safety and artificial general intelligence (AGI) governance, Duvenaud currently focuses on evaluating dangerous capabilities in advanced AI systems, mitigating catastrophic risks from future models, and developing institutional designs for post-AGI futures. He is a Canada CIFAR AI Chair, a founding faculty member at the Vector Institute, and a member of Innovation, Science and Economic Development Canada’s Safe and Secure AI Advisory Group. He recently completed an extended sabbatical with the Alignment Science team at Anthropic.
Duvenaud’s early work helped shape the field of probabilistic deep learning, with contributions including neural ordinary differential equations, gradient-based hyperparameter optimization, and generative models for molecular design. He has received numerous honors, including the Sloan Research Fellowship, Ontario Early Researcher Award, and best paper awards at NeurIPS, ICML, and ICFP. Before joining the University of Toronto, Duvenaud was a postdoctoral fellow in the Harvard Intelligent Probabilistic Systems group and completed his PhD at the University of Cambridge under Carl Rasmussen and Zoubin Ghahramani.
About the SRI Seminar Series
The SRI Seminar Series brings together the Schwartz Reisman community and beyond for a robust exchange of ideas that advance scholarship at the intersection of technology and society. Each seminar is led by an established or emerging scholar and features extensive discussion.
Each week, a featured speaker will present for 45 minutes, followed by an open discussion. Registered attendees will be emailed a Zoom link before the event begins. The event will be recorded and posted online.