This event has passed.
SRI Seminar Series: Saffron Huang, “Beyond the benchmark”

Our weekly SRI Seminar Series welcomes Saffron Huang, a technologist, researcher, and writer whose work bridges artificial intelligence, democratic governance, and the societal structures that shape how technologies evolve. A research scientist on the Societal Impacts team at Anthropic and co-founder of the Collective Intelligence Project, Huang examines how AI systems influence human behaviour, how institutional choices direct technological development, and what it takes to build systems that genuinely reflect collective values.
Talk title
“Beyond the benchmark”
Abstract
The current canonical approach to evaluating AI systems uses static and often single-turn benchmarks that fail to capture how people actually use and are affected by AI in practice. As AI becomes embedded in daily work and life, this disconnect between evaluation methods and real-world use becomes increasingly consequential. Our standard benchmarks can’t capture impacts like cognitive overreliance on AI, even though overreliance is a major area of concern among the public (as the non-profit I co-founded, the Collective Intelligence Project (CIP), discovered in 2023 when we asked a representative sample of 1,000 Americans what their top AI risk was).
We’ve been searching for our keys only under the streetlight. Many researchers have recognized the gap between academic benchmarks and the role we want them to play in forecasting AI’s effects, but few have built the methods to address it.
This talk motivates the theoretical need for AI measurement instruments that can go beyond static evaluations. It then presents a series of concrete tools that my colleagues and I on Anthropic’s Societal Impacts team have developed to better study AI “in the wild”—and what those tools have revealed. These include Clio, a method for privacy-preserving analysis of millions of real conversations, and Anthropic Interviewer, which uses AI to conduct qualitative interviews at scale. Together, these tools have enabled new discoveries: a taxonomy of 3,307 values that AI systems express in context; an index tracking economic activity performed with Claude; and detailed accounts of how AI is reshaping work for engineers, scientists, and creative professionals. A throughline across this research is that we can use AI to augment how we do science, including to study AI’s impacts (very meta!).
I will assert that building better instruments for understanding AI’s societal effects is a prerequisite for meaningful governance. If we can’t see what’s happening, we can’t respond to it.
Moderator: Anna Su, Faculty of Law
Registration
To register for the event, visit the official event page.
About Saffron Huang
Saffron Huang is a technologist, researcher, and writer whose work sits at the intersection of artificial intelligence, democratic governance, and societal well-being. She is a research scientist on the Societal Impacts team at Anthropic, where she examines how AI systems behave, how they can be made more transparent, and how their development can better reflect collective values. As the co-founder of the Collective Intelligence Project, she has helped pioneer efforts to make technology and AI development more democratic, exploring new institutional models that broaden public participation in key governance decisions.
Her writing and research focus on understanding technology as a set of contingent choices—often shaped by undemocratic processes—that exert profound influence on society. Through this work, she investigates how to design AI systems and institutions that augment human capabilities, empower communities, and support healthier shared futures. Across her roles, she is driven by a central question: how can we steer AI to help us live together better and bring out the best in humanity?
About the SRI Seminar Series
The SRI Seminar Series brings together the Schwartz Reisman community and beyond for a robust exchange of ideas that advance scholarship at the intersection of technology and society. Each seminar is led by a leading or emerging scholar and features extensive discussion.
Each week, a featured speaker will present for 45 minutes, followed by an open discussion. Registered attendees will be emailed a Zoom link before the event begins. The event will be recorded and posted online.