SRI Seminar Series: Owain Evans, “Truthful language models and AI alignment”
In this talk, Evans will present recent work on defining and measuring “truthfulness” in large language models, including their calibration and their ability to forecast world events. These topics will be considered in relation to reducing epistemic harms from AI and to the problem of value alignment for artificial general intelligence.