Distinguished Lecture Series: Been Kim (Google DeepMind) - Alignment and interpretability: how we might get it right
Abstract: The main goal of interpretability is to enable communication between humans and machines, whether the object of that communication is a value, knowledge, or an objective. In this talk, I argue that a better way to enable this communication is for humans to expand what they know and learn new things. Doing so also enables us to expand what machines know, by building better-aligned machines. I share why accounting for the representational gap is crucial to solving the alignment problem, and I provide an example of bridging the knowledge gap.
Speakers
Been Kim
Been Kim is a senior staff research scientist at Google DeepMind. Her research focuses on helping humans communicate with complex machine learning models: 1) building tools to aid humans' collaboration with machines (and to detect when those tools fail), 2) studying machines' general nature, and 3) leveraging machines' knowledge to benefit humans. She gave a talk at the G20 meeting in Argentina in 2019 and keynotes at ICLR 2022 and ECML 2020. Her work on TCAV received the UNESCO Netexplo award and was featured at Google I/O 2019. Her work appears in a chapter of Brian Christian's book The Alignment Problem. She is the General Chair of ICLR 2024, was a Senior Program Chair at ICLR 2023, and serves on the advisory board of TRAILS. She has been a senior area chair at NeurIPS, ICML, ICLR, AISTATS, and other venues for the past few years. She is a steering committee member of the FAccT conference and SATML. She received her PhD from MIT.