SAFARI Seminar: Skanda Koppula, September 14, 2023

Join us for our upcoming SAFARI Seminar:

Speaker: Skanda Koppula, Google DeepMind

Date: Thursday, September 14, 2023, 14:00 Zurich time (CEST)
Where: ETZ E9

Title: An Introduction to Multimodal Understanding: Building Models to See, Hear, and Read the World

Modern language models like ChatGPT, Claude, LLAMA, and Bard demonstrate broadly useful capabilities, largely powered by improved techniques for scaling data and models. This current generation of models is language-only. What might be required for the next frontier of large-model intelligence: general understanding of any multimodal content? I will talk about the basics of models that handle rich multimodal content, and prior work that attempts to weave together image, text, audio, and video understanding. Finally, I will touch on possible opportunities for computer architecture researchers, and on the strong synergy between these large models and the underlying hardware that enables them to be served at scale.

Speaker Bio:
Skanda Koppula is a research engineer and technical team lead at Google DeepMind. He is broadly interested in multimodal learning (focusing on video, image, and language understanding) and has previously worked on topics in computer architecture and security. He has also worked with SAFARI and Professor Onur Mutlu at ETH Zürich, and studied at MIT, where he obtained his BS/MEng.

