January 22, 2021
11:00am - 12:00pm
Donald Bren Hall 6011
Learning compositional, structured, and interpretable models of the world
Despite their fantastic achievements in fields such as Computer Vision and Natural Language Processing, state-of-the-art Deep Learning approaches differ from human cognition in fundamental ways. While humans can learn new concepts from just one or a few examples, and effortlessly extrapolate new knowledge from concepts learned in other contexts, Deep Learning architectures generally rely on large amounts of data for their learning. Moreover, while humans can draw on contextual knowledge, such as the laws of nature or insights into how others reason, this is highly non-trivial for a regular Deep Learning algorithm. To be sure, there are plenty of applications where estimation accuracy is central and regular Deep Learning architectures are well suited. However, models that learn in a more human-like manner have the potential to be more adaptable to new situations, more data-efficient, and more interpretable to humans, a desirable property for Intelligence Augmentation applications with a human in the loop, e.g. medical decision support systems or social robots. In this talk I will describe a number of projects in my group where we explore object affordances, disentanglement, multimodality, and cause-effect representations to build compositional, structured, and interpretable models of the world.
Hedvig Kjellström is a Professor in the Division of Robotics, Perception and Learning at KTH in Stockholm, Sweden. She received an MSc in Engineering Physics and a PhD in Computer Science from KTH in 1997 and 2001, respectively. The topic of her doctoral thesis was 3D reconstruction of human motion from video. Between 2002 and 2006 she worked as a scientist at the Swedish Defence Research Agency, where she focused on Information Fusion and Sensor Fusion. In 2007 she returned to KTH, pursuing research in activity analysis in video. Her present research focuses on methods for enabling artificial agents to interpret human behavior and reasoning, and to behave and reason in ways interpretable to humans. These ideas are applied in the performing arts, healthcare, veterinary science, and smart society. In 2010, she was awarded the Koenderink Prize for fundamental contributions in Computer Vision for her ECCV 2000 article on human motion reconstruction, written together with Michael Black and David Fleet. She has written around 100 papers in the fields of Computer Vision, Machine Learning, Robotics, Information Fusion, Cognitive Science, Speech, and Human-Computer Interaction. She is most active within Computer Vision, where she is an Associate Editor for IEEE TPAMI and regularly serves as Area Chair for the major conferences.