Mathematics
Dr. Mukund Sundararajan
Google
Deep networks have recently had remarkable success in a variety of tasks. For instance, they can identify objects in images, perform language translation, enable web search, and assist in medical diagnosis, all with surprising accuracy. Despite this, their inner workings largely remain a black box to humans. An overarching question is: why did the network make a particular prediction?
In this talk, we will focus on the problem of understanding individual predictions made by a deep network. We will discuss a technique for attributing a prediction to its input features. We will then discuss:
- Applications of the technique to a variety of networks (object recognition, text categorization, diabetic retinopathy, etc.).
- Implementation of the technique in a few lines of code (a sketch follows this list).
- How to interpret and visualize the attributions.
- An axiomatic justification of the technique.
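The technique from the ICML 2017 paper is integrated gradients: the attribution to feature i is (x_i - x'_i) times the integral, for alpha from 0 to 1, of the partial derivative of the network output F with respect to x_i, evaluated at x' + alpha * (x - x'), where x is the input and x' is a baseline input. As a rough illustration of the "few lines of code" item above, here is a minimal NumPy sketch that approximates the integral with a Riemann sum. The function names, the all-zeros baseline, the step count, and the toy quadratic model are illustrative assumptions, not the authors' actual code.

    import numpy as np

    def integrated_gradients(grad_fn, x, baseline, steps=50):
        """Approximate integrated gradients with a right Riemann sum.

        grad_fn  -- maps an input array to the gradient of the model
                    output with respect to that input
        x        -- input to attribute
        baseline -- reference input (e.g. an all-zeros vector)
        steps    -- number of interpolation steps along the path
        """
        alphas = np.linspace(0.0, 1.0, steps + 1)[1:]  # skip alpha = 0
        total = np.zeros_like(x, dtype=float)
        for a in alphas:
            # Gradient at a point on the straight line from baseline to x.
            total += grad_fn(baseline + a * (x - baseline))
        avg_grad = total / steps
        # Scale the path-averaged gradients by the input difference.
        return (x - baseline) * avg_grad

    # Toy model (illustrative assumption): F(x) = sum(x_i^2), so the
    # gradient is 2 * x and the true integrated gradients are x_i^2.
    grad_F = lambda x: 2.0 * x
    x = np.array([1.0, 2.0, 3.0])
    attr = integrated_gradients(grad_F, x, np.zeros_like(x), steps=200)
    print(attr, attr.sum())  # attr.sum() is close to F(x) - F(0) = 14

The last line hints at the axiomatic justification discussed in the talk: the attributions add up (approximately, due to the discretization) to the difference between the network's output at the input and at the baseline.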
We will conclude with a general discussion on what "understandability" means for deep networks.
This is joint work with Ankur Taly and Qiqi Yan. A paper based on this work was recently accepted at ICML 2017.