Grad-CAM: Why did you say that? Visual Explanations from Deep Networks via Gradient-based Localization

Abstract

We propose a technique for making CNN-based models more transparent by visualizing the input image regions that are important for their predictions, producing visual explanations. Our approach, called Gradient-weighted Class Activation Mapping (Grad-CAM), uses the class-specific gradient information flowing into the final convolutional layer of a CNN to produce a coarse localization map of the regions in the image important for each class. Grad-CAM is a strict generalization of Class Activation Mapping (CAM). Unlike CAM, Grad-CAM is broadly applicable to any CNN-based architecture and needs no re-training. We show how Grad-CAM may be combined with pixel-space visualizations (such as Guided Backpropagation) to create a high-resolution class-discriminative visualization (Guided Grad-CAM). We generate Grad-CAM and Guided Grad-CAM visualizations to better understand off-the-shelf image classification, image captioning, and visual question answering (VQA) models, including ResNet-based architectures. In the context of image classification models, our visualizations (a) lend insight into these models' failure modes, and (b) outperform pixel-space gradient visualizations on the ILSVRC-15 weakly-supervised localization task. For image captioning and VQA, our visualizations expose the somewhat surprising insight that common CNN+LSTM models are good at localizing discriminative input image regions despite not being trained on grounded image-text pairs. Finally, through human studies we show that our explanations help users establish trust in the predictions made by deep networks. Interestingly, we find that Guided Grad-CAM helps untrained users successfully discern a stronger deep network from a weaker one even when both make identical predictions. Our code is available at github.com/ramprs/grad-cam/ and a demo is available at gradcam.cloudcv.org. Video of the demo can be found at youtu.be/COjUB9Izk6E.
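
To make the idea concrete, the following is a minimal sketch of the Grad-CAM computation described above: back-propagate the score of one class to the final convolutional layer, global-average-pool those gradients to obtain per-channel weights, and apply a ReLU to the weighted combination of feature maps. This is an illustrative PyTorch sketch using a torchvision ResNet-50 and hooks on its last convolutional stage; the layer choice, function names, and normalization here are assumptions for exposition, not the authors' reference implementation (which is at github.com/ramprs/grad-cam/).

import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()

activations, gradients = {}, {}

def forward_hook(module, inp, out):
    activations["value"] = out                     # feature maps A^k of the last conv stage

def backward_hook(module, grad_in, grad_out):
    gradients["value"] = grad_out[0]               # gradients dy^c / dA^k

# Hook the final convolutional stage (layer4 for ResNet-50) -- an assumed choice.
model.layer4.register_forward_hook(forward_hook)
model.layer4.register_full_backward_hook(backward_hook)

def grad_cam(image, class_idx):
    """image: (1, 3, H, W) normalized tensor; returns an (H, W) heatmap in [0, 1]."""
    logits = model(image)
    model.zero_grad()
    logits[0, class_idx].backward()                # back-propagate only the target class score

    A = activations["value"]                       # (1, K, h, w)
    dA = gradients["value"]                        # (1, K, h, w)
    weights = dA.mean(dim=(2, 3), keepdim=True)    # alpha_k: global-average-pooled gradients
    cam = F.relu((weights * A).sum(dim=1, keepdim=True))   # ReLU of the weighted combination
    cam = F.interpolate(cam, size=image.shape[2:], mode="bilinear", align_corners=False)
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    return cam[0, 0]

The resulting coarse heatmap is upsampled to the input resolution and overlaid on the image; Guided Grad-CAM is obtained by pointwise multiplying this map with a Guided Backpropagation visualization to recover high-resolution, class-discriminative detail.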
