image

paper

TL;DR

  • I read this because.. : figured I should know this one if I'm going to work on explanation, so I read it
  • task : explainability in CNN
  • problem : attach an interpretable module that can be applied to any kind of CNN
  • idea : take the gradient of the score $y^c$ of the class we want to visualize with respect to the convolutional activation maps $A^k$, apply GAP to get a per-channel importance, then compute a weighted sum of the $A^k$ with those weights followed by ReLU.
  • input/output : {image, class or caption or answer} -> activation map
  • architecture : VGG-16, AlexNet, GoogLeNet
  • objective : X
  • baseline : CAM, Guided-BackProp, c-MWP
  • data : ILSVRC-15, PASCAL VOC 2007
  • evaluation : weakly-supervised semantic segmentation (WSSS), human evaluation, pointing game
  • result : strong explanatory power with no loss in accuracy (CAM does lose accuracy). Gives good seeds for WSSS. Also visualizes adversarial samples well. Humans were shown the highlighted regions and asked to classify the class (trustworthiness), and were also asked which they preferred against Guided-backprop or Deconv
  • contribution : a simple idea that became the de-facto method, with no loss in accuracy
  • etc. : the convention of ignoring negative gradients seems to have come from here. Should read guided backprop and Network Dissection. Picked up the term "counterfactual explanation"

Details

proposed

image

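Differentiate the logit $y^c$ (pre-softmax) of the class c we want to visualize with respect to the activation feature map $A^k_{ij}$. Global-average-pool this over width and height (i, j) to get the importance of channel k: $\alpha_k^c = \frac{1}{Z} \sum_i \sum_j \frac{\partial y^c}{\partial A^k_{ij}}$, where $Z$ is the number of spatial positions.

Take the weighted sum of the activation maps with these importances, then apply ReLU, and you get Grad-CAM: $L^c_{\text{Grad-CAM}} = \mathrm{ReLU}\left(\sum_k \alpha_k^c A^k\right)$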

This uses the conv feature map of the last layer (14 x 14 size); earlier layers give worse results. The reason for the ReLU is that pixels with a negative influence most likely belong to other categories. Without the ReLU, classes other than the desired class $y^c$ sometimes lit up and localization performance dropped.
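The pooling, weighted sum and ReLU described above can be sketched in a few lines. This is only a minimal illustration, not the paper's code: it assumes the gradients of $y^c$ with respect to the last conv feature maps have already been obtained from some autograd framework, and it uses plain nested lists so it runs without dependencies. The names `grad_cam`, `activations` and `gradients` are made up for this sketch.

```python
def grad_cam(activations, gradients):
    """Coarse Grad-CAM heatmap from K feature maps and their gradients.

    activations[k][i][j] is A^k_ij; gradients[k][i][j] is assumed to hold
    d y^c / d A^k_ij, computed elsewhere (hypothetical inputs).
    """
    height, width = len(activations[0]), len(activations[0][0])
    # Channel importance: global average pooling of the gradient over (i, j).
    alphas = [sum(map(sum, grad)) / (height * width) for grad in gradients]
    # Weighted sum over channels, then ReLU to keep only positive evidence.
    return [
        [max(0.0, sum(a * act[i][j] for a, act in zip(alphas, activations)))
         for j in range(width)]
        for i in range(height)
    ]


# Toy example: 2 channels of size 2 x 2.
acts = [[[1.0, 2.0], [0.0, 1.0]],
        [[2.0, 0.0], [1.0, 3.0]]]
grads = [[[0.5, 0.5], [0.5, 0.5]],
         [[-1.0, -1.0], [-1.0, -1.0]]]
print(grad_cam(acts, grads))  # [[0.0, 1.0], [0.0, 0.0]]
```

In the toy example the second channel has a negative importance, so every position where it dominates is clipped to zero by the ReLU, which is the behavior the note above attributes to pixels belonging to other categories.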

  • guided grad-cam The 14 x 14 feature map tells you roughly where the model is looking, but it cannot give a fine-grained explanation of why this is a "tiger cat". So they take guided backpropagation (Striving for Simplicity: The All Convolutional Net, https://arxiv.org/abs/1412.6806 ) and multiply the two together for visualization. Deconv could be used instead, but empirically guided backprop worked better. The paper says of guided backprop that "negative gradients are suppressed"; should read up on what that means
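A rough sketch of the two pieces mentioned above, under stated assumptions. As I understand the guided backprop rule, a ReLU passes a gradient backward only where both the forward input and the incoming gradient are positive, which is what "negative gradients are suppressed" refers to. The fusion step upsamples the coarse map to the input resolution and multiplies element-wise; the paper upsamples with bilinear interpolation, while this sketch swaps in nearest-neighbor to stay short. All function names are hypothetical.

```python
def guided_relu_backward(forward_input, upstream_grad):
    """Guided backprop through one ReLU, element-wise on flat lists.

    A gradient survives only if the forward input was positive (ordinary
    ReLU backward) AND the incoming gradient is positive (the "guided" part).
    """
    return [g if (x > 0 and g > 0) else 0.0
            for x, g in zip(forward_input, upstream_grad)]


def upsample_nearest(cam, out_h, out_w):
    """Nearest-neighbor upsampling (a simplification of bilinear)."""
    h, w = len(cam), len(cam[0])
    return [[cam[i * h // out_h][j * w // out_w] for j in range(out_w)]
            for i in range(out_h)]


def guided_grad_cam(guided_backprop, cam):
    """Element-wise product of a guided-backprop map and the upsampled CAM."""
    out_h, out_w = len(guided_backprop), len(guided_backprop[0])
    upsampled = upsample_nearest(cam, out_h, out_w)
    return [[gb * c for gb, c in zip(gb_row, cam_row)]
            for gb_row, cam_row in zip(guided_backprop, upsampled)]


print(guided_relu_backward([1.0, -1.0, 2.0], [0.5, 0.5, -0.5]))  # [0.5, 0.0, 0.0]
```

The product keeps the pixel-level detail of guided backprop only inside the region the coarse map marks as relevant to the class.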

  • counterfactual explanation image

image

Simply negate the gradients and then apply the ReLU as before (only the regions that push the score for this class down will remain), and you get a counterfactual explanation. An explanation of why these pixels argue against this class!
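The sign flip can be shown with the same kind of toy input. A minimal sketch, assuming the gradients of $y^c$ with respect to the feature maps are already available; `counterfactual_cam` and its arguments are hypothetical names, and nested lists stand in for tensors.

```python
def counterfactual_cam(activations, gradients):
    """Counterfactual map: Grad-CAM computed with negated gradients.

    activations[k][i][j] is A^k_ij; gradients[k][i][j] is assumed to hold
    d y^c / d A^k_ij, computed elsewhere (hypothetical inputs).
    """
    height, width = len(activations[0]), len(activations[0][0])
    # Negate before pooling, so channels that lower y^c get positive weight.
    alphas = [-sum(map(sum, grad)) / (height * width) for grad in gradients]
    # Same weighted sum + ReLU as the regular map.
    return [
        [max(0.0, sum(a * act[i][j] for a, act in zip(alphas, activations)))
         for j in range(width)]
        for i in range(height)
    ]


acts = [[[1.0, 2.0], [0.0, 1.0]],
        [[2.0, 0.0], [1.0, 3.0]]]
grads = [[[0.5, 0.5], [0.5, 0.5]],
         [[-1.0, -1.0], [-1.0, -1.0]]]
print(counterfactual_cam(acts, grads))  # [[1.5, 0.0], [1.0, 2.5]]
```

On this input the only position that is zero here is the one position a regular Grad-CAM map would highlight, so the two maps pick out complementary regions.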

Result

  • classification result
    image

  • result on captioning model image

  • textual explanation on neuron image

Network Dissection: Quantifying Interpretability of Deep Visual Representations https://arxiv.org/abs/1704.05796 should read this

  • result with adversarial noise image

An example where adding a slight perturbation to the image makes the model predict airliner with 0.9999. Even so, Grad-CAM still works well.