image

paper

TL;DR

  • task : case-based reasoning
  • problem : The existing ProtoPNet assigns a separate set of prototypes to each class, splits optimization into multiple stages, and relies on negative reasoning (the absence of a prototype) when deciding on a class, so the prototypes become vague.
  • idea : 1) allow prototypes to be shared between classes 2) make the assignment of prototypes to classes differentiable 3) define a focal similarity function so that prototypes are not built from backgrounds, etc.
  • architecture : Looks just like ProtoPNet: CNN => focal similarity with prototypes => soft assignment with Gumbel-Softmax => pooling => classifier
  • objective : CE loss (for learning h) + orthogonality loss
  • baseline : ProtoPNet
  • data : CUB-200-2011, Stanford Cars
  • result : SOTA, capture more salient features
  • contribution : reduce prototypes
  • limitation or part I don’t understand : Fig. 3 doesn’t quite make sense to me… the prototype pool is shared, but there are separate slots for each class, and those prototypes are categorized?

Details

Preliminaries: ProtoPNet (prototypical part network)

This Looks Like That: Deep Learning for Interpretable Image Recognition. The goal is to visualize why an image was classified into a particular class. image

Given an image x, compute the CNN output f(x) of shape H x W x D. At the same time, each of the m prototypes has shape $H_1$ x $W_1$ x D, where $H_1$ and $W_1$ must be smaller than H and W. Since the depth D matches while the height and width are smaller, each prototype can be slid over f(x) like a convolutional filter to produce an activation map.

image
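A minimal NumPy sketch of this patch-matching step, assuming 1 x 1 x D prototypes and ProtoPNet's log-distance similarity (the function name is mine):

```python
import numpy as np

def activation_map(feat, proto, eps=1e-4):
    """Similarity of one prototype to every spatial patch of the CNN output.

    feat  : (H, W, D) CNN output f(x)
    proto : (D,) a single prototype, assuming H1 = W1 = 1
    """
    d2 = ((feat - proto) ** 2).sum(axis=-1)   # squared L2 distance per patch
    return np.log((d2 + 1.0) / (d2 + eps))    # large when the distance is small

feat = np.random.rand(7, 7, 64)
proto = np.random.rand(64)
act = activation_map(feat, proto)
print(act.shape)  # (7, 7)
```

The resulting H x W map is what gets upsampled and overlaid on the input image for visualization.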

The overall ProtoPNet structure is shown above. Training is divided into three phases. (1) Stochastic gradient descent (SGD) of the layers before the last layer learns the prototypes P and the convolutional filters. The loss combines the classification loss with terms on the minimum distance between prototypes and patches of the convolutional output, pulling patches closer to same-class prototypes and pushing them away from other-class prototypes. image
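A hedged sketch of the "closer if same class, farther if different class" terms, stated per image (the function name and this per-image formulation are my assumptions, not the paper's exact code):

```python
import numpy as np

def cluster_sep_costs(min_dist, label, proto_class):
    """Cluster / separation costs for one training image.

    min_dist    : (M,) smallest patch distance of the image to each prototype
    label       : ground-truth class index of the image
    proto_class : (M,) class index each prototype belongs to
    """
    own = proto_class == label
    clst = min_dist[own].min()    # minimized: pull a same-class prototype close
    sep = min_dist[~own].min()    # maximized: push other-class prototypes away
    return clst, sep

min_dist = np.array([0.2, 1.5, 0.9, 2.0])
proto_class = np.array([0, 0, 1, 1])
clst, sep = cluster_sep_costs(min_dist, 0, proto_class)
print(clst, sep)  # 0.2 0.9
```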

(2) Projection of prototypes: each prototype is replaced by the closest patch from a training image of the same class. image

(3) Convex optimization of the last layer: freeze the prototypes and the CNN and learn the weight matrix of h. image

  • The logic behind the model’s classification image

  • How it differs from CAM or partial attention image

Motivation

image

Architecture

image

Each class has K slots, so shared prototypes from a common pool can be assigned to them.

Given an image x, the CNN produces the output f(x) of shape H x W x D (= f(x)). This can be interpreted as H x W vectors in D dimensions. The similarity between these D-dimensional vectors and the prototype assigned to the k-th slot determines that slot's activation.

Focal similarity

Previous studies such as ProtoPNet computed similarity as follows image

image

However, this can lead to (1) all patches z of f(x) being similar to a prototype, i.e., the prototype focusing only on the background, and (2) the gradient flowing only through the most strongly activated patches. To avoid this, the authors propose focal similarity image

image
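A minimal sketch of the focal-similarity idea — max activation minus mean activation — so a prototype that lights up everywhere (e.g., on background) scores near zero, while one that fires on a single salient patch scores high (the function name is mine):

```python
import numpy as np

def focal_similarity(act):
    # act: (H, W) activation map of one prototype over all patches
    return act.max() - act.mean()

flat = np.full((7, 7), 3.0)                    # background-like: uniform activation
peaked = np.zeros((7, 7))
peaked[3, 3] = 3.0                             # salient: one strong patch
print(focal_similarity(flat), focal_similarity(peaked))  # 0.0 vs. ~2.94
```

Note that both maps have the same maximum, but only the peaked one survives the mean subtraction.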

Assigning one prototype per slot

Gumbel-Softmax is used to soften the prototype assignment instead of making it hard, allowing the gradient to flow

image
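A small forward-pass sketch of the Gumbel-Softmax trick in plain NumPy (in a real PyTorch implementation one would use `torch.nn.functional.gumbel_softmax`; the function name and temperature here are my choices):

```python
import numpy as np

def gumbel_softmax(logits, tau=0.5, seed=0):
    """Soft, differentiable sample from a categorical over the prototype pool.

    logits : (M,) a slot's unnormalized scores over the M prototypes
    tau    : temperature; lower values approach a hard one-hot assignment
    """
    rng = np.random.default_rng(seed)
    u = rng.uniform(1e-9, 1.0 - 1e-9, size=logits.shape)
    g = -np.log(-np.log(u))                 # Gumbel(0, 1) noise
    y = np.exp((logits + g) / tau)
    return y / y.sum()

q = gumbel_softmax(np.array([2.0, 0.5, 0.1]))
print(q, q.sum())  # a near-one-hot distribution summing to 1
```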

An orthogonality loss is added so that multiple slots do not end up with the same prototype. image
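A hedged sketch of such an orthogonality loss: penalize the cosine similarity between a class's slot distributions q_k, so different slots are pushed toward different prototypes (the averaging over pairs is my assumption):

```python
import numpy as np

def orthogonality_loss(Q, eps=1e-9):
    """Q: (K, M) slot-to-prototype distributions q_k for one class."""
    Qn = Q / (np.linalg.norm(Q, axis=1, keepdims=True) + eps)
    S = Qn @ Qn.T                       # pairwise cosine similarities
    K = Q.shape[0]
    # average off-diagonal similarity; 0 when all slots are orthogonal
    return (S.sum() - np.trace(S)) / (K * (K - 1))

Q_same = np.array([[1.0, 0.0], [1.0, 0.0]])   # both slots pick prototype 0
Q_diff = np.array([[1.0, 0.0], [0.0, 1.0]])   # slots pick different prototypes
loss_same = orthogonality_loss(Q_same)
loss_diff = orthogonality_loss(Q_diff)
print(loss_same, loss_diff)  # ~1.0 vs. ~0.0
```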