
paper: Learning to Compare: Relation Network for Few-Shot Learning (Sung et al., CVPR 2018)

TL;DR

problem: In image classification, we want a new class with only a few labeled examples to be classified well without any fine-tuning (few-shot classification).

solution: 1) Use episode-based training: each episode samples C classes and splits their examples into a support set and a query set, mimicking the few-shot setting at test time. 2) Train an embedding module that extracts features from each image, plus a relation module that takes a concatenated (support, query) feature pair and outputs a relation score in [0, 1]. 3) Regress the relation score with an MSE loss against a target of 1 when the query and support come from the same class, and 0 otherwise.

result: A simple, unified, and effective architecture that covers both few-shot and zero-shot learning while improving few-shot performance.
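The score-and-regress objective in step 3 can be sketched as follows. This is a minimal illustration, not the paper's CNN architectures: the feature size, MLP widths, and random "embeddings" are all assumed for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
FEAT, C, Q = 16, 5, 10  # assumed embedding size, way count, query count

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical tiny relation module: an MLP that maps a concatenated
# (support, query) feature pair to a relation score in (0, 1).
W1 = rng.normal(size=(2 * FEAT, 8)) * 0.1
W2 = rng.normal(size=(8, 1)) * 0.1

def relation_score(support_feat, query_feat):
    pair = np.concatenate([support_feat, query_feat], axis=-1)
    h = np.maximum(pair @ W1, 0.0)       # ReLU hidden layer
    return sigmoid(h @ W2).squeeze(-1)   # score in (0, 1)

# One C-way, 1-shot episode: one support embedding per class.
support = rng.normal(size=(C, FEAT))
query = rng.normal(size=(Q, FEAT))
query_labels = rng.integers(0, C, size=Q)

# Score every (query, class) pair -> (Q, C) matrix of relation scores.
scores = np.stack([relation_score(np.broadcast_to(s, query.shape), query)
                   for s in support], axis=1)

# Target relation is 1 for the matching class, 0 otherwise; loss is MSE.
targets = np.eye(C)[query_labels]
loss = np.mean((scores - targets) ** 2)
print(scores.shape, float(loss))
```

Because the objective is regression to 0/1 rather than a softmax over classes, each (query, class) pair is scored independently, which is what lets the same module handle an arbitrary number of support classes.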

Details

episode training

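Episode construction can be sketched like this; the dataset layout, class count, and shot/query counts are hypothetical, chosen only to make the sampling logic concrete:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dataset: 20 classes, 30 items each (string ids stand in
# for images).
num_classes, per_class = 20, 30
dataset = {c: [f"img_{c}_{i}" for i in range(per_class)]
           for c in range(num_classes)}

def sample_episode(dataset, C=5, K=1, Q=5):
    """Build one C-way K-shot episode: K support and Q query items per
    sampled class, disjoint within the episode."""
    classes = rng.choice(list(dataset), size=C, replace=False)
    support, query = [], []
    for label, c in enumerate(classes):
        items = rng.permutation(dataset[c])
        support += [(item, label) for item in items[:K]]
        query += [(item, label) for item in items[K:K + Q]]
    return support, query

support, query = sample_episode(dataset)
print(len(support), len(query))  # 5 support items, 25 query items
```

Each training episode therefore looks exactly like a few-shot test task, which is the point of episode training: the model is optimized under the same conditions it will be evaluated in.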

model architecture


zero-shot learning

Zero-shot is similar to one-shot in that we are given a single support vector for each class C, but unlike one-shot, the support is not an image: it is a semantic class embedding (e.g., attribute/text information in the case of the CUB dataset). Because only the support modality changes, the same architecture can be applied to ZSL by using a separate encoder for the semantic embedding, while query images pass through the image encoder as before.

why effective?

Previous approaches were limited because they only learned the feature embedding while keeping the comparison metric fixed to Euclidean or cosine distance. The Relation Network instead learns the metric itself (the relation module) jointly with the embedding, end-to-end.
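A toy contrast between a fixed metric and a learnable one; the single-layer scorer here is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
f_query, f_support = rng.normal(size=16), rng.normal(size=16)

# Fixed metrics: no trainable parameters, so they cannot adapt to the data.
euclidean = -np.linalg.norm(f_query - f_support)
cosine = (f_query @ f_support
          / (np.linalg.norm(f_query) * np.linalg.norm(f_support)))

# Learned metric: the score depends on weights W that gradient descent can
# tune jointly with the feature extractor (illustrative one-layer scorer).
W = rng.normal(size=(32, 1)) * 0.1
learned = float(np.concatenate([f_query, f_support]) @ W)
print(euclidean, cosine, learned)
```

With a fixed metric, all the adaptation burden falls on the embedding; making the metric trainable gives the model a second degree of freedom to decide what "similar" means for the task.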

result