image

paper, code

TL;DR

  • task : long-tail object recognition
  • Previous studies focused only on the foreground-background imbalance and did not address class imbalance within the foreground! Under both sigmoid and softmax, rare classes are suppressed by the gradients from negative samples of frequent classes.
  • IDEA : multiply the $\log(p_j)$ term of the sigmoid / softmax loss by a frequency-based weight.
  • architecture : ResNet-50 Mask R-CNN
  • objective : equalization loss (proposed in this paper)
  • baseline : sigmoid, softmax, class-aware sampling, class balanced loss, focal loss
  • data : LVIS v0.5, CIFAR-100-LT, ImageNet-LT
  • result : overall improvement in AP and AP50 over the baselines; rare and frequent classes do worse than the baseline, while common classes do very well
  • contribution : probably the first paper to tackle class imbalance within the foreground?

Details

Motivation

image

The plot on the right shows that for rarer classes, the accumulated gradient from negative samples comes to dominate the gradient from positive samples.
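A back-of-the-envelope count makes this concrete: under a sigmoid loss, every sample that does not contain class $j$ pushes class $j$'s logit down, so a rare class receives far more suppressing updates than positive ones. The sample counts and frequencies below are made up purely for illustration:

```python
# Toy count of positive vs. negative gradient contributions per class under
# a sigmoid loss. Frequencies are hypothetical, chosen only for illustration.
n_samples = 100_000
freq = {"frequent": 0.30, "rare": 0.001}  # fraction of samples containing the class

for name, f in freq.items():
    pos = int(n_samples * f)   # samples where the class is a positive
    neg = n_samples - pos      # every other sample pushes its logit down
    print(f"{name:>8}: {pos:6d} positive vs {neg:6d} negative updates "
          f"(ratio 1:{neg / max(pos, 1):.0f})")
```

For the rare class this gives roughly a 1:999 ratio of positive to suppressing updates, versus about 1:2 for the frequent class.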

Equalization Loss Formulation

image

image

  • $E(r)$ : 1 if region proposal $r$ is foreground, 0 if background
  • $f_j$ : frequency of class $j$
  • Thresholding $T_\lambda(x)$ : 1 if $x < \lambda$, 0 otherwise
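Put together, the sigmoid version of the loss is $L_{EQL} = -\sum_j w_j \big(y_j \log \hat p_j + (1-y_j)\log(1-\hat p_j)\big)$ with $w_j = 1 - E(r)\,T_\lambda(f_j)\,(1-y_j)$: for a foreground proposal, the negative-sample term of rare classes is simply zeroed out. A minimal NumPy sketch (function names are mine):

```python
import numpy as np

def eql_weights(y, f, lam, is_foreground):
    """Per-class weights w_j = 1 - E(r) * T_lambda(f_j) * (1 - y_j).

    y: one-hot ground-truth vector for the proposal (shape [C])
    f: class frequencies f_j (shape [C])
    lam: frequency threshold lambda
    is_foreground: E(r), 1 for a foreground proposal, 0 for background
    """
    T = (f < lam).astype(float)  # T_lambda(f_j): 1 for rare (tail) classes
    return 1.0 - is_foreground * T * (1.0 - y)

def eql_sigmoid_loss(z, y, f, lam, is_foreground):
    """Weighted binary cross-entropy over the class logits z."""
    p = 1.0 / (1.0 + np.exp(-z))  # sigmoid probabilities
    w = eql_weights(y, f, lam, is_foreground)
    return -np.sum(w * (y * np.log(p) + (1 - y) * np.log(1 - p)))
```

So a rare class only receives gradient from proposals where it is the ground truth; it is no longer suppressed by every frequent-class proposal.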

In practice, $\lambda$ is chosen by looking at the Tail Ratio (TR) below: no value is better or worse in absolute terms; it only shifts the trade-off between frequent and rare classes.

image

Softmax Equalization Loss Formulation

image image

  • multiply the weight into the softmax denominator only (the numerator is left unweighted)

image

  • $\beta$ : random variable that is 1 with probability $\gamma$ and 0 with probability $1-\gamma$

Result

image

Adding it improves performance across the board!

image

Better overall than the other long-tail losses, but worse than the sampling methods on the rare and frequent cases. Definitely better than focal loss!

Ablation

image

Higher tail ratio means better performance on frequent classes and worse on rare ones -> $\lambda$ is purely a hyperparameter to tune

image

Ablation on $E(r)$: setting it to 1 for background regions as well makes rare-class performance drop.