
TL;DR
- task : long-tailed object recognition
- Previous studies focused only on foreground-vs-background imbalance and did not address class imbalance within the foreground! With either sigmoid or softmax, rare classes are suppressed by the gradient coming from negative samples of frequent classes.
- IDEA : multiply the $log(p_j)$ term of the sigmoid / softmax loss by a frequency-based weight.
- architecture : ResNet-50 Mask R-CNN
- objective : equalization loss (proposed in this paper)
- baseline : sigmoid, softmax, class-aware sampling, class balanced loss, focal loss
- data : LVIS v0.5, CIFAR-100-LT, ImageNet-LT
- result : overall improvement in AP and AP50 over the baselines. Rare and frequent classes do somewhat worse than the baseline, while common classes do very well.
- contribution : probably the first paper to tackle class imbalance within the foreground?
Details
Motivation

As the figure (omitted here) shows, the rarer the class, the more the gradient from negative samples dominates the gradient from positive samples.
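A quick back-of-the-envelope illustration of this (my own sketch with hypothetical sample counts, not numbers from the paper): with a sigmoid classifier, every proposal of class $k$ contributes a negative gradient to every other class $j \ne k$, so under a long-tailed distribution a rare class sees negative updates far more often than positive ones.

```python
# Hypothetical long-tailed sample counts per class (illustration only).
counts = {"frequent": 10000, "common": 500, "rare": 20}
total = sum(counts.values())

for cls, n_pos in counts.items():
    # Every other class's samples act as negatives for `cls`.
    n_neg = total - n_pos
    print(f"{cls:>8}: pos={n_pos:>6} neg={n_neg:>6} neg/pos={n_neg / n_pos:.1f}")
```

The neg/pos ratio explodes for the rare class, which is exactly the gradient imbalance the equalization loss tries to suppress.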
Equalization Loss Formulation
$$L_{EQL} = -\sum_{j=1}^{C} w_j \log(\hat{p}_j), \qquad w_j = 1 - E(r)\, T_\lambda(f_j)\,(1 - y_j)$$
- $E(r)$ : 1 if proposal $r$ is foreground, 0 if background
- $f_j$: frequency of class j
- Thresholding $T_\lambda(x)$ : 1 if $x < \lambda$, 0 otherwise
$\lambda$ is chosen by looking at the Tail Ratio (TR) below; no value of $\lambda$ is better or worse in absolute terms, it just trades off frequent vs. rare performance depending on the value.

Softmax Equalization Loss Formulation

- multiply weight by the denominator only

- $\beta$: Random variable that is 1 with probability $\gamma$ and 0 with probability $1-\gamma
Result

Adding it improves performance across the board!

Better overall compared to other long-tail losses, but worse than sampling methods for rare, frequent cases Definitely better than Focal!
Ablation
Higher tail ratio means better for frequent classes and worse for rare -> $\lambda$ is fully hyperparametric

Ablation for E(r), replacing 1 if background. rare becomes bad.