[47] Recovering the Unbiased Scene Graphs from the Biased Ones

paper

TL;DR

task : Scene Graph Generation
problem : SGG annotations follow a long-tail distribution: a few relations appear very often, while many relations are left unlabeled.
idea : Treat the problem as Positive-Unlabeled (PU) learning and divide the predicted score for each class by that class's label frequency.
architecture : object detector + GNN?
objective : cross entropy loss
baseline : MOTIFS, …
data : Visual Genome, Visual Genome150
result : Appears to be the current SOTA on VG150 for SGDet.
contribution : Addresses the long-tail issue.
Limitations or things I don’t understand :

Details

Recovering the Unbiased Scene Graph

s : labeled predicate
y : true predicate
r : target predicate

unbiased probability

If we assume that the probability of being labeled is independent of x (Selected Completely at Random, SCAR), we can write

p(s=r|y=r) is, in the end, the fraction of examples whose true class is r that are actually labeled r, i.e., the label frequency of class r.
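
In the notation above, the SCAR assumption gives the standard PU-learning factorization. The block below is a sketch of that identity written out step by step, not a transcription of the paper's equation; the first line uses the fact that an example labeled r must truly be r.

```latex
\begin{aligned}
p(s=r \mid x) &= p(s=r \mid y=r, x)\, p(y=r \mid x) \\
              &= p(s=r \mid y=r)\, p(y=r \mid x) \quad \text{(SCAR)} \\
\Rightarrow\quad p(y=r \mid x) &= \frac{p(s=r \mid x)}{p(s=r \mid y=r)}
\end{aligned}
```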

Dynamic Label Frequency Estimation

Get an estimate for p(s=r|y=r) above, i.e., the label frequency.

This expression is derived from

In the end, every prediction is simply divided by the label frequency of its class -.-
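
A minimal sketch of the two steps, assuming the usual PU-learning estimator (average the biased probability p(s=r|x) over examples labeled r) and a final renormalization that is added here only for readability. The function names and toy numbers are hypothetical and do not come from the paper.

```python
def estimate_label_frequency(biased_probs, labels, num_classes):
    """Estimate p(s=r|y=r) as the mean of p(s=r|x) over examples labeled r."""
    sums = [0.0] * num_classes
    counts = [0] * num_classes
    for probs, r in zip(biased_probs, labels):
        sums[r] += probs[r]
        counts[r] += 1
    # Assumption: classes with no labeled example get 1.0 (no correction).
    return [sums[r] / counts[r] if counts[r] else 1.0 for r in range(num_classes)]


def recover_unbiased(probs, label_freq):
    """Divide each class probability by its label frequency, then renormalize."""
    scaled = [p / c for p, c in zip(probs, label_freq)]
    total = sum(scaled)
    return [v / total for v in scaled]


# Toy data: class 0 stands for a head predicate, class 1 for a tail predicate.
biased = [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4]]
labels = [0, 0, 1]
freq = estimate_label_frequency(biased, labels, num_classes=2)
unbiased = recover_unbiased([0.6, 0.4], freq)
```

In this toy case the tail class has the lower label frequency, so dividing by it lifts the tail class above the head class for the last example.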

However, a post-training estimate has to be computed before inference, which is inconvenient, and
for SGDet there are no ground-truth bounding boxes, so valid examples for the estimate are hard to obtain.

So data augmentation is used to obtain valid examples for the tail classes, and the label frequency is estimated batch by batch during training. This idea is called Dynamic Label Frequency Estimation (DLFE).
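
The batch-by-batch estimate can be sketched as a running average per class. This is only an illustration under assumptions: the exponential moving average, the `momentum` value, and the class name `DynamicLabelFrequency` are hypothetical choices made here, and the augmentation step is not modeled.

```python
class DynamicLabelFrequency:
    """Running per-class estimate of p(s=r|y=r), updated once per batch."""

    def __init__(self, num_classes, momentum=0.9):
        self.momentum = momentum          # hypothetical default
        self.freq = [None] * num_classes  # None until a class is first seen

    def update(self, batch_probs, batch_labels):
        # Per-class mean of p(s=r|x) over the examples labeled r in this batch.
        sums, counts = {}, {}
        for probs, r in zip(batch_probs, batch_labels):
            sums[r] = sums.get(r, 0.0) + probs[r]
            counts[r] = counts.get(r, 0) + 1
        for r, total in sums.items():
            batch_mean = total / counts[r]
            if self.freq[r] is None:
                self.freq[r] = batch_mean
            else:
                m = self.momentum
                self.freq[r] = m * self.freq[r] + (1 - m) * batch_mean

    def get(self, r, default=1.0):
        # Classes never seen so far fall back to `default` (no correction).
        return default if self.freq[r] is None else self.freq[r]


est = DynamicLabelFrequency(num_classes=3, momentum=0.5)
est.update([[0.7, 0.2, 0.1], [0.5, 0.4, 0.1]], [0, 1])
est.update([[0.9, 0.05, 0.05]], [0])
```

Because the estimate is maintained during training, no separate estimation pass is needed before inference.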
