
TL;DR
- task : probabilistic object detection
- problem : bbox prediction distribution based on NLL loss tends to have high entropy regardless of whether the bbox is correct.
- idea : use energy score instead of NLL loss -> lower entropy, better calibrated
- architecture : RetinaNet, Faster-RCNN, DETR
- objective : Energy Score
- baseline : NLL loss, Direct Moment Matching (DMM)
- data : COCO, Open Images
- evaluation : Proposed a new metric to replace mAP. Among the bboxes matched to a GT, a bbox with IoU < 0.1 is a false positive and one with IoU between 0.1 and 0.5 is a localization error; if multiple matched bboxes have IoU above 0.5, the one with the highest class score is the true positive and the rest are counted as duplicates. As in mAP, the final value is averaged over IoU thresholds from 0.5 to 0.95. Mean Calibration Error (MCE) and a regression Calibration Error (CE) are also reported.
- result : better calibrated, lower entropy, higher quality predictive distribution
- contribution : proposes a new evaluation metric for probabilistic detectors
- limitation or something I don’t understand : what is the difference between a local rule and a non-local rule? and why exactly is it bad for entropy to be high…
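The matching rule from the evaluation bullet above can be sketched as follows (a minimal sketch; the function name and input format are mine, only the IoU thresholds 0.1 / 0.5 and the highest-class-score tie-break come from the paper):

```python
def classify_matches(dets):
    """Classify detections matched to a single ground-truth box.

    dets: list of (iou, class_score) pairs for one GT.
    Rule (from the note): IoU < 0.1 -> false positive,
    0.1 <= IoU < 0.5 -> localization error, IoU >= 0.5 -> the
    highest-scoring one is the true positive, the rest duplicates.
    """
    # indices of detections that count as real matches (IoU >= 0.5)
    matched = [i for i, (iou, _) in enumerate(dets) if iou >= 0.5]
    # among real matches, the highest class score wins the TP slot
    tp = max(matched, key=lambda i: dets[i][1]) if matched else None

    labels = []
    for i, (iou, score) in enumerate(dets):
        if iou < 0.1:
            labels.append("FP")
        elif iou < 0.5:
            labels.append("loc_error")
        else:
            labels.append("TP" if i == tp else "duplicate")
    return labels
```

As in mAP, this classification would then be repeated while sweeping the TP threshold from 0.5 to 0.95.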
Details
Preliminaries
- energy : $E(x)$ is called the energy of $x$ when $p(x) \propto \exp(-E(x))$
- scoring rule : a function that measures, given the actually observed events, how good a predicted distribution over classes or bounding boxes (given a feature) is.
- variance network https://github.com/long8v/PTIR/issues/92
Negative Log Likelihood as a scoring rule
NLL under Multivariate Gaussian

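The equation image did not survive here; the standard multivariate Gaussian NLL, reconstructed under the assumption (as in variance networks) that $\Sigma(x_n, \theta)$ is diagonal:

$$\mathcal{L}_{\mathrm{NLL}} = \frac{1}{N}\sum_{n=1}^{N}\left[\frac{1}{2}\log\det\Sigma(x_n,\theta) + \frac{1}{2}\big(z_n-\mu(x_n,\theta)\big)^{\top}\Sigma(x_n,\theta)^{-1}\big(z_n-\mu(x_n,\theta)\big)\right] + \mathrm{const}$$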
Energy Score(ES)

- $z_n$ : ground truth bounding box
- $z_{n,i}$ : the $i^{th}$ sample drawn from $N(\mu(x_n, \theta), \sigma(x_n, \theta))$.
The energy score can be approximated via Monte Carlo sampling as follows:

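The equation image is missing; the sample-based estimator as I understand it is $\mathrm{ES} \approx \frac{1}{m}\sum_{i=1}^{m}\lVert z_{n,i}-z_n\rVert_2 - \frac{1}{2(m-1)}\sum_{i=1}^{m-1}\lVert z_{n,i}-z_{n,i+1}\rVert_2$, sketched below in stdlib Python (function name mine, diagonal $\sigma$ assumed):

```python
import math
import random

def energy_score_mc(mu, sigma, z, m=2000, seed=0):
    """Monte Carlo estimate of the energy score of a diagonal
    Gaussian N(mu, sigma^2) against a ground-truth box z.

    mu, sigma, z: length-4 lists (bbox parameters).
    """
    rng = random.Random(seed)
    dist = lambda a, b: math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    # m i.i.d. samples z_{n,i} from the predicted distribution
    samples = [[rng.gauss(mi, si) for mi, si in zip(mu, sigma)] for _ in range(m)]
    # first term: average distance from samples to the ground truth
    term1 = sum(dist(s, z) for s in samples) / m
    # second term: average distance between consecutive sample pairs
    term2 = sum(dist(samples[i], samples[i + 1]) for i in range(m - 1)) / (2 * (m - 1))
    return term1 - term2
```

The score is lower when the predicted mean sits on the ground truth than when it is far away, which is the behavior the paper exploits as a training objective.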
Direct Moment Matching

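No equation survived here either. My reconstruction of DMM (not verbatim from the paper): match the predicted mean directly to the ground truth and the predicted variance directly to the squared residual, e.g.

$$\mathcal{L}_{\mathrm{DMM}} = \frac{1}{N}\sum_{n=1}^{N}\Big[\big\lVert z_n-\mu(x_n,\theta)\big\rVert_1 + \big\lVert \sigma^2(x_n,\theta) - \big(z_n-\mu(x_n,\theta)\big)^2 \big\rVert_1\Big]$$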
Motivation

- NLL and the energy score have similar minimizers (both are minimized when the predicted distribution matches the observations).
- Away from the minimum they behave oppositely: NLL penalizes low entropy (small $\sigma$) more heavily, while ES penalizes high entropy more.
- So NLL tends to learn high entropy whether the bbox is correct or not -> but why exactly is that bad? I think I need to understand variance networks better.
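The opposite penalties can be checked numerically with a 1-D toy example (a stdlib sketch, function names mine): fix a localization error of 1.0 and vary the predicted $\sigma$, i.e. the entropy.

```python
import math
import random

def nll_gauss(mu, sigma, z):
    """1-D Gaussian negative log-likelihood."""
    return math.log(sigma) + (z - mu) ** 2 / (2 * sigma ** 2) + 0.5 * math.log(2 * math.pi)

def es_gauss_mc(mu, sigma, z, m=20000, seed=0):
    """Monte Carlo energy score of N(mu, sigma^2) against z (1-D)."""
    rng = random.Random(seed)
    s = [rng.gauss(mu, sigma) for _ in range(m)]
    t1 = sum(abs(x - z) for x in s) / m
    t2 = sum(abs(s[i] - s[i + 1]) for i in range(m - 1)) / (2 * (m - 1))
    return t1 - t2

# wrong mean (error = 1.0), increasing entropy
for sigma in (0.1, 1.0, 10.0):
    print(sigma, nll_gauss(0.0, sigma, 1.0), es_gauss_mc(0.0, sigma, 1.0))
```

With a wrong mean, NLL explodes as $\sigma \to 0$ but grows only logarithmically for large $\sigma$, while ES stays bounded for small $\sigma$ and grows roughly linearly in $\sigma$; so under NLL the "safe" strategy is to predict high entropy everywhere.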
Results
