
TL;DR
- task : instance segmentation
- problem : mask annotation for segmentation is too expensive, and weakly supervised methods reach only about 85% of fully supervised performance
- idea : use point-level annotation. Annotate the bbox first, then sample 10 random points and have the annotator label each one as background or object (a binary label).
- architecture : Mask R-CNN
- objective : bilinearly interpolate the mask prediction at the 10 annotated points, then apply cross-entropy loss against the point labels
- baseline : fully supervised Mask R-CNN
- data : ImageNet, COCO
- result : reaches about 97% of fully supervised performance on ImageNet and about 99% on COCO.
- contribution : full mask annotation takes about 79 seconds per instance, while this annotation scheme takes about 7 seconds.
- Limitation or part not understood : did not read the PointRend model part
Details
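The objective in the TL;DR (bilinear interpolation at the annotated points, then cross-entropy) can be sketched as below. This is a minimal pure-Python illustration, not the paper's implementation: the function names `bilinear_sample` and `point_bce_loss` are hypothetical, the coordinate convention (`grid[r][c]` sits at x=c, y=r) is an assumption, and interpolating logits rather than probabilities is also an assumption.

```python
import math

def bilinear_sample(grid, x, y):
    """Bilinearly interpolate a 2D grid (list of rows) at a continuous (x, y).

    Assumed convention: grid[r][c] is located at x=c, y=r.
    """
    h, w = len(grid), len(grid[0])
    # Clamp so that points on the border still have valid neighbours.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    dx, dy = x - x0, y - y0
    top = grid[y0][x0] * (1 - dx) + grid[y0][x1] * dx
    bottom = grid[y1][x0] * (1 - dx) + grid[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

def point_bce_loss(logits, points, labels):
    """Mean binary cross-entropy over the annotated points only.

    logits: 2D list of mask logits
    points: list of (x, y) annotated locations
    labels: list of 1 (object) or 0 (background), one per point
    """
    total = 0.0
    for (x, y), t in zip(points, labels):
        z = bilinear_sample(logits, x, y)
        # Numerically stable BCE with logits:
        # max(z, 0) - z * t + log(1 + exp(-|z|))
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(points)

# Toy usage: a 2x2 logit map and two labelled points.
logits = [[-2.0, 2.0], [-2.0, 2.0]]
loss = point_bce_loss(logits, [(0.0, 0.0), (1.0, 1.0)], [0, 1])
```

Only the labelled points contribute to the loss; all other pixels of the predicted mask are left unsupervised, which is what lets 10 points stand in for a full mask.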


augmentation: use standard image augmentations, plus point subsampling: at each training epoch, randomly sample 5 of the 10 annotated points and use only those.
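The point-subsampling augmentation could look roughly like this. It is a sketch under assumptions: `sample_points_for_epoch` is a hypothetical helper name, and sampling without replacement is assumed since the notes do not say otherwise.

```python
import random

def sample_points_for_epoch(points, labels, k=5, rng=random):
    """Pick k annotated points (with their labels) without replacement."""
    idx = rng.sample(range(len(points)), k)
    return [points[i] for i in idx], [labels[i] for i in idx]

# Toy usage: 10 annotated points, 5 kept for this epoch.
rng = random.Random(0)  # seeded only to make the example reproducible
all_points = [(float(i), float(i)) for i in range(10)]
all_labels = [i % 2 for i in range(10)]
epoch_points, epoch_labels = sample_points_for_epoch(all_points, all_labels, k=5, rng=rng)
```

Calling the helper again at the next epoch gives a different subset, so the model sees varying supervision for the same instance.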
Difference between dice loss and IoU: https://stackoverflow.com/questions/60268728/why-dice-coefficient-and-not-iou-for-segmentation-tasks

In practice, dice tends to be used for segmentation and IoU for object detection. There does not seem to be a particular reason for this split.
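One way to see how close the two are: for a single pair of masks, dice can be computed directly from IoU as dice = 2 * IoU / (1 + IoU), so each is a monotonic function of the other. A small check on sets of pixel indices follows; the function names are illustrative, and treating two empty masks as a perfect match is an assumption.

```python
def iou(a, b):
    """Intersection over union of two sets of pixel indices."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0

def dice(a, b):
    """Dice coefficient of two sets of pixel indices."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 1.0

pred = {1, 2, 3}
target = {2, 3, 4}
j = iou(pred, target)
d = dice(pred, target)
# The two scores are tied by d = 2j / (1 + j).
assert abs(d - 2 * j / (1 + j)) < 1e-12
```

Because of this relation, the two scores always agree on which of two predictions is better for a given target. They can still differ once scores are averaged over many instances, since dice penalizes a partial overlap less than IoU does.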