image

paper

TL;DR

  • task : long-tail visual recognition
  • problem : long-tail recognition is the setting where the number of training samples per class is imbalanced while the test set is class-balanced.
  • idea : the classifier's margin grows larger for classes with more samples. To adjust the margin, each class logit is multiplied by beta and shifted by gamma. The model is first trained as usual on the full imbalanced data, and then only beta and gamma are trained again.
  • architecture : ResNet32, ResNeXt50, ResNet152, ResNet50
  • objective : cross entropy loss + loss re-weighting
  • baseline : softmax, data re-sampling, loss function engineering, decision boundary adjustment …
  • data : CIFAR-LT, ImageNet-LT, Places-LT, iNaturalist-LT
  • result : SOTA!
  • contribution : SOTA with a very simple implementation!
  • limitation or unclear parts : not sure whether performance also improves when the test set has the same class distribution as the training set!
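The idea bullet above can be sketched as a per-class affine transform of the logits. This is only a minimal illustration: the function names, the logits, and the beta/gamma values are hypothetical, and the paper's exact parameterization may differ from the plain "multiply by beta, add gamma" form used here.

```python
def calibrate(logits, beta, gamma):
    """Apply a per-class scale (beta) and shift (gamma) to fixed logits."""
    return [b * z + g for z, b, g in zip(logits, beta, gamma)]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


# Hypothetical logits for one tail-class sample: class 0 is a head class
# with an inflated logit, class 2 is the true (tail) class.
logits = [2.0, 0.5, 1.8]
beta = [0.8, 1.0, 1.2]   # illustrative values, not learned
gamma = [0.0, 0.0, 0.1]  # illustrative values, not learned

calibrated = calibrate(logits, beta, gamma)
print(argmax(logits), argmax(calibrated))
```

With these made-up numbers the uncalibrated prediction is the head class and the calibrated prediction is the tail class, which is the effect the method is after.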

Details

  • data re-sampling : undersample the head classes and oversample the tail classes.
  • loss function engineering : re-weight the loss so that it is applied more evenly across classes, or adjust the logits.
  • decision boundary adjustment : some analyses find that training on the original data distribution yields good representations, but that the classifier is the performance bottleneck. These methods train as usual and then adjust the classifier, for example with techniques such as Platt scaling.
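The first two baseline families above can be illustrated with inverse class frequency, a common (but not the only) choice. The sketch below is an assumption for illustration, not the weighting used in the paper; `class_weights`, `sample_weights`, and the toy labels are hypothetical.

```python
from collections import Counter


def class_weights(labels):
    """Inverse-frequency loss weight per class, scaled to average 1 over classes."""
    counts = Counter(labels)
    raw = {c: 1.0 / n for c, n in counts.items()}
    scale = len(raw) / sum(raw.values())
    return {c: w * scale for c, w in raw.items()}


def sample_weights(labels):
    """Per-sample sampling probability so that each class is drawn equally often."""
    counts = Counter(labels)
    num_classes = len(counts)
    return [1.0 / (num_classes * counts[y]) for y in labels]


# Hypothetical toy labels: class 0 is the head class, class 1 the tail class.
labels = [0] * 8 + [1] * 2
w = class_weights(labels)   # used for loss re-weighting
p = sample_weights(labels)  # used for re-sampling
print(w, sum(p[:8]), sum(p[8:]))
```

The probabilities in `p` could be passed to `random.choices(range(len(labels)), weights=p, k=...)` to draw a class-balanced batch of indices, which oversamples the tail and undersamples the head.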

Paper details

  • margin image

  • the margin can be expressed as below image

  • the logit can be written as a function of the margin -> as n grows, the margin grows and so does the logit image

  • pseudo-code of the proposed method (MARC) image

  • loss re-weighting is reportedly applied as well image

  • pseudo-code of the full training procedure image
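The second stage described in these notes (keep the trained model fixed and learn only beta and gamma, with a re-weighted loss) can be sketched in plain Python. This is a rough illustration under stated assumptions, not the paper's pseudo-code: the calibration form `beta * z + gamma`, the inverse-frequency class weights, the toy logits, and all function names are hypothetical.

```python
import math


def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def calibrate(z, beta, gamma):
    """Per-class scale and shift of one logit vector."""
    return [b * v + g for v, b, g in zip(z, beta, gamma)]


def fit_calibration(logits, labels, class_weight, lr=0.1, epochs=200):
    """Learn beta and gamma on frozen logits with class-weighted cross entropy.

    Only beta and gamma are updated (plain gradient descent); the logits
    stand in for the output of an already trained, untouched model.
    """
    k = len(logits[0])
    beta, gamma = [1.0] * k, [0.0] * k
    total = sum(class_weight[y] for y in labels)
    for _ in range(epochs):
        g_beta, g_gamma = [0.0] * k, [0.0] * k
        for z, y in zip(logits, labels):
            p = softmax(calibrate(z, beta, gamma))
            for j in range(k):
                # Gradient w.r.t. calibrated logit j: p_j - 1[j == y], weighted.
                d = class_weight[y] * (p[j] - (1.0 if j == y else 0.0)) / total
                g_beta[j] += d * z[j]
                g_gamma[j] += d
        beta = [b - lr * g for b, g in zip(beta, g_beta)]
        gamma = [c - lr * g for c, g in zip(gamma, g_gamma)]
    return beta, gamma


def predict(z, beta, gamma):
    c = calibrate(z, beta, gamma)
    return max(range(len(c)), key=lambda j: c[j])


# Hypothetical frozen stage-1 logits: 4 head samples (class 0), 2 tail (class 1).
logits = [[2.0, 0.0]] * 4 + [[1.0, 0.5]] * 2
labels = [0] * 4 + [1] * 2
class_weight = [1 / 4, 1 / 2]  # inverse class frequency (an assumed weighting)

before = [predict(z, [1.0, 1.0], [0.0, 0.0]) for z in logits]
beta, gamma = fit_calibration(logits, labels, class_weight)
after = [predict(z, beta, gamma) for z in logits]
print(before, after)
```

In this toy setup the uncalibrated logits assign every sample to the head class, while the calibrated ones recover the two tail samples without changing the head predictions. In practice the logits would come from a frozen network and the update would be done by an autograd framework rather than by hand.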

Result

image