image

paper

TL;DR

  • I read this because.. : multi-task learning with uncertainty!
  • task : semantic segmentation, instance segmentation, pixel-wise metric depth
  • problem : previous multi-task approaches use a weighted sum of the per-task losses, and performance is very sensitive to the choice of those weights.
  • idea : assume the output y is Gaussian and estimate it by MLE; then $\sigma$ gives each task's inherent noise and its relative weight. In other words, optimize the model weights $W$ and the task-dependent $\sigma_{task}$ jointly.
  • architecture : DeepLab V3 (ResNet101 -> Atrous Spatial Pyramid Pooling) + one decoder for each of the 3 tasks
  • objective : CE(semantic segmentation), L1(instance segmentation, depth estimation)
  • baseline : task specific model, weighted multi-task model
  • data : CityScapes benchmark; the depth images use pseudo-labels produced by SGM
  • evaluation : IoU, Instance Mean Error, Inverse Depth Mean Error
  • result : the model trained on all 3 tasks is SOTA on segmentation and depth prediction. For instance segmentation, the model trained on 2 tasks is SOTA
  • contribution : they say this is the first model trained on these 3 tasks
  • limitation / things I cannot understand : roughly speaking, in the end it adds learnable weights plus a regularization term so that they don't swing wildly, but since it can be interpreted from an MLE point of view it looks elegant

Details

motivation

image

Performance swings wildly depending on the multi-task loss weights.

Architecture

image

Homoscedastic uncertainty as task-dependent uncertainty

  • Epistemic uncertainty
    • uncertainty due to the model; uncertainty caused by a lack of training data
  • Aleatoric uncertainty
    • uncertainty due to the data; uncertainty about information the data cannot express.
      • Data-dependent, Heteroscedastic
        • uncertainty determined by the input data and the model output
      • Task-dependent, Homoscedastic
        • uncertainty that does not depend on the input data

This doesn't really click for me.. Anyway, this paper measures the last one, task-dependent uncertainty.

Multi-task likelihoods

Let the neural network output be $f^W(x)$. For regression, we can assume the output follows a Gaussian. image

Here $\sigma$ is a scalar noise parameter.

For classification, a softmax turns the output into a probability distribution. image

For multiple model outputs, the likelihood factorizes and can be written like this. image

Under maximum likelihood estimation, the log likelihood can be written like this. image

For model outputs that follow two Gaussians, the log likelihood can be written as below. image

This can now be viewed as a minimisation problem over $\mathcal{L}(W, \sigma_1, \sigma_2)$. image
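
The equation images above did not survive, so here is a reconstruction of the regression steps as I understand them. The notation ($y_1$, $y_2$, $\mathcal{L}_1$, $\mathcal{L}_2$) is assumed to follow the paper, and both outputs are written against the same $f^W(x)$ for brevity.

$$
p(y \mid f^W(x)) = \mathcal{N}\left(f^W(x), \sigma^2\right)
$$

$$
p(y_1, y_2 \mid f^W(x)) = p(y_1 \mid f^W(x)) \, p(y_2 \mid f^W(x))
$$

$$
\log p(y \mid f^W(x)) \propto -\frac{1}{2\sigma^2} \lVert y - f^W(x) \rVert^2 - \log \sigma
$$

$$
\mathcal{L}(W, \sigma_1, \sigma_2) = -\log p(y_1, y_2 \mid f^W(x))
\propto \frac{1}{2\sigma_1^2} \mathcal{L}_1(W) + \frac{1}{2\sigma_2^2} \mathcal{L}_2(W) + \log \sigma_1 \sigma_2
$$

where $\mathcal{L}_i(W) = \lVert y_i - f^W(x) \rVert^2$ is the ordinary squared-error loss of task $i$.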

With this, $\sigma_1$ and $\sigma_2$ act as the relative weights of losses 1 and 2 (a larger $\sigma$ means a smaller weight), and the last term, $\log\sigma_1\sigma_2$, acts as a regularization term that keeps $\sigma$ from growing without bound.
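
To make the weighting concrete, here is a minimal sketch assuming the two-task regression objective has the form $\frac{1}{2\sigma_1^2}\mathcal{L}_1 + \frac{1}{2\sigma_2^2}\mathcal{L}_2 + \log\sigma_1\sigma_2$. The function names and the loss values are made up for illustration. With the task losses held fixed, the best $\sigma$ satisfies $\sigma^2 = \mathcal{L}$, so the task with the larger loss ends up with the smaller weight.

```python
import math


def joint_loss(l1, l2, sigma1, sigma2):
    """L1/(2*sigma1^2) + L2/(2*sigma2^2) + log(sigma1*sigma2)."""
    weighted = l1 / (2 * sigma1**2) + l2 / (2 * sigma2**2)
    regularizer = math.log(sigma1 * sigma2)
    return weighted + regularizer


def best_sigma(task_loss):
    # d/dsigma [L/(2*sigma^2) + log(sigma)] = -L/sigma^3 + 1/sigma = 0
    # gives sigma^2 = L for a fixed task loss L.
    return math.sqrt(task_loss)


# Hypothetical per-task losses: task 1 is much noisier than task 2.
l1, l2 = 4.0, 0.25
s1, s2 = best_sigma(l1), best_sigma(l2)

# Effective weights 1/(2*sigma^2) that multiply each task loss.
w1, w2 = 1 / (2 * s1**2), 1 / (2 * s2**2)
print(w1, w2)  # 0.125 2.0

# Moving sigma1 away from its optimum makes the joint loss worse.
print(joint_loss(l1, l2, s1, s2) < joint_loss(l1, l2, 3.0, s2))  # True
```

Without the $\log$ term the objective could be driven to zero simply by letting both $\sigma$ values grow, which is why it reads as a regularizer.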

For classification, let's extend this to a softmax scaled by a scalar $\sigma$. image

Then the log likelihood takes the form below, image

and this again ends up as learning a joint loss. image
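
As with the regression case, the equation images are missing, so this is a reconstruction of the classification steps under the same assumed notation, with $y_1$ a regression output and $y_2$ a classification output. The final line relies on an approximation that, as far as I recall, the paper makes to simplify the softmax normalizer, hence the $\approx$.

$$
p(y \mid f^W(x), \sigma) = \mathrm{Softmax}\left(\frac{1}{\sigma^2} f^W(x)\right)
$$

$$
\log p(y = c \mid f^W(x), \sigma) = \frac{1}{\sigma^2} f_c^W(x) - \log \sum_{c'} \exp\left(\frac{1}{\sigma^2} f_{c'}^W(x)\right)
$$

$$
\mathcal{L}(W, \sigma_1, \sigma_2) \approx \frac{1}{2\sigma_1^2} \mathcal{L}_1(W) + \frac{1}{\sigma_2^2} \mathcal{L}_2(W) + \log \sigma_1 + \log \sigma_2
$$

where $\mathcal{L}_1(W) = \lVert y_1 - f^W(x) \rVert^2$ and $\mathcal{L}_2(W) = -\log \mathrm{Softmax}(y_2, f^W(x))$ is the usual cross-entropy on the unscaled logits.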

Here too, $\sigma_1$ and $\sigma_2$ can be seen as the relative weights of the two losses.
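
A rough sketch of how the learnable weights could behave under gradient descent, assuming the joint regression + classification loss $\frac{1}{2\sigma_1^2}\mathcal{L}_1 + \frac{1}{\sigma_2^2}\mathcal{L}_2 + \log\sigma_1 + \log\sigma_2$. Two things here are my own assumptions and not taken from the notes above: the parameterisation $s = \log\sigma^2$ (chosen for numerical stability) and the fixed, made-up task losses. In real training the losses change with $W$ at every step, and all function names are hypothetical.

```python
import math


def combined_loss(reg_loss, cls_loss, s_reg, s_cls):
    """Joint loss with s = log(sigma^2) for each task."""
    # 1/(2*sigma^2) * L_reg + log(sigma), where log(sigma) = s/2
    reg_term = 0.5 * math.exp(-s_reg) * reg_loss + 0.5 * s_reg
    # 1/sigma^2 * L_cls + log(sigma)
    cls_term = math.exp(-s_cls) * cls_loss + 0.5 * s_cls
    return reg_term + cls_term


def grad_log_vars(reg_loss, cls_loss, s_reg, s_cls):
    """Analytic gradient of combined_loss w.r.t. (s_reg, s_cls)."""
    g_reg = -0.5 * math.exp(-s_reg) * reg_loss + 0.5
    g_cls = -math.exp(-s_cls) * cls_loss + 0.5
    return g_reg, g_cls


def fit_log_vars(reg_loss, cls_loss, lr=0.1, steps=2000):
    """Plain gradient descent on the log variances, task losses held fixed."""
    s_reg, s_cls = 0.0, 0.0
    for _ in range(steps):
        g_reg, g_cls = grad_log_vars(reg_loss, cls_loss, s_reg, s_cls)
        s_reg -= lr * g_reg
        s_cls -= lr * g_cls
    return s_reg, s_cls


# Hypothetical fixed task losses.
s_reg, s_cls = fit_log_vars(reg_loss=4.0, cls_loss=1.5)

# Effective multipliers on each task loss after fitting.
reg_weight = 0.5 * math.exp(-s_reg)
cls_weight = math.exp(-s_cls)
print(round(reg_weight, 4), round(cls_weight, 4))
```

In a deep learning framework the two log variances would simply be extra trainable parameters optimized together with $W$, so no hand-written gradient would be needed.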

Result

image image