image

paper, page

TL;DR

  • I read this because.. : I was curious about dataset filtering / evaluation
  • task : CLIP
  • problem : open large image - text set
  • idea : common crawl + study
  • input/output : image / text -> similarity score
  • architecture : same as CLIP
  • objective : contrastive loss
  • baseline : LAION-2B
  • data : CommonPool 12.8B -> (filtered) DataComp 1.4B
  • evaluation : zero-shot ImageNet / ImageNet-A / .. (described in detail below) + retrieval
  • result : higher performance than LAION-2B
  • contribution : Releases the dataset. Ablates a variety of filtering techniques. Promotes data-centric research through a competition.
  • etc. :
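To make the input/output and objective bullets concrete, here is a minimal NumPy sketch of a CLIP-style symmetric contrastive loss over a batch of paired embeddings. This is not the paper's training code: the function names, the random toy embeddings, and the temperature of 0.07 are all assumptions for illustration.

```python
import numpy as np

def l2_normalize(x):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    # temperature=0.07 is an assumed value, not taken from these notes.
    sim = l2_normalize(image_emb) @ l2_normalize(text_emb).T  # (N, N) similarity scores
    logits = sim / temperature
    idx = np.arange(len(logits))
    # Row i should pick text i (image -> text); column i should pick image i (text -> image).
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()
    return sim, 0.5 * (loss_i2t + loss_t2i)

# Toy batch: random "image" embeddings and slightly perturbed "text" embeddings.
rng = np.random.default_rng(0)
image_emb = rng.normal(size=(4, 8))
text_emb = image_emb + 0.1 * rng.normal(size=(4, 8))
sim, loss = clip_contrastive_loss(image_emb, text_emb)
print(sim.shape, float(loss))
```

The diagonal of `sim` holds the scores of the matching image/text pairs. The loss is low when those diagonal scores dominate their row and column.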

Details

Evaluation

image
  • zs-image classification
  • 22 datasets evaluated in the original CLIP paper
  • 6 distribution-shifted ImageNet variants : ImageNet-Sketch, ImageNet-V2, ImageNet-A, ImageNet-O, ImageNet-R, ObjectNet
  • 13 VTAB datasets : https://arxiv.org/pdf/1910.04867.pdf
  • 3 WILDS datasets (WILDS: A benchmark of in-the-wild distribution shifts) : benchmark of 10 datasets reflecting a diverse range of distribution shifts that naturally arise in real-world applications, such as shifts across hospitals for tumor identification; across camera traps for wildlife monitoring; and across time and location in satellite imaging and poverty mapping. e.g. iWildCam2020-wilds (wildlife..), Camelyon17-wilds (cell tissue..), RxRx1-wilds (RNA…)
  • WinoGAViL : commonsense association task https://paperswithcode.com/dataset/winogavil (even after looking at it I still don't get what this is)
  • Finally, two fairness datasets : FairFace, UTKFace -> classification that predicts race
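Most of the benchmarks above are scored as zero-shot classification, so here is a rough sketch of how that works: embed one text prompt per class, embed the images, and predict the class whose text embedding is most similar. The embeddings below are hand-made toy vectors, not real CLIP outputs, and `zero_shot_predict` is a hypothetical helper name.

```python
import numpy as np

def zero_shot_predict(image_emb, class_text_emb):
    # Cosine similarity between every image and every class prompt embedding.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    class_text_emb = class_text_emb / np.linalg.norm(class_text_emb, axis=1, keepdims=True)
    sim = image_emb @ class_text_emb.T  # (num_images, num_classes)
    return sim.argmax(axis=1)           # index of the best-matching class prompt

# Toy setup (assumption): 3 classes in a 4-dim embedding space.
class_text_emb = np.eye(3, 4)               # stand-in for embedded class prompts
labels = np.array([0, 2, 1, 1])             # ground-truth class per image
image_emb = class_text_emb[labels] + 0.05   # images sit near their class prompt

pred = zero_shot_predict(image_emb, class_text_emb)
accuracy = float((pred == labels).mean())
print(pred, accuracy)  # -> [0 2 1 1] 1.0
```

No classifier is trained on the target dataset. Only the class names (turned into prompts) are needed, which is why the same model can be run across this many datasets.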

Some findings

  • High correlation between zs retrieval and linear probing [image]

  • High correlation between performance when training on a small dataset and performance when training on a large dataset [image]

  • High correlation between ImageNet and the other datasets [image]

The datasets with low correlation were the ones where performance was close to random guessing.

All of them are hopelessly obscure.. the only ones that look usable here are imagenet-a and country211?! And, unsurprisingly, the OCR-type datasets (rendered sst2, svhn) showed no correlation either.

c.f. The ranking of data filtering methods barely changed across hparams such as batch size [image]
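One way to quantify "the ranking barely changed" is a Spearman rank correlation between the scores of the same filtering methods under two settings. A minimal sketch follows. The four scores per setting are made-up numbers for illustration, not results from the paper, and the two settings are hypothetical batch sizes.

```python
import numpy as np

def ranks(scores):
    # Rank of each entry (0 = lowest score); assumes no ties.
    return np.argsort(np.argsort(scores))

def spearman(a, b):
    # Spearman correlation = Pearson correlation computed on the ranks.
    return float(np.corrcoef(ranks(a), ranks(b))[0, 1])

# Made-up accuracies of four hypothetical filtering methods under two batch sizes.
scores_small_bs = np.array([0.10, 0.25, 0.30, 0.22])
scores_large_bs = np.array([0.12, 0.28, 0.27, 0.24])

rho = spearman(scores_small_bs, scores_large_bs)
print(round(rho, 3))  # -> 0.8
```

A value near 1 means the methods keep roughly the same order across settings. In this toy example only the top two methods swap places.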