image

paper

TL;DR

  • I read this because.. : it comes up in https://github.com/long8v/PTIR/issues/82 and I often see it mentioned in other papers. Also, two-word titles are cool.
  • task : tasks whose input or output is a set, i.e. order does not matter. 1) estimating the parameters of a population distribution 2) summing a list of numbers 3) point cloud classification 4) finding words close to the concept / cluster of a given word set 5) finding all tags related to an image
  • problem : find out what properties a deep network must have in order to solve permutation invariant tasks.
  • architecture : $f(\mathbf{x})=\sigma(\lambda I \mathbf{x} + \gamma\,\text{maxpool}(\mathbf{x})\mathbf{1})$
  • result : a single architecture performs on par with or better than models specialized for each task
  • contribution : theoretical analysis of the properties of set inputs and outputs, with performance checked on a variety of applications
  • limitation / things I cannot understand :

Details

Permutation Invariance and Equivariance

Problem Definition

The function f must be permutation invariant, i.e. its output must not depend on the order of the set's elements.

image
  • $\pi$ : permutation
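
For reference, the two properties in this section's title are usually written as below. This is the standard textbook form, stated here as an assumption about what the figure shows; $M$ denotes the number of elements.

$$
\text{invariance:}\quad f(\{x_1,\ldots,x_M\}) = f(\{x_{\pi(1)},\ldots,x_{\pi(M)}\}) \quad \text{for every permutation } \pi
$$

$$
\text{equivariance:}\quad \mathbf{f}([x_{\pi(1)},\ldots,x_{\pi(M)}]) = [f_{\pi(1)}(\mathbf{x}),\ldots,f_{\pi(M)}(\mathbf{x})]
$$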

Structure

A function f(X) that takes a set $X$ is permutation invariant when it can be decomposed into the form below. image
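
Written out with the $\phi$ and $\rho$ used later in these notes, the decomposition is commonly stated as follows (a reconstruction from the surrounding text, not a copy of the figure):

$$
f(X) = \rho\left(\sum_{x \in X} \phi(x)\right)
$$

Here $\phi$ maps each element to a representation and $\rho$ maps the pooled sum to the output. Since the sum ignores order, so does $f$.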

Consider a function $f_\theta : \mathbb{R}^M \rightarrow \mathbb{R}^M$.

  • $\sigma$ : nonlinearity function
  • With $\theta \in \mathbb{R}^{M\times M}$ and $f_\theta(\mathbf{x})=\sigma(\theta\mathbf{x})$, $f_\theta$ is permutation equivariant when all diagonal elements of $\theta$ are equal and all off-diagonal elements are tied together. image

Looking at the formula, it seems enough that all off-diagonal entries share one value and all diagonal entries share another: `lam * torch.eye(5) + gamma * torch.ones(5, 5)`

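Plugging in $\mathbf{x}$ gives $f(\mathbf{x})=\sigma(\lambda I\mathbf{x}+\gamma(\mathbf{1}\mathbf{1}^T)\mathbf{x})$, i.e. a nonlinearity applied to the input $I\mathbf{x}$ plus the summation of $\mathbf{x}$. The summation term does not depend on the order of the elements, so permuting the input only permutes the output (permutation equivariant).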
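
A minimal sketch of this layer in plain Python, to check the equivariance claim numerically. The function names, the choice of `tanh` for $\sigma$, and the values of `lam` and `gamma` are assumptions made for illustration only.

```python
import math
from itertools import permutations


def equivariant_layer(x, lam, gamma):
    """sigma(lam * I x + gamma * (1 1^T) x), with sigma = tanh (an assumption)."""
    total = sum(x)  # (1 1^T) x places sum(x) in every coordinate
    return [math.tanh(lam * xi + gamma * total) for xi in x]


def is_equivariant(x, lam, gamma, tol=1e-12):
    """Check f(pi(x)) == pi(f(x)) for every permutation pi of the indices."""
    y = equivariant_layer(x, lam, gamma)
    for perm in permutations(range(len(x))):
        x_perm = [x[i] for i in perm]
        y_perm = [y[i] for i in perm]
        out = equivariant_layer(x_perm, lam, gamma)
        if any(abs(a - b) > tol for a, b in zip(out, y_perm)):
            return False
    return True


x = [0.5, -1.0, 2.0, 0.1]
lam, gamma = 0.7, -0.3  # arbitrary illustrative values
y = equivariant_layer(x, lam, gamma)
print(is_equivariant(x, lam, gamma))
```

The tolerance is there because the floating-point sum can differ in the last bits when the elements are added in a different order.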

Deep Sets

The pieces of the decomposition above just need to be replaced with universal approximators, i.e. $\phi$ and $\rho$ are each approximated (for example by polynomials or, in practice, neural networks). Concretely, 1) each instance $x_m$ is turned into a representation $\phi(x_m)$, and 2) those representations are summed and the result is processed by the $\rho$ network. If some meta-information $z$ is available, the networks above become conditional mappings such as $\phi(x_m|z)$.
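
The two steps can be sketched as follows. This is a toy under stated assumptions, not the paper's implementation: `make_linear`, the layer sizes, and the random weights (standing in for trained parameters) are all hypothetical.

```python
import math
import random


def make_linear(n_in, n_out, rng):
    """Build a linear map with random weights (a stand-in for trained ones)."""
    w = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    b = [rng.uniform(-1, 1) for _ in range(n_out)]

    def apply(v):
        return [sum(wi * vi for wi, vi in zip(row, v)) + bi
                for row, bi in zip(w, b)]

    return apply


rng = random.Random(0)
phi_layer = make_linear(2, 8, rng)  # each element in R^2 -> R^8
rho_layer = make_linear(8, 1, rng)  # pooled representation -> scalar


def phi(x):
    """Step 1: per-element representation."""
    return [math.tanh(h) for h in phi_layer(x)]


def rho(pooled):
    """Step 2 (second half): process the pooled representation."""
    return rho_layer(pooled)[0]


def deep_set(points):
    embedded = [phi(p) for p in points]
    pooled = [sum(col) for col in zip(*embedded)]  # sum over the set
    return rho(pooled)


points = [[0.1, 0.2], [1.0, -0.5], [-0.3, 0.8]]
shuffled = [points[2], points[0], points[1]]
print(deep_set(points), deep_set(shuffled))
```

The same `deep_set` accepts sets of any size, because only the pooled vector reaches `rho`.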

Equivariant model image

Swapping the summation for a different operation gives the form below,

image

because max-pool, like sum, is commutative. In practice, the max operation performed better than sum.
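
A small sketch of the max-pool variant, following the form $\sigma(\lambda I \mathbf{x} + \gamma\,\text{maxpool}(\mathbf{x})\mathbf{1})$ from the TL;DR. The function name, `tanh` as $\sigma$, and the parameter values are assumptions for illustration.

```python
import math


def equivariant_layer_max(x, lam, gamma):
    """sigma(lam * x_i + gamma * max(x)) per element, with sigma = tanh."""
    pooled = max(x)  # order-independent, like sum
    return [math.tanh(lam * xi + gamma * pooled) for xi in x]


x = [0.5, -1.0, 2.0]
y = equivariant_layer_max(x, lam=1.0, gamma=-1.0)

# Permute the input and compare against the permuted output.
x_perm = [x[2], x[0], x[1]]
y_perm = equivariant_layer_max(x_perm, lam=1.0, gamma=-1.0)
print(y_perm == [y[2], y[0], y[1]])
```

With `gamma = -1.0` the layer subtracts the maximum from every element before the nonlinearity, so the largest element always maps to `tanh(0)`.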

Applications and Empirical Results

  • Show random samples drawn from a normal distribution and estimate population statistics image

  • Show a list of numbers and ask for their sum, given as text / MNIST images image

Training shows at most 10 numbers, while testing shows up to 100. Deep Sets generalizes well, unlike the RNN family.
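
One way to see why set size is not an obstacle: both the statistics task and the sum task fit the $\rho(\sum\phi)$ form with hand-picked functions. The sketch below uses fixed, hypothetical `phi` and `rho` rather than learned networks, so it only illustrates the structure, not the paper's experiment.

```python
def phi(x):
    """Per-element features: a count, the value, and its square."""
    return (1.0, x, x * x)


def pool(xs):
    """Sum the features over the set; the order of xs does not matter."""
    return tuple(sum(col) for col in zip(*(phi(x) for x in xs)))


def rho(pooled):
    """Read out sum, mean and (population) variance from the pooled features."""
    n, s1, s2 = pooled
    mean = s1 / n
    var = s2 / n - mean * mean
    return {"sum": s1, "mean": mean, "var": var}


small = [float(d) for d in range(10)]        # 10 elements, as in training
large = [float(d % 10) for d in range(100)]  # 100 elements, as in testing

print(rho(pool(small)))
print(rho(pool(large)))
```

Nothing in `phi` or `rho` depends on the number of elements, which is the structural reason a model of this shape can be applied to longer inputs than it was trained on.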

  • point cloud classification image

Points measured by LiDAR have no particular order.

  • text set expansion: given cheetah and tiger, the task is to pick puma, which shares a similar concept. unsupervised image

  • image tagging: attach all text tags that apply to a given image. During training, a few tags are given and the model is asked to predict the remaining ones; at test time only the image is given and the model predicts the tags. There is one network that encodes each element (image and tags) and one network that computes the score of the set from the sum of those elements. -> So did they compute the score of every possible set combination and pick the best? Not sure. image

  • anomaly detection: CelebA images come with tags, so images are grouped by tag and a single image is drawn from a different group. The model takes the sequence of images, and a final softmax layer predicts which position holds the out-of-place image. Deep Sets got 70% of the test set right, while the baseline using an FCN performed at random-guess level. image

  • Follow-up work? http://proceedings.mlr.press/v97/lee19d/lee19d.pdf uses an attention operation instead of pooling!