image paper

Problem : Replacing the Local Self-Attention (LSA) in Swin Transformer with Depthwise-Conv (DeConv) or the Decoupled Dynamic Filter (DDF) gave better performance.

Solution : DeConv, DDF, and LSA are all written in attention form and compared through an ablation study. The study finds that increasing the number of heads and using the sliding approach are what matter for performance. To exploit this, the paper proposes ghost-head and Hadamard attention, which is more efficient than dot-product attention.

Result : With a parameter count similar to LSA but higher FLOPs, it improves Swin Transformer's performance on the classification task.

Thoughts : A neighboring window (= sliding window) performs better than a local window. As I felt with the previous paper, the more CNN methodology gets applied, the better the results become.

details : paper summary
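To make the "written in attention form" idea from the Solution concrete, here is a minimal 1D NumPy sketch. It is not the paper's code: the function names (`neighbors`, `depthwise_conv_as_attention`, `sliding_dot_product_attention`) are hypothetical, the input is a 1D sequence instead of an image, and ghost-head and Hadamard attention are not reproduced because the note above does not give their formulas. It only shows that a depthwise conv and a sliding-window LSA are both weighted sums over the same neighborhood, differing in where the weights come from and how many heads there are.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def neighbors(x, k):
    """Gather the k neighbors around each position of an (n, c) sequence.

    Returns an (n, k, c) array; positions outside the sequence are zero padded.
    Assumes k is odd.
    """
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    return np.stack([xp[i:i + len(x)] for i in range(k)], axis=1)

def depthwise_conv_as_attention(x, kernel):
    """Depthwise conv seen as attention with static weights.

    kernel has shape (k, c): the weights depend only on the relative
    position, and every channel has its own weights (one head per channel).
    """
    v = neighbors(x, kernel.shape[0])        # (n, k, c)
    return (kernel[None] * v).sum(axis=1)    # (n, c)

def sliding_dot_product_attention(q, key, v, k):
    """Single-head sliding-window attention over the same neighborhood.

    The weights are computed from the input (softmax of q . k) and shared by
    all channels. Simplification: zero-padded positions are not masked out.
    """
    kn = neighbors(key, k)                   # (n, k, c)
    vn = neighbors(v, k)                     # (n, k, c)
    scores = np.einsum("nc,nkc->nk", q, kn) / np.sqrt(q.shape[1])
    w = softmax(scores, axis=1)              # (n, k)
    return np.einsum("nk,nkc->nc", w, vn)    # (n, c)
```

With all-zero queries the attention weights become uniform, so the second function reduces to a box-filter depthwise conv. That is one way to see that the two operators share the same sliding aggregation and differ only in how the weights are produced.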