image

paper

idea : For multi-task learning, build a multi-gate MoE (MMoE) that learns to model the relations between tasks on its own, without those relations being specified explicitly.

image

Typical multi-task learning uses a shared network (shared bottom) with a separate FCN stacked on top for each task. This paper combines that setup with the MoE idea, so that a group of experts takes the place of the shared bottom. The original MoE has a single gating network, whereas MMoE creates a separate gating network for each task k.
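As a compact way to compare the three structures described above, they can be written as follows. The symbols are assumptions introduced for this note: $f$ is the shared bottom, $f_i$ is the $i$-th of $n$ experts, $h^k$ is the tower (FCN) for task $k$, and $g$ / $g^k$ are gating networks whose outputs sum to 1 over the experts.

$$
\begin{aligned}
\text{shared bottom:}\quad & y_k = h^k\big(f(x)\big) \\
\text{MoE (one gate):}\quad & y = \sum_{i=1}^{n} g(x)_i \, f_i(x) \\
\text{MMoE (one gate per task):}\quad & y_k = h^k\big(f^k(x)\big), \qquad f^k(x) = \sum_{i=1}^{n} g^k(x)_i \, f_i(x)
\end{aligned}
$$

The only difference between the last two lines is the task index $k$ on the gate: every task mixes the same experts, but with its own weights.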

image

Each gating network is a simple classifier whose input_dim is the feature dimension and whose output_dim is num_experts.
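The following is a minimal NumPy sketch of the forward pass, not the paper's implementation. It assumes each gate is a single linear layer followed by a softmax, each expert is one ReLU layer, and each task tower is one linear layer. The names `mmoe_forward`, `expert_w`, `gate_w`, and `tower_w` are hypothetical, and the weights are random, so this shows only the shapes and the mixing step.

```python
import numpy as np

def softmax(z, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def mmoe_forward(x, expert_w, gate_w, tower_w):
    """Hypothetical MMoE forward pass.

    x:        (batch, input_dim)
    expert_w: (num_experts, input_dim, expert_dim)
    gate_w:   (num_tasks, input_dim, num_experts)
    tower_w:  (num_tasks, expert_dim, 1)
    """
    # All experts see the same input; one ReLU layer each (assumption).
    expert_out = np.maximum(0.0, np.einsum("bi,eid->bed", x, expert_w))

    outputs, gates = [], []
    for k in range(gate_w.shape[0]):
        # Task-specific gate: input_dim -> num_experts, then softmax.
        g = softmax(x @ gate_w[k])                      # (batch, num_experts)
        # Weighted sum of expert outputs for task k.
        mixed = np.einsum("be,bed->bd", g, expert_out)  # (batch, expert_dim)
        # Task-specific tower, here a single linear layer (assumption).
        outputs.append(mixed @ tower_w[k])              # (batch, 1)
        gates.append(g)
    return outputs, gates

rng = np.random.default_rng(0)
batch, input_dim, expert_dim, num_experts, num_tasks = 4, 8, 16, 3, 2
x = rng.normal(size=(batch, input_dim))
expert_w = rng.normal(size=(num_experts, input_dim, expert_dim))
gate_w = rng.normal(size=(num_tasks, input_dim, num_experts))
tower_w = rng.normal(size=(num_tasks, expert_dim, 1))

outputs, gates = mmoe_forward(x, expert_w, gate_w, tower_w)
```

Each entry of `gates` has one row per example and one column per expert, and every row sums to 1, which is what lets each task weight the shared experts differently.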

image

The evaluation on synthetic data is shown below. The higher the correlation between tasks ...

image

The evaluation on real data is shown below.

image image

One-line review : Hmm.. is there a way to give each classifier an initial value for the correlation?