
idea: Create a multi-gate mixture-of-experts (MMoE) that can model multiple tasks without having to explicitly specify how the tasks are related.

In typical multi-task learning, there is a shared network (shared bottom) with a task-specific FCN built on top of it for each task. This paper combines that setup with the MoE idea, using the set of experts as the shared bottom. The original MoE has a single gating network, but MMoE creates one gating network per task k.

Each gating network is a simple linear classifier whose input_dim is the feature dimension and whose output_dim is num_experts; a softmax over its logits gives each task its own mixture weights over the experts.
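The structure above (shared experts, one softmax gate per task, one tower per task) can be sketched as a forward pass in numpy. All dimensions and the single-layer experts/towers are illustrative choices, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration only.
input_dim, hidden_dim, num_experts, num_tasks = 8, 16, 4, 2

# Experts: the shared bottom, here single ReLU layers.
W_exp = 0.1 * rng.normal(size=(num_experts, input_dim, hidden_dim))
# One gating network per task: a linear map from features to expert logits.
W_gate = 0.1 * rng.normal(size=(num_tasks, input_dim, num_experts))
# Task-specific towers, here single linear output layers.
W_tower = 0.1 * rng.normal(size=(num_tasks, hidden_dim, 1))

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def mmoe_forward(x):
    """x: (batch, input_dim) -> list of (batch, 1) outputs, one per task."""
    # Every expert processes every input: (num_experts, batch, hidden_dim).
    expert_out = np.maximum(0.0, np.einsum("bi,eih->ebh", x, W_exp))
    outputs = []
    for k in range(num_tasks):
        # Task k's gate: softmax over experts, shape (batch, num_experts).
        gate = softmax(x @ W_gate[k])
        # Gate-weighted mixture of expert outputs: (batch, hidden_dim).
        mixed = np.einsum("be,ebh->bh", gate, expert_out)
        outputs.append(mixed @ W_tower[k])
    return outputs

x = rng.normal(size=(5, input_dim))
y = mmoe_forward(x)
```

Note that the experts are shared across tasks; only the gates and towers are task-specific, which is what lets the model learn how much each task shares without that relation being specified up front.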

The evaluation on synthetic data is as follows: the higher the correlation between the tasks, the better every model performs; conversely, MMoE's advantage over the shared-bottom and single-gate MoE baselines grows as the task correlation decreases.

The evaluation on real data is as follows:

One-line comment: Hmm… is there any way to give each gating network an initial value for the per-task correlation?