paper

TL;DR

  • task : image classification, object detection
  • problem : Too much architectural engineering is required to train a neural network well!
  • idea : Search for a good building block (cell) on a small dataset, then transfer it to a large dataset.
  • architecture : For each block, the RNN Controller selects the outputs of two previous layers as inputs, an operation to apply to each, and a method to combine the results. The earlier base NAS work trained the controller with reinforcement learning; this study notes the controller can even be replaced by random search, since the performance drop from randomization is small.
  • objective : image classification loss, object detection loss
  • baseline : hand-crafted SOTA models (DenseNet, Shake-Shake, MobileNet, ShuffleNet), NAS v3
  • data : CIFAR-10, ImageNet, COCO
  • result : SOTA on image classification and object detection at lower computational cost.
  • contribution : A more efficient NAS (random search; the cell is selected on CIFAR-10, then the architecture is trained on ImageNet), yet with better performance.
  • Limitations or things I don’t understand :

Details

NAS

Figure: the 5 predictions the Controller makes
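The five per-block predictions can be sketched as sampling. The operation names below follow the paper's search space (separable convolutions, pooling, identity), but the helpers `sample_block` and `sample_cell` are illustrative assumptions, a minimal random-search sketch rather than the paper's implementation.

```python
import random

# Candidate operations, loosely following the paper's search space;
# the string labels here are informal, not the paper's exact names.
OPS = ["identity", "sep_conv_3x3", "sep_conv_5x5", "sep_conv_7x7",
       "avg_pool_3x3", "max_pool_3x3", "conv_1x1", "conv_3x3"]
COMBINE = ["add", "concat"]

def sample_block(hidden_states):
    """The Controller's 5 predictions for one block: two input hidden
    states, an operation for each, and how to combine the two results."""
    return {
        "input_1": random.choice(hidden_states),
        "input_2": random.choice(hidden_states),
        "op_1": random.choice(OPS),
        "op_2": random.choice(OPS),
        "combine": random.choice(COMBINE),
    }

def sample_cell(num_blocks=5):
    # A cell starts from the outputs of the previous two layers; each
    # finished block becomes a new hidden state the Controller can
    # select as an input for later blocks.
    hidden_states = ["prev_layer", "prev_prev_layer"]
    cell = []
    for i in range(num_blocks):
        cell.append(sample_block(hidden_states))
        hidden_states.append(f"block_{i}")
    return cell

print(len(sample_cell()))  # 5 blocks, each defined by 5 predictions
```

With reinforcement learning, `random.choice` would be replaced by the RNN Controller's learned sampling distribution; the point above is that the search space itself is the same either way.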

Figure: layers the Controller can pick from

Figure: overall architecture
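How the searched cell scales from CIFAR-10 to ImageNet can be sketched as stacking: repeat the normal cell N times, insert a stride-2 reduction cell between groups, and scale up mainly by changing N and the filter counts. The helper `nasnet_stack` below is a hypothetical illustration of this layout, not the paper's code.

```python
def nasnet_stack(num_repeats):
    """Stack the searched cells into a full network: N normal cells
    (stride 1) per group, with a reduction cell (stride 2) between
    groups. The cell structure is fixed; only N and filters scale."""
    layers = []
    for group in range(3):          # three groups, as in the CIFAR-10 model
        layers += ["normal"] * num_repeats
        if group < 2:               # a reduction cell between groups
            layers.append("reduction")
    return layers

print(nasnet_stack(3))  # 9 normal cells with 2 reduction cells in between
```

This is why the search is cheap: only the two cell types are searched on CIFAR-10, and the transfer to ImageNet is just a deeper, wider stack of the same cells.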
