
paper, code

TL;DR

  • I read this because.. : meta-learning. A NAS that doesn't train?! Recommended by my advisor
  • task : Neural Architecture Search
  • problem : Designing deep learning models by hand takes too much labor. NAS automates this, but conventional NAS has to train candidate architectures to evaluate them, which makes the search very slow.
  • idea : Can we predict the final performance of an initialized model without training it? -> Pass a mini-batch of N samples through the untrained network, record which linear regions each sample activates as a binary code, and build an N x N matrix of Hamming distances between the samples' codes.
  • input/output : model -> score(or rank)
  • architecture : cells from the NAS-Bench-201 search space; these are CNN-based, after all.
  • baseline : cell-prediction-based NAS (REINFORCE, BOHB) and weight-sharing NAS (RSPS, …) that reduces search time
  • data : NAS-Bench-201, NDS-DARTS
  • evaluation : accuracy of the best found model on CIFAR-10, CIFAR-100, and ImageNet-16-120
  • result : performance is predictable without training. Searching the NAS-Bench-201 search space for CIFAR-10 for just 30 seconds, it found an architecture with 92.81% accuracy.
  • contribution : predicting performance without any training first (?). This is almost an art form.
  • etc. :

Details

  • NAS-BENCH-201 : https://arxiv.org/abs/2001.00326 It seems like a benchmark that fixes the search space and provides precomputed training results, so that methods are compared purely by the rank of the architectures they find.

  • binary activation codes identify the linear regions of a ReLU network
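A minimal sketch of the idea: each ReLU unit that fires contributes a 1 to a sample's binary code, so the code identifies which linear region the sample falls into. The `(N, U)` pre-activation array is a hypothetical stand-in for the concatenated activations of an untrained network.

```python
import numpy as np

def binary_codes(pre_activations):
    """Map each sample's ReLU pre-activations to a binary activation code.

    pre_activations: (N, U) array -- N mini-batch samples, U ReLU units.
    A unit that fires (> 0) contributes a 1; the resulting bit pattern
    identifies the linear region the sample lands in.
    """
    return (pre_activations > 0).astype(int)

codes = binary_codes(np.array([[0.5, -1.2, 3.0],
                               [-0.1, 2.2, 0.7]]))
# codes -> [[1, 0, 1], [0, 1, 1]]
```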

  • visualization of the activation codes (figure from the paper)

Assumption: the lower the correlation between the samples' codes, the better the final performance. In fact, the higher a network's CIFAR-10 accuracy, the "whiter" its distance matrix (larger Hamming distances between samples). The intuition is that a network which assigns similar binary codes to different samples will have a harder time telling them apart, since it treats them near-linearly; conversely, learning is easier if the inputs are already well separated at initialization!
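The "whiteness" intuition above can be sketched as the N x N Hamming-distance matrix between the mini-batch's codes (a hypothetical toy example; dissimilar codes give large entries, i.e. a whiter matrix in the paper's visualization):

```python
import numpy as np

def hamming_matrix(codes):
    """Pairwise Hamming distances between binary activation codes.

    codes: (N, U) binary array, one activation code per sample.
    Returns an (N, N) matrix; entry (i, j) counts the bits where the
    codes of samples i and j disagree.
    """
    return (codes[:, None, :] != codes[None, :, :]).sum(-1)

codes = np.array([[1, 0, 1],
                  [1, 0, 1],   # identical to sample 0 -> distance 0
                  [0, 1, 1]])
H = hamming_matrix(codes)
# H -> [[0, 0, 2],
#       [0, 0, 2],
#       [2, 2, 0]]
```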


The score can be written as $s = \log |K_H|$, where $(K_H)_{ij} = N_A - d_H(c_i, c_j)$, $N_A$ is the number of ReLU units, and $d_H$ is the Hamming distance between the codes $c_i$ and $c_j$.
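A sketch of this score, assuming it is the log-determinant of the kernel $K_H$ with entries $N_A - d_H(c_i, c_j)$: similar codes produce large off-diagonal entries, a near-singular kernel, and hence a low score.

```python
import numpy as np

def naswot_score(codes):
    """Log-determinant score from binary activation codes.

    codes: (N, U) binary matrix (N samples, U ReLU units).
    K[i, j] = U - Hamming(c_i, c_j); similar codes -> large off-diagonal
    entries -> small determinant -> low score.
    """
    n_units = codes.shape[1]
    hamming = (codes[:, None, :] != codes[None, :, :]).sum(-1)
    k = (n_units - hamming).astype(float)
    sign, logdet = np.linalg.slogdet(k)  # numerically stable log|K_H|
    return logdet

# Fully distinct one-hot codes: K = [[3,1,1],[1,3,1],[1,1,3]], det = 20
score = naswot_score(np.eye(3, dtype=int))
# score -> log(20) ~ 2.996
```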

Ablation

  • Positive correlation between the score and post-training accuracy

  • Comparison with other measures: high rank correlation coefficient

    Verify that the ordering is unchanged regardless of 1) the sampled mini-batch, 2) the initialization method, and 3) the batch size
  • Verify that the rank is maintained during training

  • With the above score, NAS reduces to sampling candidate architectures and keeping the one with the highest score
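A hedged sketch of that search loop: score randomly sampled candidates with a single untrained forward pass and keep the best. `sample_architecture` is a hypothetical hook; here it just returns the fake binary activation codes a sampled, untrained network would produce on a mini-batch.

```python
import numpy as np

def sample_architecture(rng):
    # Hypothetical: stands in for sampling a network from the search
    # space and collecting its activation codes on a mini-batch.
    return rng.integers(0, 2, size=(16, 64))  # (mini-batch, ReLU units)

def naswot_score(codes):
    hamming = (codes[:, None, :] != codes[None, :, :]).sum(-1)
    k = (codes.shape[1] - hamming).astype(float)
    return np.linalg.slogdet(k)[1]  # log|K_H|

def naswot_search(n_candidates=10, seed=0):
    rng = np.random.default_rng(seed)
    candidates = [sample_architecture(rng) for _ in range(n_candidates)]
    scores = [naswot_score(c) for c in candidates]
    return int(np.argmax(scores)), max(scores)

best_idx, best_score = naswot_search()
```

No training happens anywhere in the loop, which is why the whole search fits in seconds.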

  • Final performance: not SOTA, but the search time is very small!