[55] Position Prediction as an Effective Pretraining Strategy

August 26, 2022 · 1 min · long8v

[25] Intriguing Properties of Vision Transformers

April 29, 2022 · 3 min · long8v

[24] DINO: Emerging Properties in Self-Supervised Vision Transformers

April 26, 2022 · 5 min · long8v

[5] An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

January 13, 2022 · 1 min · long8v

[4] Conditional Positional Encodings for Vision Transformers

January 12, 2022 · 1 min · long8v

[3] Twins: Revisiting the Design of Spatial Attention in Vision Transformers

January 10, 2022 · 1 min · long8v

[2] ELSA: Enhanced Local Self-Attention for Vision Transformer

January 7, 2022 · 1 min · long8v

[1] Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

January 5, 2022 · 1 min · long8v