[55] Position Prediction as an Effective Pretraining Strategy

August 26, 2022 · 1 min · long8v

[25] Intriguing Properties of Vision Transformers

April 29, 2022 · 2 min · long8v

[24] DINO: Emerging Properties in Self-Supervised Vision Transformers

April 26, 2022 · 4 min · long8v

[5] An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

January 13, 2022 · 1 min · long8v

[4] Conditional Positional Encodings for Vision Transformers

January 12, 2022 · 1 min · long8v

[3] Twins: Revisiting the Design of Spatial Attention in Vision Transformers

January 10, 2022 · 1 min · long8v

[2] ELSA: Enhanced Local Self-Attention for Vision Transformer

January 7, 2022 · 1 min · long8v

[1] Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

January 5, 2022 · 1 min · long8v