[153] Contrastive Explanations for Model Interpretability

April 1, 2024 · 2 min · long8v

[147] Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers

February 7, 2024 · 3 min · long8v

[131] Re-labeling ImageNet: from Single to Multi-Labels, from Global to Localized Labels

September 13, 2023 · 2 min · long8v

[126] ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision

August 9, 2023 · 2 min · long8v

feat: add open-clip

June 21, 2023 · 1 min · long8v

[106] Prefix-Tuning: Optimizing Continuous Prompts for Generation

March 28, 2023 · 1 min · long8v

[5] An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

January 13, 2022 · 1 min · long8v

[4] Conditional Positional Encodings for Vision Transformers

January 12, 2022 · 1 min · long8v

[3] Twins: Revisiting the Design of Spatial Attention in Vision Transformers

January 10, 2022 · 1 min · long8v