feat: add text span

May 7, 2024 · 1 min · long8v

[156] Interpreting CLIP's Image Representation via Text-Based Decomposition

May 6, 2024 · 2 min · long8v

[127] Linearly Mapping from Image to Text Space

August 17, 2023 · 2 min · long8v

[111] Perceiver IO: A General Architecture for Structured Inputs & Outputs

April 24, 2023 · 2 min · long8v

[110] Understanding the Role of Self Attention for Efficient Speech Recognition

April 17, 2023 · 2 min · long8v

[83] Variance Networks: When Expectation Does Not Meet Your Expectations

November 25, 2022 · 2 min · long8v

[72] Sparse DETR: Efficient End-to-End Object Detection with Learnable Sparsity

October 20, 2022 · 1 min · long8v

[67] Deformable DETR: Deformable Transformers for End-to-End Object Detection

September 21, 2022 · 3 min · long8v

[27] Cross-Domain Few-Shot Classification via Learned Feature-Wise Transformation

May 23, 2022 · 2 min · long8v

[21] cosFormer: Rethinking Softmax in Attention

April 20, 2022 · 2 min · long8v

[20] Memorizing Transformer

April 7, 2022 · 3 min · long8v