[189] Training Verifiers to Solve Math Word Problems

December 9, 2024 · 1 min · long8v

[158] A Mathematical Framework for Transformer Circuits

May 9, 2024 · 3 min · long8v

[124] LiT: Zero-Shot Transfer with Locked-image text Tuning

July 6, 2023 · 3 min · long8v

[90] Neural Collaborative Graph Machines for Table Structure Recognition

December 22, 2022 · 1 min · long8v

[87] Bipartite Graph Network with Adaptive Message Passing for Unbiased Scene Graph Generation

December 8, 2022 · 2 min · long8v

[72] Sparse DETR: Efficient End-to-End Object Detection with Learnable Sparsity

October 20, 2022 · 1 min · long8v

[65] Margin Calibration for Long-Tailed Visual Recognition

September 19, 2022 · 1 min · long8v

[63] Masked Autoencoders Are Scalable Vision Learners

September 7, 2022 · 1 min · long8v

[58] MetaFormer Is Actually What You Need for Vision

August 31, 2022 · 1 min · long8v

[45] BGT-Net: Bidirectional GRU Transformer Network for Scene Graph Generation

August 3, 2022 · 1 min · long8v

[44] Context-Aware Scene Graph Generation With Seq2Seq Transformers

August 2, 2022 · 2 min · long8v

[16] Counterfactual Memorization in Neural Language Models

March 25, 2022 · 3 min · long8v

[7] SLIP: Self-supervision meets Language-Image Pre-training

January 20, 2022 · 1 min · long8v

[6] Crossing the Format Boundary of Text and Boxes: Towards Unified Vision-Language Modeling

January 18, 2022 · 1 min · long8v

[2] ELSA: Enhanced Local Self-Attention for Vision Transformer

January 7, 2022 · 1 min · long8v

[1] Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

January 5, 2022 · 1 min · long8v