2021Q4 | 🍎 Paper Today I Read 🦔

[189] Training Verifiers to Solve Math Word Problems

2021Q4 openAI 25min RL

[158] A Mathematical Framework for Transformer Circuits

2021Q4 XAI anthropic

[124] LiT: Zero-Shot Transfer with Locked-image text Tuning

2021Q4 google CLIP

[90] Neural Collaborative Graph Machines for Table Structure Recognition

2021Q4 CVPR graph document

[87] Bipartite Graph Network with Adaptive Message Passing for Unbiased Scene Graph Generation

2021Q4 CVPR SGG imbalance

[72] Sparse DETR: Efficient End-to-End Object Detection with Learnable Sparsity

2021Q4 ICLR object detection sparse kakao

[65] Margin Calibration for Long-Tailed Visual Recognition

2021Q4 25min imbalance ECCV

[63] Masked Autoencoders Are Scalable Vision Learners

2021Q4 SSL 25min

[58] MetaFormer Is Actually What You Need for Vision

2021Q4 backbone 25min

[45] BGT-Net: Bidirectional GRU Transformer Network for Scene Graph Generation

2021Q4 SGG graph

[44] Context-Aware Scene Graph Generation With Seq2Seq Transformers

ICCV 2021Q4 SGG graph

[33] Learning to Prompt for Continual Learning

2021Q4 google CVPR continual learning

[29] Grounded Language-Image Pre-training

multimodal 2021Q4 few-shot zero-shot microsoft object detection

[16] Counterfactual Memorization in Neural Language Models

NLP 2021Q4 privacy LM

[7] SLIP: Self-supervision meets Language-Image Pre-training

multimodal 2021Q4 few-shot SSL

[6] Crossing the Format Boundary of Text and Boxes: Towards Unified Vision-Language Modeling

multimodal 2021Q4 backbone multitask

[2] ELSA: Enhanced Local Self-Attention for Vision Transformer

2021Q4 ViT attention

[1] Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

2021Q4 ViT backbone