[164] TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering ICCV evaluation 2023Q3
[149] Noise-aware Learning from Web-crawled Image-Text Data for Image Captioning ICCV 25min 2022Q4 kakao
[148] I Can't Believe There's No Images! Learning Visual Tasks Using only Language Supervision ICCV 25min CLIP 2023Q3 AI2
[147] Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers ICCV 2021Q1 XAI
[38] Visual Relationship Detection Using Part-and-Sum Transformers with Composite Queries ICCV 2021Q2 SGG one-stage