[173] Detecting and Preventing Hallucinations in Large Vision Language Models AAAI RL 2023Q3 MLLM ScaleAI
[164] TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering ICCV evaluation 2023Q3
[148] I Can't Believe There's No Images! Learning Visual Tasks Using only Language Supervision ICCV 25min CLIP 2023Q3 AI2
[144] Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond multilingual alibaba 2023Q3 MLLM qwen