[178] RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness (RL, MLLM, 2024Q2)
[172] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback (CVPR, RL, MLLM, 2024Q2)
[170] Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback (RL, AI2, 2024Q2)
[157] LeGrad: An Explainability Method for Vision Transformers via Feature Formation Sensitivity (CLIP, XAI, 2024Q2)
[155] Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings (google, evaluation, generation, 2024Q2)
[154] Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment (google, XAI, evaluation, 2024Q2)