[171] CLIP-DPO: Vision-Language Models as a Source of Preference for Fixing Hallucinations in LVLMs ECCV RL MLLM 2024Q3
[74] “This is my unicorn, Fluffy”: Personalizing frozen vision-language representations dataset 2022Q3 25min ECCV nvidia CLIP
[73] Simple Open-Vocabulary Object Detection with Vision Transformers google object detection 2022Q2 25min ECCV OV
[37] Relationformer: A Unified Framework for Image-to-Graph Generation 2022Q1 SGG graph one-stage ECCV