[194] Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters DeepMind 2024Q3 reasoning
[183] MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models MLLM 2024Q3 STEM
[171] CLIP-DPO: Vision-Language Models as a Source of Preference for Fixing Hallucinations in LVLMs ECCV RL MLLM 2024Q3