[209] SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

May 21, 2025 · 2 min · long8v

[205] LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters!

February 28, 2025 · 2 min · long8v

[179] Aligning Large Multimodal Models with Factually Augmented RLHF

September 25, 2024 · 2 min · long8v