[194] Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

January 3, 2025 · 4 min · long8v

[190] Solving math word problems with process- and outcome-based feedback

December 16, 2024 · 4 min · long8v

[134] Asynchronous Methods for Deep Reinforcement Learning

October 18, 2023 · 4 min · long8v

[116] Data Distributional Properties Drive Emergent In-Context Learning in Transformers

May 22, 2023 · 3 min · long8v

[109] 🦩 Flamingo: a Visual Language Model for Few-Shot Learning

April 10, 2023 · 4 min · long8v

[40] Neural Discrete Representation Learning

July 30, 2022 · 1 min · long8v

[23] Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning

April 25, 2022 · 3 min · long8v