view article Article Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries +7 Mar 10 • 132
Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting Paper • 2601.02151 • Published Jan 5 • 113
Why Distillation can Outperform Zero-RL: The Role of Flexible Reasoning Paper • 2505.21067 • Published May 27, 2025 • 3