Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving Paper • 2512.10739 • Published Dec 11, 2025 • 47
Achieving Olympia-Level Geometry Large Language Model Agent via Complexity Boosting Reinforcement Learning Paper • 2512.10534 • Published Dec 11, 2025 • 32
OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification Paper • 2512.10756 • Published Dec 11, 2025 • 35
MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding Paper • 2406.14515 • Published Jun 20, 2024 • 33
Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs Paper • 2406.14544 • Published Jun 20, 2024 • 35