TimeChat-Captioner: Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions Paper • 2602.08711 • Published 4 days ago • 25
When to Memorize and When to Stop: Gated Recurrent Memory for Long-Context Reasoning Paper • 2602.10560 • Published 2 days ago • 25
F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare Paper • 2602.06717 • Published 7 days ago • 68
Video-As-Prompt: Unified Semantic Control for Video Generation Paper • 2510.20888 • Published Oct 23, 2025 • 50
No Global Plan in Chain-of-Thought: Uncover the Latent Planning Horizon of LLMs Paper • 2602.02103 • Published 11 days ago • 68
PaperBanana: Automating Academic Illustration for AI Scientists Paper • 2601.23265 • Published 14 days ago • 177
Towards Pixel-Level VLM Perception via Simple Points Prediction Paper • 2601.19228 • Published 17 days ago • 17
World Craft: Agentic Framework to Create Visualizable Worlds via Text Paper • 2601.09150 • Published about 1 month ago • 20