EarlyTom: Early Token Compression Completes Fast Video Understanding Paper • 2605.30010 • Published 4 days ago • 27
RankE: End-to-End Post-Training for Discrete Text-to-Image Generation with Decoder Co-Evolution Paper • 2605.21195 • Published 12 days ago • 18
PASA: A Principled Embedding-Space Watermarking Approach for LLM-Generated Text under Semantic-Invariant Attacks Paper • 2605.10977 • Published 23 days ago • 10
Seedance 2.0: Advancing Video Generation for World Complexity Paper • 2604.14148 • Published Apr 15 • 163
Maximal Brain Damage Without Data or Optimization: Disrupting Neural Networks via Sign-Bit Flips Paper • 2502.07408 • Published Apr 16 • 59
Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play Paper • 2509.25541 • Published Sep 29, 2025 • 142
Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory Paper • 2604.08995 • Published Apr 10 • 51
INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling Paper • 2604.07209 • Published Apr 8 • 38
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression Paper • 2604.04921 • Published Apr 6 • 114
HippoCamp: Benchmarking Contextual Agents on Personal Computers Paper • 2604.01221 • Published Apr 1 • 30
Article NEO-unify: Building Native Multimodal Unified Models End to End sensenova • Mar 5 • 163
LeWorldModel: Stable End-to-End Joint-Embedding Predictive Architecture from Pixels Paper • 2603.19312 • Published Mar 13 • 46
LVOmniBench: Pioneering Long Audio-Video Understanding Evaluation for Omnimodal LLMs Paper • 2603.19217 • Published Mar 19 • 28