LLaDA2.1: Speeding Up Text Diffusion via Token Editing Paper • 2602.08676 • Published 25 days ago • 68
pplx-embed Collection Diffusion-Pretrained Dense and Contextual Embeddings • 7 items • Updated 8 days ago • 85
Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular +4 Dec 18, 2025 • 121
Qwen-Image-Layered: Towards Inherent Editability via Layer Decomposition Paper • 2512.15603 • Published Dec 17, 2025 • 66
Article The Open Evaluation Standard: Benchmarking NVIDIA Nemotron 3 Nano with NeMo Evaluator Dec 17, 2025 • 47
Article Provence: efficient and robust context pruning for retrieval-augmented generation Jan 28, 2025 • 25
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model Paper • 2510.14528 • Published Oct 16, 2025 • 118
Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention Paper • 2510.04212 • Published Oct 5, 2025 • 26
Reactive Transformer (RxT) -- Stateful Real-Time Processing for Event-Driven Reactive Language Models Paper • 2510.03561 • Published Oct 3, 2025 • 25
SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLM Weights Paper • 2509.22944 • Published Sep 26, 2025 • 80
Article Introducing Pivotal Token Search (PTS): Targeting Critical Decision Points in LLM Training May 17, 2025 • 12
Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders Jul 9, 2025 • 785
Article Run the strongest open-source LLM model: Llama3 70B with just a single 4GB GPU! Apr 21, 2024 • 44