DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents Paper • 2605.04808 • Published 30 days ago • 20
Synthetic Sandbox for Training Machine Learning Engineering Agents Paper • 2604.04872 • Published Apr 6 • 14
Preference Optimization with Multi-Sample Comparisons Paper • 2410.12138 • Published Oct 16, 2024
Scaling Agent Learning via Experience Synthesis Paper • 2511.03773 • Published Nov 5, 2025 • 83
From Uncertainty to Trust: Enhancing Reliability in Vision-Language Models with Uncertainty-Guided Dropout Decoding Paper • 2412.06474 • Published Dec 9, 2024
Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment Paper • 2501.09620 • Published Jan 16, 2025
S'MoRE: Structural Mixture of Residual Experts for LLM Fine-tuning Paper • 2504.06426 • Published Apr 8, 2025 • 2
CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning Paper • 2503.19900 • Published Mar 25, 2025
Boosting LLM Reasoning via Spontaneous Self-Correction Paper • 2506.06923 • Published Jun 7, 2025
RecoWorld: Building Simulated Environments for Agentic Recommender Systems Paper • 2509.10397 • Published Sep 12, 2025 • 7
StreamMem: Query-Agnostic KV Cache Memory for Streaming Video Understanding Paper • 2508.15717 • Published Aug 21, 2025 • 1
Let it Calm: Exploratory Annealed Decoding for Verifiable Reinforcement Learning Paper • 2510.05251 • Published Oct 6, 2025 • 8
Thought Communication in Multiagent Collaboration Paper • 2510.20733 • Published Oct 23, 2025 • 15
Don't Overthink It: A Survey of Efficient R1-style Large Reasoning Models Paper • 2508.02120 • Published Aug 4, 2025 • 20
HumanMM: Global Human Motion Recovery from Multi-shot Videos Paper • 2503.07597 • Published Mar 10, 2025 • 2