A Tale of Tails: Model Collapse as a Change of Scaling Laws
Paper
• 2402.07043
• Published
• 15
What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT
Paper
• 2509.19284
• Published
• 23
OnePiece: Bringing Context Engineering and Reasoning to Industrial Cascade Ranking System
Paper
• 2509.18091
• Published
• 34
Strategic Dishonesty Can Undermine AI Safety Evaluations of Frontier LLMs
Paper
• 2509.18058
• Published
• 12
Igniting Creative Writing in Small Language Models: LLM-as-a-Judge versus Multi-Agent Refined Rewards
Paper
• 2508.21476
• Published
• 3
Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs
Paper
• 2404.14461
• Published
• 3
Universal Jailbreak Backdoors from Poisoned Human Feedback
Paper
• 2311.14455
• Published
• 3
StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets?
Paper
• 2510.02209
• Published
• 56
Tree-based Dialogue Reinforced Policy Optimization for Red-Teaming Attacks
Paper
• 2510.02286
• Published
• 29