The End of Manual Decoding: Towards Truly End-to-End Language Models Paper • 2510.26697 • Published Oct 30, 2025 • 120 • 5
DeepWideSearch: Benchmarking Depth and Width in Agentic Information Seeking Paper • 2510.20168 • Published Oct 23, 2025 • 28 • 2
HSCodeComp: A Realistic and Expert-level Benchmark for Deep Search Agents in Hierarchical Rule Application Paper • 2510.19631 • Published Oct 22, 2025 • 28 • 2