PerceptionRubrics: Calibrating Multimodal Evaluation to Human Perception Paper • 2606.28322 • Published 8 days ago • 36
v2.0 — GLM-5.1/DeepSeek (Distilled) Collection Collection of GLM-5.1 or DeepSeek reasoning-distilled, Qwen3.5 • 4 items • Updated 1 day ago • 1
Article ScarfBench: Benchmarking AI Agents for Enterprise Java Framework Migration ibm-research • 3 days ago • 24
Article Kog Laneformer 2B: The Latency-First Model Behind Kog Inference Engine kogai • 9 days ago • 32