When Search Agents Should Ask: DiscoBench for Clarification-Aware Deep Search • Paper • 2606.27669 • Published 9 days ago • 8
SkillCoach: Self-Evolving Rubrics for Evaluating and Enhancing Agentic Skill-Use • Paper • 2607.01874 • Published 3 days ago • 14
EvoPolicyGym: Evaluating Autonomous Policy Evolution in Interactive Environments • Paper • 2607.02440 • Published 3 days ago • 41
Prism Transformer: Progressive Head Schedules for Hierarchical Attention Processing • Paper • 2606.27449 • Published 10 days ago • 1
Pair-In, Pair-Out: Latent Multi-Token Prediction for Efficient LLMs • Paper • 2605.27255 • Published May 29 • 1
Depth-Attention: Cross-Layer Value Mixing for Language Models • Paper • 2606.05014 • Published Jun 3 • 1
Transformers with Selective Access to Early Representations • Paper • 2605.03953 • Published May 6 • 1
A Dual-Path Architecture for Scaling Compute and Capacity in LLMs • Paper • 2605.30202 • Published May 28 • 1
MOPD: Multi-Teacher On-Policy Distillation for Capability Integration in LLM Post-Training • Paper • 2606.30406 • Published 6 days ago • 12
AutoTrainess: Teaching Language Models to Improve Language Models Autonomously • Paper • 2606.31551 • Published 5 days ago • 14
CausalMix: Data Mixture as Causal Inference for Language Model Training • Paper • 2607.01104 • Published 4 days ago • 17
One-Step Gradient Delay is Not a Barrier for Large-Scale Asynchronous Pipeline Parallel LLM Pretraining • Paper • 2606.30634 • Published 6 days ago • 23