Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks Paper • 2510.08002 • Published Oct 9, 2025 • 24
The Denario project: Deep knowledge AI agents for scientific discovery Paper • 2510.26887 • Published Oct 30, 2025 • 8
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey Paper • 2509.02547 • Published Sep 2, 2025 • 238
WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research Paper • 2509.13312 • Published Sep 16, 2025 • 106
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning Paper • 2509.02479 • Published Sep 2, 2025 • 84
VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use Paper • 2509.01055 • Published Sep 1, 2025 • 81
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning Paper • 2511.16043 • Published Nov 20, 2025 • 110
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper • 2511.18538 • Published Nov 23, 2025 • 304
QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Mining Paper • 2602.07085 • Published Feb 6 • 190
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning Paper • 2604.02721 • Published Apr 3 • 630
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver Paper • 2604.08377 • Published Apr 9 • 291
Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents Paper • 2604.06132 • Published Apr 7 • 121
CORAL: Towards Autonomous Multi-Agent Evolution for Open-Ended Discovery Paper • 2604.01658 • Published Apr 2 • 55
GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents Paper • 2604.26752 • Published Apr 29 • 108
FAMA: Failure-Aware Meta-Agentic Framework for Open-Source LLMs in Interactive Tool Use Environments Paper • 2604.25135 • Published Apr 28 • 12
Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond Paper • 2604.22748 • Published Apr 24 • 227
From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company Paper • 2604.22446 • Published Apr 24 • 121
The Last Human-Written Paper: Agent-Native Research Artifacts Paper • 2604.24658 • Published Apr 29 • 21
SkillOS: Learning Skill Curation for Self-Evolving Agents Paper • 2605.06614 • Published 28 days ago • 46
Auto Research with Specialist Agents Develops Effective and Non-Trivial Training Recipes Paper • 2605.05724 • Published 28 days ago • 15
A²TGPO: Agentic Turn-Group Policy Optimization with Adaptive Turn-level Clipping Paper • 2605.06200 • Published 28 days ago • 14
STALE: Can LLM Agents Know When Their Memories Are No Longer Valid? Paper • 2605.06527 • Published 28 days ago • 44
ATLAS: Agentic or Latent Visual Reasoning? One Word is Enough for Both Paper • 2605.15198 • Published 21 days ago • 19
AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration Paper • 2605.20025 • Published 16 days ago • 185
OpenComputer: Verifiable Software Worlds for Computer-Use Agents Paper • 2605.19769 • Published 16 days ago • 81
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published 23 days ago • 195
On the limits and opportunities of AI reviewers: Reviewing the reviews of Nature-family papers with 45 expert scientists Paper • 2605.20668 • Published 15 days ago • 12
MOCHA: Multi-Objective Chebyshev Annealing for Agent Skill Optimization Paper • 2605.19330 • Published 16 days ago • 8
QUEST: Training Frontier Deep Research Agents with Fully Synthetic Tasks Paper • 2605.24218 • Published 13 days ago • 41
AutoResearch AI: Towards AI-Powered Research Automation for Scientific Discovery Paper • 2605.23204 • Published 13 days ago • 29
Claw-Anything: Benchmarking Always-On Personal Assistants with Broader Access to User's Digital World Paper • 2605.26086 • Published 10 days ago • 23
MemForest: An Efficient Agent Memory System with Hierarchical Temporal Indexing Paper • 2605.23986 • Published 19 days ago • 17
SEAL: Synergistic Co-Evolution of Agents and Learning Environments Paper • 2605.24426 • Published 12 days ago • 10
CoSPlay: Cooperative Self-Play at Test-Time with Self-Generated Code and Unit Test Paper • 2605.23491 • Published 13 days ago • 9
Agent Explorative Policy Optimization for Multimodal Agentic Reasoning Paper • 2605.28774 • Published 8 days ago • 86
ESC-Skills: Discovering and Self-Evolving Skills for Emotional Support Conversations Paper • 2605.27908 • Published 8 days ago • 6
AgensFlow: A Coordination-Policy Substrate for Multi-Agent Systems Paper • 2605.27466 • Published 9 days ago • 8
Verus-SpecGym: An Agentic Environment for Evaluating Specification Autoformalization Paper • 2605.26457 • Published 9 days ago • 6
Advancing Creative Physical Intelligence in Large Multimodal Models Paper • 2605.26396 • Published 10 days ago • 19
AutoScientists: Self-Organizing Agent Teams for Long-Running Scientific Experimentation Paper • 2605.28655 • Published 8 days ago • 11
Skill0.5: Joint Skill Internalization and Utilization for Out-of-Distribution Generalization in Agentic Reinforcement Learning Paper • 2605.28424 • Published 8 days ago • 30
When Cloud Agents Meet Device Agents: Lessons from Hybrid Multi-Agent Systems Paper • 2605.30102 • Published 7 days ago • 14
τ₀-WM: A Unified Video-Action World Model for Robotic Manipulation Paper • 2606.01027 • Published 4 days ago
OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents Paper • 2606.02031 • Published 3 days ago • 15
Joint Agent Memory and Exploration Learning via Novelty Signals Paper • 2606.01528 • Published 3 days ago • 12
Skill is Not One-Size-Fits-All: Model-Aware Skill Alignment for LLM Agents Paper • 2605.30723 • Published 6 days ago • 13
A Matter of TASTE: Improving Coverage and Difficulty of Agent Benchmarks Paper • 2605.28556 • Published 8 days ago • 60
Harness-1: Reinforcement Learning for Search Agents with State-Externalizing Harnesses Paper • 2606.02373 • Published 3 days ago • 36
SkillAdaptor: Self-Adapting Skills for LLM Agents from Trajectories Paper • 2606.01311 • Published 4 days ago • 27
When Does Multi-Agent RL Improve LLM Workflows? Workflow, Scale, and Policy-Sharing Tradeoffs Paper • 2605.24202 • Published 13 days ago • 14