Article Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective 13 days ago • 51
Enterprise Agents and Benchmarks Collection Enterprise agent ecosystem featuring AssetOpsBench (industrial), ITBench (SRE, FinOps, CISO), and CUGA to accelerate AI automation • 9 items • Updated 3 days ago • 9
Toward Efficient Agents: Memory, Tool learning, and Planning Paper • 2601.14192 • Published 19 days ago • 54
DrafterBench: Benchmarking Large Language Models for Tasks Automation in Civil Engineering Paper • 2507.11527 • Published Jul 15, 2025 • 35
Article The Agent Era Is Here: A Comprehensive Survey of Large Language Model Agents Apr 8, 2025 • 3
TRAJECT-Bench: A Trajectory-Aware Benchmark for Evaluating Agentic Tool Use Paper • 2510.04550 • Published Oct 6, 2025 • 1
ToolDial: Multi-turn Dialogue Generation Method for Tool-Augmented Language Models Paper • 2503.00564 • Published Mar 1, 2025 • 2
ToolRM Collection ToolRM: Towards Agentic Tool-Use Reward Modeling • 6 items • Updated 26 days ago • 4
One Model to Critique Them All: Rewarding Agentic Tool-Use via Efficient Reasoning Paper • 2510.26167 • Published Oct 30, 2025 • 2
OrchDAG: Complex Tool Orchestration in Multi-Turn Interactions with Plan DAGs Paper • 2510.24663 • Published Oct 28, 2025 • 1
The Landscape of Emerging AI Agent Architectures for Reasoning, Planning, and Tool Calling: A Survey Paper • 2404.11584 • Published Apr 17, 2024 • 1
LoopTool: Closing the Data-Training Loop for Robust LLM Tool Calls Paper • 2511.09148 • Published Nov 12, 2025 • 18
ToolRM: Outcome Reward Models for Tool-Calling Large Language Models Paper • 2509.11963 • Published Sep 15, 2025 • 4