AIDB Daily Papers

SkillCraft: Can LLM Agents Learn to Use Tools Skillfully?

Original title: SkillCraft: Can LLM Agents Learn to Use Tools Skillfully?

Authors: Shiqi Chen, Jingze Gai, Ruochen Zhou, Jinghan Zhang, Tongyao Zhu, Junlong Li, Kangrui Wang, Zihan Wang, Zhengyu Chen, Klara Kaleb, Ning Miao, Siyang Gao, Cong Lu, Manling Li, Junxian He, Yee Whye Teh

Published: 2026-02-28 | Fields: LLM, Efficiency, Benchmark, Reasoning, Software

Note: The Japanese title and key points were generated automatically by AI. Please check the original paper for accurate content.

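Key points

Proposes a benchmark that mimics real-world tool use and tests whether LLM agents can acquire and reuse higher-level tool-composition skills.
Whereas existing benchmarks focus on success with static tool sets, this work is novel in emphasizing evaluation of the ability to acquire reusable skills.
Saving and reusing skills reduced token usage by up to 80%, and success rate was found to correlate strongly with tool-composition ability.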

Abstract

Real-world tool-using agents operate over long-horizon workflows with recurring structure and diverse demands, where effective behavior requires not only invoking atomic tools but also abstracting and reusing higher-level tool compositions. However, existing benchmarks mainly measure instance-level success under static tool sets, offering limited insight into agents' ability to acquire such reusable skills. We address this gap by introducing SkillCraft, a benchmark that explicitly stress-tests agents' ability to form and reuse higher-level tool compositions, which we call Skills. SkillCraft features realistic, highly compositional tool-use scenarios with difficulty scaled along both quantitative and structural dimensions, designed to elicit skill abstraction and cross-task reuse. We further propose a lightweight evaluation protocol that enables agents to auto-compose atomic tools into executable Skills and to cache and reuse them within and across tasks, thereby improving efficiency while accumulating a persistent library of reusable skills. Evaluating state-of-the-art agents on SkillCraft, we observe substantial efficiency gains, with token usage reduced by up to 80% through skill saving and reuse. Moreover, success rate strongly correlates with tool composition ability at test time, underscoring compositional skill acquisition as a core capability.
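
To make the idea of composing atomic tools into a cached, reusable Skill more concrete, here is a minimal Python sketch. It is not the SkillCraft protocol or the authors' code: the `SkillLibrary` class, the `save_skill` and `run_skill` methods, and the toy `fetch` and `summarize` tools are all hypothetical names invented for illustration. The sketch also assumes each atomic tool is a plain function from a state dict to an updated state dict.

```python
from typing import Callable, Dict, List

# Assumption: an atomic tool maps a state dict to an updated state dict.
Tool = Callable[[dict], dict]


class SkillLibrary:
    """Hypothetical store of named tool compositions ("Skills")."""

    def __init__(self, tools: Dict[str, Tool]) -> None:
        self.tools = tools
        self.skills: Dict[str, List[str]] = {}

    def save_skill(self, name: str, steps: List[str]) -> None:
        # Reject compositions that refer to tools the library does not know.
        unknown = [step for step in steps if step not in self.tools]
        if unknown:
            raise KeyError(f"unknown tools: {unknown}")
        self.skills[name] = list(steps)

    def run_skill(self, name: str, state: dict) -> dict:
        # Replay the saved composition, threading the state through each tool.
        for step in self.skills[name]:
            state = self.tools[step](state)
        return state


# Toy atomic tools (hypothetical, for illustration only).
def fetch(state: dict) -> dict:
    return {**state, "raw": f"data for {state['query']}"}


def summarize(state: dict) -> dict:
    return {**state, "summary": state["raw"].upper()}


library = SkillLibrary({"fetch": fetch, "summarize": summarize})
library.save_skill("fetch_and_summarize", ["fetch", "summarize"])

# The same saved skill is reused across two different tasks.
first = library.run_skill("fetch_and_summarize", {"query": "task A"})
second = library.run_skill("fetch_and_summarize", {"query": "task B"})
```

In a setup like this, an agent would issue one skill call instead of re-deriving each tool call for every task. That is one plausible reading of where the reported efficiency gains come from, though the abstract does not describe the mechanism at this level of detail.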

Metadata

arXiv ID: 2603.00718
Categories: cs.CL, cs.SE
