AIDB Daily Papers
What Is an AI Agent "Harness"? A Study That Clarifies Its Definition and Boundaries
Note: The Japanese-language title and key points were generated automatically by AI. Consult the original paper for the exact content.
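Key points
- The study reconstructs the definition of the "agent harness", the layer that wraps a language model and lets it act on a repository, through a conceptual analysis that traces the term back to its origins.
- Its main contribution is an operational definition, stated as necessary and sufficient conditions, that resolves the ambiguity of current usage and draws the boundary against neighboring concepts such as agent frameworks.
- Using the proposed definition, it classifies and checks six real agent harnesses and offers a shared vocabulary to guide engineering practice and scientific comparison.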
Abstract
The term agent harness now circulates widely in software engineering with generative artificial intelligence. It names the layer that wraps a language model and turns it into a coding agent able to act on a repository. The usage is loose and polysemous. Sometimes the term denotes the whole product (Claude Code, Codex CLI); sometimes it denotes the evaluation scaffold that runs an agent against tasks (the SWE-bench harness); sometimes it gets conflated with an agent framework, an SDK, an IDE plugin, or an orchestrator. What is missing is a reference definition that works as an instrument, one that includes and excludes cases consistently. We build that definition through a conceptual analysis that combines works with persistent identifiers and primary grey-literature sources, such as official documentation, glossaries, and engineering reports. We reconstruct the genealogy of the term, from the horse's tack to the classic test harness, to the machine-learning evaluation harness, and finally to the agent harness. We then propose a constitutive definition that states the necessary and sufficient conditions for a system to be an agent harness, we operationalize it as an inclusion and exclusion test, and we draw the boundary of the concept against an agent framework, an agent SDK, an IDE plugin, an eval harness, and an orchestrator. We apply the definition to six real harnesses (Claude Code, Codex CLI, Aider, Cline, OpenHands, and SWE-agent) and to deliberate edge cases; the test includes and excludes consistently. We close with a research agenda organized by design tension axes. The contribution is an operational definition of agent harness, with a shared vocabulary, able to guide engineering practice and the scientific comparison of agentic systems.