AIDB Daily Papers

Ask or Guess? Uncertainty-Aware Clarification Requests in Coding Agents

Original title: Ask or Assume? Uncertainty-Aware Clarification-Seeking in Coding Agents

Authors: Nicholas Edwards, Sebastian Schuster

Published: 2026-03-27 | Topics: LLM, machine learning, software, agents, evaluation, question answering, tasks, coding, natural language processing, performance

Note: The Japanese title and key points on this page were generated automatically by AI. Refer to the original paper for the exact content.

Key points

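The study systematically evaluates how well LLM agents ask for clarification when given underspecified instructions.
It proposes an uncertainty-aware multi-agent system that separates underspecification detection from code execution, with the aim of improving performance.
The system built on OpenHands + Claude Sonnet 4.5 achieves a 69.40% task resolve rate, significantly outperforming a standard single-agent setup (61.20%).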

Abstract

As Large Language Model (LLM) agents are increasingly deployed in open-ended domains like software engineering, they frequently encounter underspecified instructions that lack crucial context. While human developers naturally resolve underspecification by asking clarifying questions, current agents are largely optimized for autonomous execution. In this work, we systematically evaluate the clarification-seeking abilities of LLM agents on an underspecified variant of SWE-bench Verified. We propose an uncertainty-aware multi-agent scaffold that explicitly decouples underspecification detection from code execution. Our results demonstrate that this multi-agent system using OpenHands + Claude Sonnet 4.5 achieves a 69.40% task resolve rate, significantly outperforming a standard single-agent setup (61.20%) and closing the performance gap with agents operating on fully specified instructions. Furthermore, we find that the multi-agent system exhibits well-calibrated uncertainty, conserving queries on simple tasks while proactively seeking information on more complex issues. These findings indicate that current models can be turned into proactive collaborators, where agents independently recognize when to ask questions to elicit missing information in real-world, underspecified tasks.
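The abstract describes a scaffold that decouples underspecification detection from code execution. The control flow of such a split might look roughly like the sketch below. This is only an illustration under stated assumptions: a keyword heuristic stands in for the LLM-based detector, and the names `Detection`, `detect_underspecification`, `run_task`, `ask_user`, and `execute` are hypothetical. None of them come from the paper or from OpenHands.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Detection:
    """Result returned by the (hypothetical) underspecification detector."""
    needs_clarification: bool
    questions: List[str]


def detect_underspecification(issue: str) -> Detection:
    """Toy stand-in for an LLM-based detector (an assumption, not the paper's method).

    Flags the issue as underspecified when it lacks a couple of
    illustrative details, and proposes one question per missing detail.
    """
    text = issue.lower()
    questions: List[str] = []
    if "expected" not in text:
        questions.append("What is the expected behavior?")
    if "file" not in text and "function" not in text:
        questions.append("Which file or function is affected?")
    return Detection(bool(questions), questions)


def run_task(
    issue: str,
    ask_user: Callable[[str], str],
    execute: Callable[[str], str],
) -> Tuple[str, List[str]]:
    """Run detection first, then hand the (possibly enriched) task to the executor.

    The executor never decides whether to ask; it only receives the issue
    text plus any answers the detector's questions elicited.
    """
    detection = detect_underspecification(issue)
    context = issue
    asked: List[str] = []
    if detection.needs_clarification:
        for question in detection.questions:
            answer = ask_user(question)
            asked.append(question)
            context += f"\nQ: {question}\nA: {answer}"
    return execute(context), asked
```

In this sketch a well-specified issue reaches `execute` unchanged and no questions are asked, while a vague one triggers questions before any code is touched. That is the query-conserving behavior the abstract attributes to the multi-agent system, here reproduced only by a crude heuristic.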

Metadata

arXiv ID: 2603.26233
Category: cs.CL
