AIDB Daily Papers

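Coding AI Doesn't Know When Not to Act

Original title: Coding Agents Don't Know When to Act

Authors: Thibaud Gloaguen, Niels Mündler, Mark Müller, Veselin Raychev, Martin Vechev

Published: 2026-05-08 | Fields: LLM, AI, software engineering, cs.SE, AI agents, software design

Note: The Japanese title and key points were generated automatically by AI. Please check the original paper for the accurate content.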

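Key Points

This study examines the problem of coding AI proposing inappropriate code changes in response to bug reports that require no fix.
Existing AI models tend to propose code changes for bugs that have already been fixed, which can accumulate technical debt.
The results suggest that overcoming the AI's "action bias" requires training that explicitly presents inaction as a path to success.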

Abstract

Coding agents are increasingly deployed to autonomously maintain software, including to resolve user-reported issues: a bug report comes in and the agent creates a patch to address it. However, in any real-world deployment, they will encounter stale bug reports about issues that have already been resolved. Agents should recognize this and abstain from modifying the code to avoid accumulating technical debt. To systematically evaluate whether current agents do so, we introduce FixedBench, a code benchmark with 200 human-verified coding tasks in which no code changes are required, testing five recent models across four agent harnesses. We find that even state-of-the-art models fail, proposing undesirable changes (excluding tests and documentation) in 35% to 65% of cases. Explicit instructions to reproduce the issue before patching partially address this issue but introduce a new failure mode: when an issue is partially fixed, they abstain even though a patch would still be needed. More broadly, our results indicate that LLMs fall prey to an action bias: they choose to act even if inaction would be appropriate. To break this pattern, inaction needs to be explicitly framed as a path to success, which highlights an overreliance on human guidance implicit in current training objectives.
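The abstract counts a run as a failure when the agent proposes changes other than tests and documentation on a task that needs no fix. A minimal sketch of how such a rate could be computed from the file paths touched by each patch is shown below. The path heuristics (`DOC_SUFFIXES`, `DOC_DIRS`, `TEST_DIRS`) and the function names are hypothetical and chosen for illustration only; the abstract does not state how FixedBench actually classifies files.

```python
from pathlib import PurePosixPath

# Hypothetical heuristics: the abstract does not specify how tests and
# documentation are identified, so these sets are assumptions.
DOC_SUFFIXES = {".md", ".rst", ".txt"}
DOC_DIRS = {"docs", "doc"}
TEST_DIRS = {"tests", "test"}


def is_test_or_doc(path: str) -> bool:
    """Return True if the path looks like a test or documentation file."""
    p = PurePosixPath(path)
    parent_dirs = set(p.parts[:-1])
    if parent_dirs & (DOC_DIRS | TEST_DIRS):
        return True
    if p.suffix.lower() in DOC_SUFFIXES:
        return True
    return p.name.startswith("test_") or p.name.endswith("_test.py")


def is_undesirable(changed_files: list[str]) -> bool:
    """A patch is undesirable if it touches any non-test, non-doc file.

    An empty patch (the agent abstained) is never undesirable.
    """
    return any(not is_test_or_doc(f) for f in changed_files)


def undesirable_rate(patches: list[list[str]]) -> float:
    """Fraction of no-change-needed tasks where the patch is undesirable."""
    if not patches:
        return 0.0
    return sum(is_undesirable(p) for p in patches) / len(patches)
```

Under these assumptions, an agent that abstains or only adds a regression test is not penalized, while any edit to source files on an already-fixed issue counts toward the failure rate.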

Metadata

arXiv ID: 2605.07769
Category: cs.SE