AIDB Daily Papers
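Accident-Free Agent-First Code: Reducing Human Code Entropy and Lowering Frontier Model Requirements by 30-500x
Note: The title and key points were generated automatically by AI (originally in Japanese). Refer to the original paper for the exact content.
Key Points
- This study proposes a method for rewriting routine product software into "agent-first canonical code" on a proof-carrying substrate.
- By eliminating redundant encodings and consolidating code into governed representations with explicit evidence and proof obligations, the approach aims to substantially reduce the cost and complexity of software development.
- The proposal suggests that development cost could fall by roughly 100-fold, but the effect depends on the type of software and still requires further validation.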
Abstract
Frontier coding models may spend substantial capacity learning not only program behavior, but also accidental entropy in human repositories. Such repositories contain valuable signals: tests, incidents, migrations, edge cases, product judgment, and operational history. These signals are entangled with framework churn, naming drift, generated-source ambiguity, dependency rituals, CI dialects, weak proof routes, and human-oriented review customs. We propose agent-first canonical code, a proof-carrying substrate that rewrites routine product software into canonical behavior profiles, typed change algebra, proof lanes, constrained edit grammars, semantic patch cells, runtime negative memory, and proof-carrying change objects. The core hypothesis is that quotienting software by behavior equivalence under a declared oracle can collapse equivalent encodings into governed representatives with explicit evidence and proof obligations. The endpoint is amortized cost per verified correct change, including source, context, reasoning, tools, verification, security, provenance, review, failed loops, defects, and foundry cost under a common oracle. Reported reduction bands are hypotheses, not measured frontier results. The proposed limit is a No-Accident Horizon: removable accident decreases until residual novelty, evidence, governance, risk, and future optionality dominate. For supported routine-product distributions, this gives a defensible planning target near 100-fold all-in cost reduction, not a guarantee for all software. Preliminary QLoRA experiments on Qwen2.5-Coder-14B show that 64,088 canonical trajectories are learnable and suppress tested forbidden-language markers, but do not establish behavior preservation, scaling economics, or verified-change cost. The contribution is a falsifiable program centered on minimum functional description length and verified-change cost.
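The core hypothesis, quotienting code by behavior equivalence under a declared oracle, can be illustrated with a toy sketch. This is not the paper's implementation: the oracle here is assumed to be a fixed list of integer inputs, and the names `ORACLE_INPUTS`, `behavior_signature`, `quotient_by_behavior`, `canonical_representatives`, and `CANDIDATES` are hypothetical. The rule for choosing a representative (smallest name) is arbitrary and only stands in for a governed selection.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence, Tuple

# Assumption: the "declared oracle" is a fixed set of inputs on which
# behavior is observed. A real oracle would be far richer (tests, traces, proofs).
ORACLE_INPUTS: Sequence[int] = (-2, -1, 0, 1, 2, 10)


def behavior_signature(fn: Callable[[int], int],
                       inputs: Sequence[int] = ORACLE_INPUTS) -> Tuple[int, ...]:
    """Observe fn on every oracle input; the outputs identify its equivalence class."""
    return tuple(fn(x) for x in inputs)


def quotient_by_behavior(
    candidates: Dict[str, Callable[[int], int]]
) -> Dict[Tuple[int, ...], List[str]]:
    """Group candidate encodings whose observed behavior is identical."""
    classes: Dict[Tuple[int, ...], List[str]] = defaultdict(list)
    for name, fn in candidates.items():
        classes[behavior_signature(fn)].append(name)
    return dict(classes)


def canonical_representatives(
    classes: Dict[Tuple[int, ...], List[str]]
) -> List[str]:
    """Pick one representative per class (illustrative rule: smallest name)."""
    return sorted(min(names) for names in classes.values())


# Three encodings of "double" and one of "square" (hypothetical examples).
CANDIDATES: Dict[str, Callable[[int], int]] = {
    "double_add": lambda x: x + x,
    "double_mul": lambda x: 2 * x,
    "double_shift": lambda x: x << 1,
    "square": lambda x: x * x,
}

if __name__ == "__main__":
    classes = quotient_by_behavior(CANDIDATES)
    print(len(classes))                        # number of behavior classes
    print(canonical_representatives(classes))  # one name per class
```

In this sketch the three doubling variants collapse into one class while `square` stays separate. The equivalence holds only relative to the oracle: two functions that agree on every oracle input but differ elsewhere would also be merged, so the choice of oracle determines what a representative can be trusted to preserve.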