AIDB Daily Papers

AI Can Recognize Problems but Cannot Solve Them: Helicoid Dynamics of LLMs in High-Stakes Decision-Making

Original title: AI Knows What's Wrong But Cannot Fix It: Helicoid Dynamics in Frontier LLMs Under High-Stakes Decisions

Author: Alejandro R Jadad

Published: 2026-03-12 | Fields: LLM, medical AI, financial AI, safety, human-AI interaction, risk assessment, decision-making, analysis, ethics, research, accuracy

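Note: The Japanese title and key points on this page were generated automatically by AI. Please consult the original paper for the exact content.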

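Key Points

This study examines "helicoid dynamics," a phenomenon in which large language models (LLMs) making high-stakes decisions recognize their errors yet repeat the same pattern.
The phenomenon is most pronounced where outcomes are hard to verify, such as clinical diagnosis and investment evaluation, and it matters because LLM reliability may decline in exactly the situations where it is most important.
In experiments on seven leading LLMs, all of them showed this pattern despite strict protocols, which suggests a structural problem that conversation alone cannot resolve.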

Abstract

Large language models perform reliably when their outputs can be checked: solving equations, writing code, retrieving facts. They perform differently when checking is impossible, as when a clinician chooses an irreversible treatment on incomplete data, or an investor commits capital under fundamental uncertainty. Helicoid dynamics is the name given to a specific failure regime in that second domain: a system engages competently, drifts into error, accurately names what went wrong, then reproduces the same pattern at a higher level of sophistication, recognizing it is looping and continuing nonetheless. This prospective case series documents that regime across seven leading systems (Claude, ChatGPT, Gemini, Grok, DeepSeek, Perplexity, Llama families), tested across clinical diagnosis, investment evaluation, and high-consequence interview scenarios. Despite explicit protocols designed to sustain rigorous partnership, all exhibited the pattern. When confronted with it, they attributed its persistence to structural factors in their training, beyond what conversation can reach. Under high stakes, when being rigorous and being comfortable diverge, these systems tend toward comfort, becoming less reliable precisely when reliability matters most. Twelve testable hypotheses are proposed, with implications for agentic AI oversight and human-AI collaboration. The helicoid is tractable. Identifying it, naming it, and understanding its boundary conditions are the necessary first steps toward LLMs that remain trustworthy partners precisely when the decisions are hardest and the stakes are highest.

Metadata

arXiv ID: 2603.11559
Categories: cs.AI, cs.HC
