AIDB Daily Papers

AI spends longer on problems it gets wrong, while humans do the opposite: separating difficulty registration from deliberation allocation

Original title: Humans Disengage, Reasoning Models Persist: Separating Difficulty Registration from Deliberation Allocation

Author: Han-yu Wang

Published: 2026-06-25 | Fields: LLM, AI, cs.CL, cs.AI, AI agents, AI evaluation

Note: The Japanese title and key points on this page were generated automatically by AI. Refer to the original paper for the exact content.

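Key points

This study analyzes how deliberation time relates to correctness when AI systems (large reasoning models, LRMs) and humans solve problems.
LRMs spend more tokens on problems they get wrong, whereas humans spend less time on the trials they get wrong, which is the opposite pattern.
The paper attributes the difference to two policies: LRM reasoning chains lengthen under uncertainty, while humans follow an "engagement versus abandonment" strategy, staying on items they expect to solve and giving up on the rest.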

Abstract

Large reasoning models (LRMs) take longer on harder problems, just as humans do. This surface similarity hides an opposite pattern within items. When an LRM gets a problem wrong, it spends more tokens than when it gets the same problem right; humans do the reverse, spending less time on the trials they get wrong. We separate two levels of deliberation: how response time tracks difficulty across items (registration), and, with item identity held fixed, whether an agent spends more on its own failures or successes (allocation). On a public matched human-LRM corpus, humans and all five thinking LRMs reproduce the known cross-item alignment (registration) but diverge within items (allocation): every LRM shows a large wrong-vs-right effect (Cohen's d = 1.47-3.13 on H-ARC) while humans show the opposite sign. The comparison stays inside each agent's own scale; we never put seconds and tokens on one axis. The dissociation holds under item fixed effects, replicates across datasets, and is absent in a non-thinking baseline. We read the human pattern as engagement versus abandonment: people stay on items they expect to solve and give up on the rest. We read the LRM pattern as length driven by uncertainty: chains grow when the model is unsure, which is exactly when it tends to fail. Both policies produce the same cross-item correlation with difficulty, so they look aligned on the measure prior work has used; the divergence shows up only once item identity is fixed. Under resource-rational metareasoning, the split is between two stopping policies that share a difficulty signal but implement opposite control; trace length captures the signal and misses the control.
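
The within-item (allocation) comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's code: the function names, the toy numbers, and the choice to compute Cohen's d on item-centered costs (a simple stand-in for item fixed effects) are all assumptions made for illustration. Cost is tokens for an LRM and seconds for a human, and each agent is analyzed on its own scale only.

```python
from collections import defaultdict
from statistics import mean, stdev


def item_demean(trials):
    """Subtract each item's mean cost so that item difficulty is held fixed.

    trials: list of (item_id, cost, correct) tuples (hypothetical layout).
    Returns a list of (centered_cost, correct) pairs.
    """
    by_item = defaultdict(list)
    for item_id, cost, _ in trials:
        by_item[item_id].append(cost)
    item_mean = {item_id: mean(costs) for item_id, costs in by_item.items()}
    return [(cost - item_mean[item_id], correct) for item_id, cost, correct in trials]


def cohens_d(a, b):
    """Cohen's d with a pooled standard deviation: (mean(a) - mean(b)) / s_pooled."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / pooled_var ** 0.5


def allocation_effect(trials):
    """Wrong-minus-right effect size on item-centered cost.

    Positive: the agent spends more on its failures.
    Negative: the agent spends more on its successes.
    """
    centered = item_demean(trials)
    wrong = [c for c, ok in centered if not ok]
    right = [c for c, ok in centered if ok]
    return cohens_d(wrong, right)


# Made-up toy data, NOT results from the paper.
# LRM cost in tokens: longer traces on wrong attempts of the same item.
lrm_trials = [
    ("a", 100, True), ("a", 120, True), ("a", 180, False), ("a", 200, False),
    ("b", 300, True), ("b", 320, True), ("b", 380, False), ("b", 400, False),
]
# Human cost in seconds: shorter times on wrong attempts of the same item.
human_trials = [
    ("a", 30, True), ("a", 34, True), ("a", 10, False), ("a", 14, False),
    ("b", 60, True), ("b", 64, True), ("b", 40, False), ("b", 44, False),
]

print("LRM allocation effect:", allocation_effect(lrm_trials))
print("Human allocation effect:", allocation_effect(human_trials))
```

In both toy sets item "b" costs more than item "a", so a cross-item measure would treat the two agents as aligned; the opposite signs appear only after centering within items.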

Metadata

arXiv ID: 2606.26502
Categories: cs.AI, cs.CL
