AIDB Daily Papers
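AI spends longer on difficult problems, while humans do the opposite: separating difficulty registration from deliberation allocation
Note: The Japanese title and key points were generated automatically by AI. Please refer to the original paper for the exact content.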
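Key points
- This study analyzes the relationship between thinking time and correctness in problem solving by AI (LRMs) and by humans.
- It finds opposite patterns: AI spends more time on problems it gets wrong, whereas humans spend less time on problems they get wrong.
- The authors attribute the difference to AI lengthening its reasoning when it is uncertain, whereas humans follow an "engagement versus abandonment" strategy, concentrating on problems they expect to solve and disengaging from the rest.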
Abstract
Large reasoning models (LRMs) take longer on harder problems, just as humans do. This surface similarity hides an opposite pattern within items. When an LRM gets a problem wrong, it spends more tokens than when it gets the same problem right; humans do the reverse, spending less time on the trials they get wrong. We separate two levels of deliberation: how response time tracks difficulty across items (registration), and, with item identity held fixed, whether an agent spends more on its own failures or successes (allocation). On a public matched human-LRM corpus, humans and all five thinking LRMs reproduce the known cross-item alignment (registration) but diverge within items (allocation): every LRM shows a large wrong-vs-right effect (Cohen's d = 1.47-3.13 on H-ARC) while humans show the opposite sign. The comparison stays inside each agent's own scale; we never put seconds and tokens on one axis. The dissociation holds under item fixed effects, replicates across datasets, and is absent in a non-thinking baseline. We read the human pattern as engagement versus abandonment: people stay on items they expect to solve and give up on the rest. We read the LRM pattern as length driven by uncertainty: chains grow when the model is unsure, which is exactly when it tends to fail. Both policies produce the same cross-item correlation with difficulty, so they look aligned on the measure prior work has used; the divergence shows up only once item identity is fixed. Under resource-rational metareasoning, the split is between two stopping policies that share a difficulty signal but implement opposite control; trace length captures the signal and misses the control.
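The two levels described in the abstract can be illustrated with a minimal sketch. It assumes each trial is an `(item, correct, cost)` tuple, where `cost` is tokens for an LRM or seconds for a human, and each agent is analyzed only on its own scale. The function names, the toy numbers, and the specific estimator (a pooled-SD Cohen's d on item-demeaned cost, used here as a simple stand-in for item fixed effects) are assumptions for illustration, not the paper's actual procedure or data.

```python
from collections import defaultdict
from statistics import mean, stdev


def _group(trials):
    """Group (item, correct, cost) trials by item id."""
    by_item = defaultdict(list)
    for item, correct, cost in trials:
        by_item[item].append((int(correct), float(cost)))
    return by_item


def _pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def registration(trials):
    """Cross-item correlation between item error rate and mean cost."""
    by_item = _group(trials)
    error_rate = [1 - mean(c for c, _ in rows) for rows in by_item.values()]
    mean_cost = [mean(x for _, x in rows) for rows in by_item.values()]
    return _pearson(error_rate, mean_cost)


def allocation(trials):
    """Cohen's d (wrong minus right) on cost demeaned within each item."""
    wrong, right = [], []
    for rows in _group(trials).values():
        if len({c for c, _ in rows}) < 2:
            continue  # an item needs both outcomes to contribute
        item_mean = mean(x for _, x in rows)
        for c, x in rows:
            (right if c else wrong).append(x - item_mean)
    n_w, n_r = len(wrong), len(right)
    pooled_var = ((n_w - 1) * stdev(wrong) ** 2
                  + (n_r - 1) * stdev(right) ** 2) / (n_w + n_r - 2)
    return (mean(wrong) - mean(right)) / pooled_var ** 0.5


# Synthetic toy data (hypothetical): cost is tokens for the LRM-like agent.
lrm_like = [
    ("a", 1, 100), ("a", 1, 110), ("a", 1, 90), ("a", 0, 160),
    ("b", 1, 200), ("b", 1, 220), ("b", 0, 280), ("b", 0, 300),
    ("c", 1, 300), ("c", 0, 380), ("c", 0, 400), ("c", 0, 360),
]
# Synthetic toy data (hypothetical): cost is seconds for the human-like agent.
human_like = [
    ("a", 1, 30), ("a", 1, 34), ("a", 1, 32), ("a", 0, 12),
    ("b", 1, 60), ("b", 1, 64), ("b", 0, 40), ("b", 0, 36),
    ("c", 1, 95), ("c", 0, 70), ("c", 0, 60), ("c", 0, 75),
]

for name, trials in [("LRM-like", lrm_like), ("human-like", human_like)]:
    print(name, round(registration(trials), 2), round(allocation(trials), 2))
```

In this toy setup both agents show a positive registration correlation, since harder items cost more on average, while the sign of the allocation effect differs between them. Because each statistic is computed inside one agent's own data, seconds and tokens are never placed on a common axis.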