AIDB Daily Papers

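Don't Overthink It: The Harms of Excessive Test-Time Compute Scaling in LLMs

Original title: When More Thinking Hurts: Overthinking in LLM Test-Time Compute Scaling

Authors: Shu Zhou, Rui Ling, Junan Chen, Xin Wang, Tao Fan, Hao Wang

Published: 2026-04-12 | Fields: LLM, reasoning, AI, optimization, natural language processing, cost, large language models, efficiency

Note: The title and key points on this page were generated automatically by AI (originally in Japanese). Consult the original paper for the exact content.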

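Key Points

Extending reasoning at inference time has become the mainstream way to improve the reasoning ability of large language models; this paper examines whether that approach actually holds up.
As the compute budget grows, the benefit of each additional reasoning token shrinks, and excessive reasoning is associated with models abandoning answers that were previously correct.
The optimal reasoning length depends on problem difficulty, and a cost-aware evaluation shows that allocating compute uniformly across problems is inefficient.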

Abstract

Scaling test-time compute through extended chains of thought has become a dominant paradigm for improving large language model reasoning. However, existing research implicitly assumes that longer thinking always yields better results. This assumption remains largely unexamined. We systematically investigate how the marginal utility of additional reasoning tokens changes as compute budgets increase. We find that marginal returns diminish substantially at higher budgets and that models exhibit "overthinking", where extended reasoning is associated with abandoning previously correct answers. Furthermore, we show that optimal thinking length varies across problem difficulty, suggesting that uniform compute allocation is suboptimal. Our cost-aware evaluation framework reveals that stopping at moderate budgets can reduce computation significantly while maintaining comparable accuracy.
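
To make "marginal utility" and "stopping at moderate budgets" concrete, here is a minimal sketch of how such quantities could be computed from an accuracy-versus-budget curve. It is not the authors' evaluation framework: the function names, the tolerance-based stopping rule, and every number are hypothetical and chosen only for illustration.

```python
from typing import List, Optional


def marginal_gains(budgets: List[int], accuracies: List[float]) -> List[float]:
    """Accuracy gained per additional 1,000 reasoning tokens between consecutive budgets."""
    gains = []
    for i in range(1, len(budgets)):
        extra_tokens = budgets[i] - budgets[i - 1]
        gains.append((accuracies[i] - accuracies[i - 1]) / extra_tokens * 1000)
    return gains


def cheapest_budget_within(
    budgets: List[int], accuracies: List[float], tolerance: float
) -> Optional[int]:
    """Smallest budget whose accuracy is within `tolerance` of the best observed accuracy.

    Assumes `budgets` is sorted in ascending order. This stopping rule is an
    illustrative assumption, not the rule used in the paper.
    """
    if not accuracies:
        return None
    best = max(accuracies)
    for budget, acc in zip(budgets, accuracies):
        if best - acc <= tolerance:
            return budget
    return None


# Hypothetical numbers, NOT results from the paper: accuracy rises quickly,
# flattens, then dips slightly at the largest budget (an "overthinking" pattern).
budgets = [1000, 2000, 4000, 8000, 16000]
accuracies = [0.50, 0.62, 0.68, 0.70, 0.69]

gains = marginal_gains(budgets, accuracies)
stop_at = cheapest_budget_within(budgets, accuracies, tolerance=0.03)
saved = 1 - stop_at / budgets[-1]  # fraction of compute saved vs. the largest budget

print("marginal gains per 1k tokens:", gains)
print("stop at budget:", stop_at, "| compute saved:", saved)
```

With a curve shaped like this, the per-token gain shrinks at each step and turns negative at the largest budget, which is the pattern the abstract describes. Running the same calculation separately for easy and hard problems would be one way to see whether the stopping budget differs by difficulty.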

Metadata

arXiv ID: 2604.10739
Category: cs.AI