AIDB Daily Papers
LLM goal selection in open-ended tasks differs from that of humans
※ The title and key points are automatically generated by AI. Please check the original paper for accurate details.
Key points
- A task from cognitive science was used to test how similar large language models (LLMs) are to humans in goal selection.
- Four models, including GPT-5 and Gemini 2.5 Pro, diverged substantially from human behavior, tending to fixate on a single solution.
- The results suggest that human goal selection is distinctive, and caution against casually substituting current models for it in applications such as personal assistants.
Abstract
As large language models (LLMs) become integrated into human decision-making, they are increasingly choosing goals autonomously rather than only completing human-defined ones, on the assumption that they will reflect human preferences. However, human-LLM similarity in goal selection remains largely untested. We directly assess the validity of LLMs as proxies for human goal selection in a controlled, open-ended learning task borrowed from cognitive science. Across four state-of-the-art models (GPT-5, Gemini 2.5 Pro, Claude Sonnet 4.5, and Centaur), we find substantial divergence from human behavior. While people gradually explore and learn to achieve goals with diversity across individuals, most models exploit a single identified solution (reward hacking) or show surprisingly low performance, with distinct patterns across models and little variability across instances of the same model. Even Centaur, explicitly trained to emulate humans in experimental settings, poorly captures people's goal selection. Chain-of-thought reasoning and persona steering provide limited improvements. These findings highlight the uniqueness of human goal selection, cautioning against replacing it with current models in applications such as personal assistance, scientific discovery, and policy research.
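The abstract contrasts diversity across human participants with the near-uniform behavior of repeated runs of the same model. The paper's own metric is not given here; as a minimal sketch, assuming goal choices can be tallied per individual or per run, Shannon entropy over the chosen goals is one simple way to quantify that contrast. All names and data below are hypothetical, for illustration only.

```python
from collections import Counter
from math import log2

def goal_entropy(goals: list[str]) -> float:
    """Shannon entropy (in bits) of the distribution of chosen goals.

    Higher entropy means more diversity across individuals or runs;
    zero means every individual/run picked the same goal.
    """
    counts = Counter(goals)
    total = len(goals)
    return -sum((c / total) * log2(c / total) for c in counts.values())

# Hypothetical data, not from the paper: humans spread across many goals,
# while every instance of the model exploits one solution ("reward hacking").
human_goals = ["explore_maze", "collect_keys", "map_rooms", "collect_keys",
               "speedrun_exit", "map_rooms", "explore_maze", "trade_items"]
model_goals = ["collect_keys"] * 8

print(f"human diversity: {goal_entropy(human_goals):.2f} bits")   # > 0 bits
print(f"model diversity: {goal_entropy(model_goals):.2f} bits")   # 0.00 bits
```

Under this toy measure, the paper's finding would appear as high entropy for humans and near-zero entropy across instances of the same model.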