AIDB Daily Papers
Unpacking LLM Hallucinations: PRISM Probes Reasoning, Instruction Following, and Source Memory
Note: The title and key points below were automatically generated by AI. Please consult the original paper for accurate details.
Key Points
- A new benchmark, PRISM, is proposed to diagnose where in the generation pipeline LLM hallucinations arise and why.
- PRISM decomposes hallucinations into four dimensions (knowledge missing, knowledge errors, reasoning errors, and instruction-following errors), grounded in three stages of generation (memory, instruction, and reasoning).
- Evaluating 24 LLMs reveals consistent trade-offs among instruction following, memory retrieval, and logical reasoning, accelerating the understanding of hallucination mechanisms.
Abstract
As large language models (LLMs) evolve from conversational assistants into agents capable of handling complex tasks, they are increasingly deployed in high-risk domains. However, existing benchmarks largely rely on mixed queries and posterior, output-level scoring, which quantifies hallucination severity but offers limited insight into where and why hallucinations arise in the generation pipeline. We therefore reformulate hallucination evaluation as a diagnostic problem and propose PRISM, a controlled benchmark that disentangles hallucinations into four dimensions: knowledge missing, knowledge errors, reasoning errors, and instruction-following errors, grounded in three stages of generation (memory, instruction, and reasoning). PRISM contains 9,448 instances across 65 tasks and supports fine-grained, stage-aware diagnostic evaluation. Evaluating 24 mainstream open-source and proprietary LLMs, we uncover consistent trade-offs across instruction following, memory retrieval, and logical reasoning, showing that mitigation strategies often improve specific dimensions at the expense of others. We hope PRISM provides a framework for understanding the specific mechanisms behind LLM hallucinations, ultimately accelerating the development of trustworthy large language models.
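To make the idea of stage-aware diagnostic evaluation more concrete, here is a minimal sketch of how judged benchmark instances could be organized and aggregated per stage and error dimension. The class names, task names, and aggregation logic below are illustrative assumptions based only on the abstract, not PRISM's actual schema or evaluation code.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Stage(Enum):
    """The three generation stages the abstract grounds the diagnosis in."""
    MEMORY = "memory"
    INSTRUCTION = "instruction"
    REASONING = "reasoning"


class ErrorDimension(Enum):
    """The four hallucination dimensions named in the abstract."""
    KNOWLEDGE_MISSING = "knowledge_missing"
    KNOWLEDGE_ERROR = "knowledge_error"
    REASONING_ERROR = "reasoning_error"
    INSTRUCTION_FOLLOWING_ERROR = "instruction_following_error"


@dataclass
class Instance:
    """One benchmark instance with its stage label and the judged outcome."""
    task: str
    stage: Stage
    error: Optional[ErrorDimension]  # None means the model answered correctly


def stage_aware_report(instances):
    """Aggregate error counts per stage and per error dimension for one model."""
    report = {}
    for inst in instances:
        if inst.error is None:
            continue
        report.setdefault(inst.stage.value, Counter())[inst.error.value] += 1
    return report


# Hypothetical judged instances for a single model (task names are made up).
judged = [
    Instance("date_arithmetic", Stage.REASONING, ErrorDimension.REASONING_ERROR),
    Instance("format_constraint", Stage.INSTRUCTION, ErrorDimension.INSTRUCTION_FOLLOWING_ERROR),
    Instance("entity_recall", Stage.MEMORY, None),
]
print(stage_aware_report(judged))
# {'reasoning': Counter({'reasoning_error': 1}),
#  'instruction': Counter({'instruction_following_error': 1})}
```

A breakdown of this shape is what would let one read off trade-offs between stages, e.g. a model that makes few memory errors but many instruction-following errors.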