AIDB Daily Papers
LLM-Generated Disinformation: Assessing Risk by Human Standards, Beyond Surface-Level Judgments
※ The title and key points are automatically generated by AI. Please refer to the original paper for accurate details.
Key Points
- To assess the risk of large language models (LLMs) being misused, this study examines the validity of substituting LLM judges for human reader responses.
- Unlike humans, LLM judges tend to emphasize logical rigour and penalize emotional expression more harshly, creating a gap with readers' judgment criteria.
- LLM judges agree readily with one another but show low agreement with humans, so internal agreement is not a valid indicator of reader response.
Abstract
Large language models (LLMs) can generate persuasive narratives at scale, raising concerns about their potential use in disinformation campaigns. Assessing this risk ultimately requires understanding how readers receive such content. In practice, however, LLM judges are increasingly used as a low-cost substitute for direct human evaluation, even though whether they faithfully track reader responses remains unclear. We recast evaluation in this setting as a proxy-validity problem and audit LLM judges against human reader responses. Using 290 aligned articles, 2,043 paired human ratings, and outputs from eight frontier judges, we examine judge–human alignment in terms of overall scoring, item-level ordering, and signal dependence. We find persistent judge–human gaps throughout. Relative to humans, judges are typically harsher, recover item-level human rankings only weakly, and rely on different textual signals, placing more weight on logical rigour while penalizing emotional intensity more strongly. At the same time, judges agree far more with one another than with human readers. These results suggest that LLM judges form a coherent evaluative group that is much more aligned internally than it is with human readers, indicating that internal agreement is not evidence of validity as a proxy for reader response.
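The audit described in the abstract rests on two comparisons: how well each judge reproduces item-level human ordering, and how strongly judges agree with one another. The sketch below shows, under stated assumptions, how such comparisons could be computed; the variable names and the synthetic scores are illustrative stand-ins, not the authors' code or data.

```python
# Minimal sketch (not the paper's implementation) of a judge-human alignment audit:
# item-level Spearman correlation between each LLM judge and mean human ratings,
# plus mean pairwise judge-judge correlation for comparison.
from itertools import combinations

import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Hypothetical data: one persuasiveness score per article from averaged human
# ratings, and one from each of several LLM judges (synthetic placeholders here).
n_articles = 290
human_mean = rng.uniform(1, 7, n_articles)
judges = {f"judge_{i}": human_mean + rng.normal(0, 2.0, n_articles)
          for i in range(4)}

# Judge-human alignment: rank correlation captures item-level ordering.
for name, scores in judges.items():
    rho, _ = spearmanr(scores, human_mean)
    print(f"{name} vs humans: rho = {rho:.2f}")

# Internal agreement: mean pairwise correlation among judges. A high value here
# does not, by itself, show validity as a proxy for reader response.
pairwise = [spearmanr(judges[a], judges[b])[0]
            for a, b in combinations(judges, 2)]
print(f"mean judge-judge rho = {np.mean(pairwise):.2f}")
```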