AIDB Daily Papers
Scalable Clinical Validation of LLM-as-a-Judge/Jury for Evaluating Response Safety Toward Patients with Psychosis
Note: The title and key points are automatically generated by AI. Please refer to the original paper for accurate details.
Key Points
- Focuses on the potential of large language models to adversely affect patients with psychiatric disorders, and argues for the need for safety evaluation.
- Validates an approach that uses LLMs as evaluators to overcome the lack of clinical validity and scalability in existing assessments.
- LLM judges show high agreement with human judgments, a promising result for LLM safety evaluation in the mental health domain.
Abstract
General-purpose Large Language Models (LLMs) are becoming widely adopted by people for mental health support. Yet emerging evidence suggests there are significant risks associated with high-frequency use, particularly for individuals suffering from psychosis, as LLMs may reinforce delusions and hallucinations. Existing evaluations of LLMs in mental health contexts are limited by a lack of clinical validation and scalability of assessment. To address these issues, this research focuses on psychosis as a critical condition for LLM safety evaluation by (1) developing and validating seven clinician-informed safety criteria, (2) constructing a human-consensus dataset, and (3) testing automated assessment using an LLM as an evaluator (LLM-as-a-Judge) or taking the majority vote of several LLM judges (LLM-as-a-Jury). Results indicate that LLM-as-a-Judge aligns closely with the human consensus (Cohen's $\kappa_{\text{human} \times \text{gemini}} = 0.75$, $\kappa_{\text{human} \times \text{qwen}} = 0.68$, $\kappa_{\text{human} \times \text{kimi}} = 0.56$) and that the best judge slightly outperforms LLM-as-a-Jury (Cohen's $\kappa_{\text{human} \times \text{jury}} = 0.74$). Overall, these findings have promising implications for clinically grounded, scalable methods in LLM safety evaluations for mental health contexts.
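To make the evaluation scheme concrete, below is a minimal sketch (not from the paper) of how single-judge agreement and LLM-as-a-Jury majority voting could be scored against a human consensus using Cohen's κ. The binary safe/unsafe verdicts and the `jury_vote` helper are illustrative assumptions; `scikit-learn` is used only for the κ computation.

```python
from collections import Counter
from sklearn.metrics import cohen_kappa_score

# Hypothetical per-response verdicts (1 = safe, 0 = unsafe) from three LLM judges.
# Judge names mirror the models in the abstract; the labels are made up for illustration.
judge_verdicts = {
    "gemini": [1, 0, 1, 1, 0, 1],
    "qwen":   [1, 0, 0, 1, 0, 1],
    "kimi":   [1, 1, 0, 1, 0, 0],
}
human_consensus = [1, 0, 1, 1, 0, 1]  # clinician-derived ground truth (illustrative)

def jury_vote(verdicts: dict[str, list[int]]) -> list[int]:
    """LLM-as-a-Jury: majority vote across judges for each response."""
    per_item = zip(*verdicts.values())  # one tuple of votes per response
    return [Counter(votes).most_common(1)[0][0] for votes in per_item]

# LLM-as-a-Judge: agreement of each single judge with the human consensus.
for name, verdicts in judge_verdicts.items():
    kappa = cohen_kappa_score(human_consensus, verdicts)
    print(f"kappa(human x {name}) = {kappa:.2f}")

# LLM-as-a-Jury: agreement of the majority-vote verdicts with the human consensus.
jury = jury_vote(judge_verdicts)
print(f"kappa(human x jury) = {cohen_kappa_score(human_consensus, jury):.2f}")
```

With an odd number of judges and binary verdicts, the majority vote is always well defined; Cohen's κ is preferred over raw accuracy here because it corrects for chance agreement on imbalanced safe/unsafe label distributions.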