AIDB Daily Papers

Scientific Paper Peer Review with LLMs: Methods, Benchmarks, and Reliability Challenges

Original title: LLM-Based Scientific Peer Review: Methods, Benchmarks, and Reliability Challenges

Authors: Thi Huyen Nguyen, Zahra Ahmadi

Published: 2026-06-23 | Fields: LLM, natural language processing, cs.CL, reliability, AI assistance, AI evaluation

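Note: The Japanese-language title and key points on the source page were generated automatically by AI. Refer to the original paper for the accurate content.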

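Key Points

This study presents a systems-level analysis of LLM-based automated peer review of scientific papers, covering methods, benchmarks, and reliability challenges.
LLMs may help make peer review more efficient, but their reliability, robustness, and security are not yet well understood, and they introduce new risks.
The study frames LLM-based peer review as a decision problem and presents a roadmap for developing robust, transparent AI-assisted evaluation systems.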

Abstract

The rapid growth of scientific submissions has pushed traditional peer review toward its scalability limits, motivating the exploration of large language models (LLMs) as intelligent automated evaluation assistants. Although recent studies show that LLMs can generate fluent critiques and approximate reviewer scores, their reliability, robustness, and security as decision-support systems remain insufficiently understood. This survey offers a systems-level analysis of LLM-based scientific peer review, focusing on two core evaluative functions: critique generation and score prediction. We present a structured taxonomy of modeling approaches (including prompt-based, supervised, retrieval-augmented, and alignment-optimized approaches), and synthesize empirical findings across existing benchmarks. We analyze dataset constraints, evaluation shortcomings, and domain concentration biases that limit current assessment practices. Beyond performance metrics, we identify emerging robustness risks, including prompt injection, data poisoning, retrieval vulnerabilities, and reward hacking, which expose automated review pipelines to strategic manipulation. From a data mining perspective, we outline key open challenges in modeling subjective disagreement and cross-domain generalization. By reframing automated peer review as a high-stakes, multi-objective decision problem, this survey provides a roadmap for developing robust, transparent, and trustworthy AI-assisted scientific evaluation systems.

Metadata

arXiv ID: 2606.25057
Category: cs.CL
