AIDB Daily Papers
A Large-Scale AI-Assisted Peer Review Experiment at AAAI-26: How Does AI Change Peer Review?
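Note: The title and key points on this page were generated automatically by AI (originally in Japanese). Please consult the original paper for the exact content.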
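Key points
- Every main-track submission to AAAI-26 received a review from a state-of-the-art AI system, in a large-scale field experiment on AI-assisted peer review.
- Survey participants preferred AI reviews to human reviews on dimensions such as technical accuracy and research suggestions, which points to potential gains in review quality and efficiency.
- The AI system substantially outperformed a simple LLM-generated review baseline at detecting weaknesses in papers, demonstrating a practical contribution.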
Abstract
Scientific peer review faces mounting strain as submission volumes surge, making it increasingly difficult to sustain review quality, consistency, and timeliness. Recent advances in AI have led the community to consider its use in peer review, yet a key unresolved question is whether AI can generate technically sound reviews at real-world conference scale. Here we report the first large-scale field deployment of AI-assisted peer review: every main-track submission at AAAI-26 received one clearly identified AI review from a state-of-the-art system. The system combined frontier models, tool use, and safeguards in a multi-stage process to generate reviews for all 22,977 full-review papers in less than a day. A large-scale survey of AAAI-26 authors and program committee members showed that participants not only found AI reviews useful, but actually preferred them to human reviews on key dimensions such as technical accuracy and research suggestions. We also introduce a novel benchmark and find that our system substantially outperforms a simple LLM-generated review baseline at detecting a variety of scientific weaknesses. Together, these results show that state-of-the-art AI methods can already make meaningful contributions to scientific peer review at conference scale, opening a path toward the next generation of synergistic human-AI teaming for evaluating research.