AIDB Daily Papers
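HiRAS: Automating the Path from Paper to Code Generation and Execution with Hierarchical Multi-Agents
Note: The title and key points on this page were generated automatically by AI (originally in Japanese). Please check the original paper for the exact content.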
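Key Points
- Proposes HiRAS, which automates the path from paper to code generation and execution using hierarchical multi-agents.
- Overcomes weaknesses of existing methods, improving robustness and performance in automated research reproduction.
- Refines the evaluation protocol of the Paper2Code benchmark, and confirms performance above existing methods along with reduced hallucination.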
Abstract
Recent advances in large language models have highlighted their potential to automate computational research, particularly reproducing experimental results. However, existing approaches still use fixed sequential agent pipelines with weak global coordination, which limits their robustness and overall performance. In this work, we propose Hierarchical Research Agent System (HiRAS), a hierarchical multi-agent framework for end-to-end experiment reproduction that employs supervisory manager agents to coordinate specialised agents across fine-grained stages. We also identify limitations in the reference-free evaluation of the Paper2Code benchmark and introduce Paper2Code-Extra (P2C-Ex), a refined protocol that incorporates repository-level information and better aligns with the original reference-based metric. We conduct extensive evaluation, validating the effectiveness and robustness of our proposed methods, and observing improvements, including >10% relative performance gain beyond the previous state-of-the-art using open-source backbone models and significantly reduced hallucination in evaluation. Our work is available on GitHub: https://github.com/KOU-199024/HiRAS.
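To make the idea of supervisory managers coordinating specialised agents more concrete, here is a minimal Python sketch. It is not the HiRAS implementation: the `Manager` class, the three toy specialists, the `required_key` acceptance check, and the retry limit are all hypothetical names and behaviours chosen for illustration, and the real agents would call a language model rather than format strings. The only point it illustrates is that a manager shares the interface of a specialist, so managers can be nested under a root manager instead of being chained in a fixed pipeline.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical interface: an agent maps the shared state to an updated state.
Agent = Callable[[Dict[str, str]], Dict[str, str]]


@dataclass
class Manager:
    """Illustrative supervisor: runs its sub-agents, then reviews the result."""
    name: str
    agents: List[Agent]
    required_key: str          # assumed acceptance check: this key must be filled
    max_rounds: int = 2        # assumed retry budget
    log: List[str] = field(default_factory=list)

    def run(self, state: Dict[str, str]) -> Dict[str, str]:
        for round_no in range(1, self.max_rounds + 1):
            for agent in self.agents:
                state = agent(state)
            if state.get(self.required_key):
                self.log.append(f"{self.name}: accepted in round {round_no}")
                return state
            self.log.append(f"{self.name}: round {round_no} rejected")
        raise RuntimeError(f"{self.name} failed after {self.max_rounds} rounds")


# Toy specialists standing in for LLM-backed agents.
def plan_agent(state: Dict[str, str]) -> Dict[str, str]:
    return {**state, "plan": f"steps for: {state['paper']}"}


def code_agent(state: Dict[str, str]) -> Dict[str, str]:
    return {**state, "code": f"# implements {state['plan']}"}


def exec_agent(state: Dict[str, str]) -> Dict[str, str]:
    return {**state, "result": "ran " + state["code"]}


def reproduce(paper: str) -> Dict[str, str]:
    """Build a two-level hierarchy: a root manager over three stage managers."""
    planning = Manager("planning", [plan_agent], "plan")
    coding = Manager("coding", [code_agent], "code")
    execution = Manager("execution", [exec_agent], "result")
    # Manager.run has the Agent signature, so managers nest under the root.
    root = Manager("root", [planning.run, coding.run, execution.run], "result")
    return root.run({"paper": paper})


if __name__ == "__main__":
    print(reproduce("toy paper")["result"])
```

Because each manager reviews its own stage before handing control back, a failure surfaces at the level where it occurred; in this sketch that is simply a `RuntimeError` once the retry budget is exhausted.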