AIDB Daily Papers
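FutureWorld: A Live Environment for Training Predictive Agents with Real-World Rewards
Note: The Japanese title and key points were automatically generated by AI. Please refer to the original paper for the accurate content.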
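Key Points
- Proposes FutureWorld, a live environment for training agents that predict real-world events and learn from the outcomes.
- Treating prediction tasks as a unified learning environment makes it possible to supply a large number of prediction questions grounded in diverse real-world events while preventing answer leakage.
- Three open-source models were trained in FutureWorld and the training was found to be effective; performance baselines for frontier agents were also established.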
Abstract
Live future prediction refers to the task of making predictions about real-world events before they unfold. This task is increasingly studied using large language model-based agent systems, and it is important for building agents that can continually learn from the real world. Just as interactive environments have often driven progress in agents, advancing live future prediction naturally motivates viewing it as a learning environment. Prior works have explored future prediction from several different angles, but have generally not framed it as a unified learning environment. This task is appealing for learning because it can provide a large number of prediction questions grounded in diverse real-world events, while preventing answer leakage. To leverage the advantages of live future prediction, we present FutureWorld, a live agentic reinforcement learning environment that closes the training loop between prediction, outcome realization, and parameter updates. In our environment, we take three open-source base models and train them for consecutive days. The results show that training is effective. Furthermore, we build a daily benchmark based on the environment and evaluate several frontier agents on it to establish performance baselines for current agent systems.
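To make the loop of prediction, outcome realization, and parameter update more concrete, here is a minimal Python sketch of the first two stages: recording forecasts and turning resolved outcomes into rewards. It is not taken from the paper. The `Question` structure, the `collect_rewards` helper, and the Brier-style reward are all assumptions chosen for illustration; the abstract does not say which reward FutureWorld actually uses.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Question:
    """A binary question about a future event (hypothetical structure)."""
    text: str
    forecast: Optional[float] = None  # agent's probability that the event occurs
    outcome: Optional[bool] = None    # set only after the event resolves


def brier_reward(forecast: float, outcome: bool) -> float:
    """Assumed reward: 1 minus the squared error between forecast and outcome."""
    target = 1.0 if outcome else 0.0
    return 1.0 - (forecast - target) ** 2


def collect_rewards(questions: List[Question]) -> List[Tuple[Question, float]]:
    """Pair each forecast-and-resolved question with its reward.

    Unresolved questions are skipped: their answers do not exist yet,
    which is why a live setup cannot leak them into training.
    """
    return [
        (q, brier_reward(q.forecast, q.outcome))
        for q in questions
        if q.forecast is not None and q.outcome is not None
    ]
```

In a full system, the pairs returned by `collect_rewards` would feed a reinforcement learning update of the agent's parameters. That step is omitted here because it depends on the training algorithm, which the abstract does not specify.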