AIDB Daily Papers
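Agentic Knowledge Tracing: A Multi-Agent LLM Architecture for Stealth Assessment of Financial Literacy in Games
Note: The title and key points below were generated automatically by AI (originally in Japanese). Please consult the original paper for the exact content.
Key Points
- Proposes a multi-agent LLM architecture that analyzes player behavior logs captured during gameplay to perform stealth assessment of financial literacy.
- Addresses the problem that conventional assessment methods disrupt the learning experience, offering a new approach that measures competency without interrupting gameplay.
- In an evaluation with 193 K-12 participants, the proposed method showed a significant correlation with learning gain and roughly tripled the predictive validity of a single-LLM baseline.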
Abstract
Assessing financial literacy during gameplay without disrupting the learning experience remains a key challenge in serious games for education. We present the Agentic BKT pipeline, a multi-agent large language model architecture for stealth assessment of financial competencies from open-ended gameplay events. The pipeline processes events from a 2D platformer serious game aligned with the OECD/INFE financial literacy framework through four phases: (1) the game captures every player decision as a structured event log; (2) an LLM event classifier labels each action on a four-point rubric validated against three domain experts (Fleiss' kappa = 0.624, substantial agreement); (3) four domain-specific agents specializing in risk mitigation, investing, spending, and credit management perform session-level reasoning over behavioral trajectories, feeding per-competency Bayesian Knowledge Tracing that estimates mastery within each domain; and (4) an expert judge agent synthesizes the domain-level estimates into an overall mastery score. Evaluated with 193 K-12 participants across 264 game sessions, the Agentic BKT pipeline yields mastery estimates significantly correlated with learning gain (r = 0.276, p = 0.0001) and post-test scores (r = 0.333, p < 0.0001) while showing no correlation with pre-test scores, providing both convergent and discriminant validity. The multi-agent approach approximately triples the predictive validity of a single-LLM baseline (r = 0.095, not significant) in this study, demonstrating that domain decomposition and session-level reasoning play a central role in capturing the multidimensional nature of financial literacy from gameplay.
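The per-competency Bayesian Knowledge Tracing step in phase (3) can be illustrated with the standard textbook BKT update. This is a minimal sketch, not the paper's implementation: the parameter values, the names `BKTParams`, `bkt_update`, and `trace_domain`, the assumption that rubric scores run from 1 to 4, and the threshold used to binarize them are all hypothetical, since the abstract does not specify how rubric labels feed the tracer.

```python
from dataclasses import dataclass


@dataclass
class BKTParams:
    # All values are illustrative assumptions, not taken from the paper.
    p_init: float = 0.2    # P(L0): prior probability of mastery
    p_learn: float = 0.15  # P(T): chance of learning after each opportunity
    p_slip: float = 0.1    # P(S): chance a master answers incorrectly
    p_guess: float = 0.2   # P(G): chance a non-master answers correctly


def bkt_update(p_mastery: float, correct: bool, params: BKTParams) -> float:
    """One standard BKT step: Bayesian posterior, then learning transition."""
    if correct:
        num = p_mastery * (1 - params.p_slip)
        den = num + (1 - p_mastery) * params.p_guess
    else:
        num = p_mastery * params.p_slip
        den = num + (1 - p_mastery) * (1 - params.p_guess)
    posterior = num / den
    # The learner may also acquire the skill during this opportunity.
    return posterior + (1 - posterior) * params.p_learn


def trace_domain(rubric_scores: list[int], params: BKTParams,
                 threshold: int = 3) -> float:
    """Run BKT over one domain's rubric scores (assumed to lie in 1..4).

    Treating a score >= threshold as a 'correct' observation is a
    hypothetical binarization chosen only for this sketch.
    """
    p = params.p_init
    for score in rubric_scores:
        p = bkt_update(p, score >= threshold, params)
    return p


if __name__ == "__main__":
    params = BKTParams()
    # Hypothetical rubric labels for the four domains named in the abstract.
    sessions = {
        "risk_mitigation": [2, 3, 4, 4],
        "investing": [1, 2, 2, 3],
        "spending": [4, 4, 3, 4],
        "credit_management": [1, 1, 2, 1],
    }
    for domain, scores in sessions.items():
        print(f"{domain}: {trace_domain(scores, params):.3f}")
```

In the described pipeline, an expert judge agent would then combine the four domain-level estimates into an overall mastery score; that synthesis is LLM-based and is not reproduced here.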