AIDB Daily Papers
ATANT: A Framework for Evaluating Continuity in AI
Note: The title and key points were generated automatically by AI. Refer to the original paper for the exact content.
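Key Points
- Proposes ATANT, a framework for evaluating continuity in AI systems, which measures the ability to persist, update, disambiguate, and reconstruct context over time.
- Existing AI systems have memory components, but no published framework defines or measures genuine continuity; ATANT runs its evaluation without an LLM in the loop.
- Under ATANT, the reference implementation improved from 58% to 100% in isolated mode and reached 96% accuracy when 250 distinct narratives coexist in a single database.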
Abstract
We present ATANT (Automated Test for Acceptance of Narrative Truth), an open evaluation framework for measuring continuity in AI systems: the ability to persist, update, disambiguate, and reconstruct meaningful context across time. While the AI industry has produced memory components (RAG pipelines, vector databases, long context windows, profile layers), no published framework formally defines or measures whether these components produce genuine continuity. We define continuity as a system property with 7 required properties, introduce a 10-checkpoint evaluation methodology that operates without an LLM in the evaluation loop, and present a narrative test corpus of 250 stories comprising 1,835 verification questions across 6 life domains. We evaluate a reference implementation across 5 test suite iterations, progressing from 58% (legacy architecture) to 100% in isolated mode (250 stories) and 100% in 50-story cumulative mode, with 96% at 250-story cumulative scale. The cumulative result is the primary measure: when 250 distinct life narratives coexist in the same database, the system must retrieve the correct fact for the correct context without cross-contamination. ATANT is system-agnostic, model-independent, and designed as a sequenced methodology for building and validating continuity systems. The framework specification, example stories, and evaluation protocol are available at https://github.com/Kenotic-Labs/ATANT. The full 250-story corpus will be released incrementally.
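The difference between isolated and cumulative mode can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two toy stores, the story tuples, and the exact-match scoring are hypothetical and are not taken from the ATANT specification or its 10 checkpoints. The only ideas borrowed from the abstract are that scoring runs without an LLM in the loop and that cumulative mode places all narratives in one database.

```python
from typing import Callable, Optional

# Hypothetical story format: (story_id, {question: expected_answer}).
Story = tuple[str, dict[str, str]]


class ContextKeyedStore:
    """Toy memory that keeps each fact tied to the story it came from."""

    def __init__(self) -> None:
        self._facts: dict[tuple[str, str], str] = {}

    def ingest(self, story_id: str, facts: dict[str, str]) -> None:
        for question, answer in facts.items():
            self._facts[(story_id, question)] = answer

    def answer(self, story_id: str, question: str) -> Optional[str]:
        return self._facts.get((story_id, question))


class ContextBlindStore:
    """Toy memory that ignores the story, so later stories overwrite earlier ones."""

    def __init__(self) -> None:
        self._facts: dict[str, str] = {}

    def ingest(self, story_id: str, facts: dict[str, str]) -> None:
        self._facts.update(facts)

    def answer(self, story_id: str, question: str) -> Optional[str]:
        return self._facts.get(question)


def count_correct(store, story: Story) -> int:
    # Deterministic exact-match check: no LLM judges the answer.
    story_id, facts = story
    return sum(store.answer(story_id, q) == a for q, a in facts.items())


def isolated_accuracy(make_store: Callable[[], object], stories: list[Story]) -> float:
    # Isolated mode: every story gets a fresh, empty store.
    correct = 0
    for story in stories:
        store = make_store()
        store.ingest(*story)
        correct += count_correct(store, story)
    return correct / sum(len(facts) for _, facts in stories)


def cumulative_accuracy(make_store: Callable[[], object], stories: list[Story]) -> float:
    # Cumulative mode: all stories coexist in one store before any question is asked.
    store = make_store()
    for story in stories:
        store.ingest(*story)
    correct = sum(count_correct(store, story) for story in stories)
    return correct / sum(len(facts) for _, facts in stories)


STORIES: list[Story] = [
    ("story-a", {"pet name": "Miso", "home city": "Lisbon"}),
    ("story-b", {"pet name": "Rex", "home city": "Lisbon"}),
]

if __name__ == "__main__":
    for store_cls in (ContextKeyedStore, ContextBlindStore):
        print(
            store_cls.__name__,
            isolated_accuracy(store_cls, STORIES),
            cumulative_accuracy(store_cls, STORIES),
        )
```

In this toy setup both stores are perfect in isolated mode, but the context-blind store drops to 0.75 in cumulative mode because "pet name" from the second story overwrites the first. That is the kind of cross-contamination the abstract says the cumulative result is meant to expose.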