AIDB Daily Papers

MemMachine：パーソナライズされたAIエージェントのための真実性を維持する記憶システム

原題: MemMachine: A Ground-Truth-Preserving Memory System for Personalized AI Agents

著者: Shu Wang, Edwin Yu, Oscar Love, Tom Zhang, Tom Wong, Steve Scargall, Charles Fan

公開日: 2026-04-06 | 分野: LLM AI 検索エージェント知識情報検索最適化パーソナライズメモリ自然言語処理

※ 日本語タイトル・ポイントはAIによる自動生成です。正確な内容は原論文をご確認ください。

ポイント

MemMachineは、LLMエージェント向けに、会話エピソード全体を保存する新しい記憶システムを提案した。
従来のRAGの課題を克服し、文脈に沿った検索で関連情報を効率的に抽出し、精度と効率を両立する点が新しい。
各種ベンチマークで高い精度と効率を示し、特にGPT-5-miniと最適化プロンプトの組み合わせが最も費用対効果が高いことを発見した。

Abstract

Large Language Model (LLM) agents require persistent memory to maintain personalization, factual continuity, and long-horizon reasoning, yet standard context-window and retrieval-augmented generation (RAG) pipelines degrade over multi-session interactions. We present MemMachine, an open-source memory system that integrates short-term, long-term episodic, and profile memory within a ground-truth-preserving architecture that stores entire conversational episodes and reduces lossy LLM-based extraction. MemMachine uses contextualized retrieval that expands nucleus matches with surrounding context, improving recall when relevant evidence spans multiple dialogue turns. Across benchmarks, MemMachine achieves strong accuracy-efficiency tradeoffs: on LoCoMo it reaches 0.9169 using gpt4.1-mini; on LongMemEvalS (ICLR 2025), a six-dimension ablation yields 93.0 percent accuracy, with retrieval-stage optimizations -- retrieval depth tuning (+4.2 percent), context formatting (+2.0 percent), search prompt design (+1.8 percent), and query bias correction (+1.4 percent) -- outperforming ingestion-stage gains such as sentence chunking (+0.8 percent). GPT-5-mini exceeds GPT-5 by 2.6 percent when paired with optimized prompts, making it the most cost-efficient setup. Compared to Mem0, MemMachine uses roughly 80 percent fewer input tokens under matched conditions. A companion Retrieval Agent adaptively routes queries among direct retrieval, parallel decomposition, or iterative chain-of-query strategies, achieving 93.2 percent on HotpotQA-hard and 92.6 percent on WikiMultiHop under randomized-noise conditions. These results show that preserving episodic ground truth while layering adaptive retrieval yields robust, efficient long-term memory for personalized LLM agents.

Paper AI Chat

この論文のPDF全文を対象にAIに質問できます。

質問の例:

AIチャット機能を利用するには、ログインまたは会員登録（無料）が必要です。

会員登録 / ログイン

💬 ディスカッション

ディスカッションに参加するにはログインが必要です。

ログイン / アカウント作成 →

arxivで読む PDFを開く

メタ情報

arxiv ID: 2604.04853
カテゴリ: cs.AI

ポイント

Abstract

Paper AI Chat

💬 ディスカッション

関連するAIDB記事

メタ情報