AIDB Daily Papers
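TFlow: A New Weight-Update-Based Communication Method That Dramatically Streamlines Collaboration Between AI Agents
Note: The Japanese title and key points were generated automatically by AI. Please refer to the original paper for the exact content.
Key Points
- This study proposes TFlow, a new communication interface for collaboration between AI agents that replaces conventional text exchange.
- TFlow converts the agents' hidden states into transient weight perturbations, which avoids serializing messages into tokens and thereby cuts compute cost and memory use.
- In experiments with three Qwen3-4B agents, TFlow improved accuracy over a standalone receiver by up to 8.5 points across five benchmarks, and cut total processed tokens by up to 83.27% and wall-clock inference time by up to 4.6x compared with a text-based three-agent baseline.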
Abstract
Multi-agent LLM systems usually collaborate by exchanging natural-language messages. This interface is simple and interpretable, but it forces each sender's intermediate computation to be serialized into tokens and then reprocessed by the receiver, thereby increasing the generated-token cost, prefill overhead, and KV-cache memory. We study an alternative communication interface: instead of appending a sender's message to the receiver's context, compile the sender's hidden states into a transient, receiver-specific weight perturbation. We introduce TFlow (Thought Flow), a weight-space communication framework for a known and fixed receiver architecture. For each query, frozen role-prompted sender agents process the input, and a learned parameter generator maps their internal activations into low-rank LoRA perturbations targeting the receiver's modules. These perturbations are fused and applied only during the receiver's generation, enabling instance-level adaptation without permanently changing the model or enlarging the receiver's text context. With three Qwen3-4B agents, TFlow improves over a standalone receiver by up to 8.5 accuracy points across five benchmarks while reducing processed tokens by up to 32.69%. Compared with a text-based three-agent baseline, it reduces total processed tokens by up to 83.27% and the wall-clock inference time by up to 4.6$\times$, while maintaining competitive accuracy on four of five benchmarks. These results suggest that transient low-rank weight perturbations can serve as an executable communication medium for efficient multi-agent LLM collaboration.
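The mechanism described in the abstract (sender hidden states mapped to low-rank perturbations, fused, and applied only while the receiver generates) can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the generator below is a fixed random linear map standing in for the learned parameter generator, fusion is a plain average (the abstract does not state the fusion rule), the use of two senders is an assumption, and every name and dimension is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions, chosen only for illustration.
D_HIDDEN = 16        # size of a sender's hidden state
D_IN, D_OUT = 8, 8   # shape of one receiver weight matrix
RANK = 2             # rank of each per-sender perturbation

# Frozen receiver weight, standing in for one targeted module.
W_receiver = rng.standard_normal((D_OUT, D_IN))
W_before = W_receiver.copy()  # kept to check the weight is never modified

# Stand-in for the learned parameter generator (assumption: two linear maps).
G_a = rng.standard_normal((D_HIDDEN, RANK * D_IN)) * 0.1
G_b = rng.standard_normal((D_HIDDEN, D_OUT * RANK)) * 0.1


def generate_lora(hidden_state):
    """Map one sender hidden state to LoRA factors A (RANK x D_IN), B (D_OUT x RANK)."""
    A = (hidden_state @ G_a).reshape(RANK, D_IN)
    B = (hidden_state @ G_b).reshape(D_OUT, RANK)
    return A, B


def fuse(factors):
    """Fuse per-sender perturbations. Assumption: average of the B @ A products."""
    return sum(B @ A for A, B in factors) / len(factors)


def receiver_forward(x, delta=None):
    """Apply the receiver weight, optionally with a transient perturbation.

    W_receiver + delta builds a new array, so the stored weight is untouched.
    """
    W = W_receiver if delta is None else W_receiver + delta
    return W @ x


# Two senders "process the query"; random vectors stand in for their activations.
sender_hidden_states = [rng.standard_normal(D_HIDDEN) for _ in range(2)]
delta = fuse([generate_lora(h) for h in sender_hidden_states])

x = rng.standard_normal(D_IN)
y_plain = receiver_forward(x)           # standalone receiver
y_adapted = receiver_forward(x, delta)  # receiver with the per-query perturbation

# The frozen weight is unchanged after the adapted forward pass.
assert np.array_equal(W_receiver, W_before)
```

In this sketch the senders' contribution reaches the receiver as a `D_OUT x D_IN` matrix of rank at most `2 * RANK` rather than as extra input tokens, which is the property the abstract credits for the reduction in processed tokens and KV-cache memory.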