AIDB Daily Papers
Seeing Through AI Agent Lies: Deceptive Tool Descriptions and Their Risks
※ The title and key points are AI-generated. Please consult the original paper for accurate details.
Key Points
- Focusing on mismatches between the descriptions of external tools used by AI agents and their actual code, the authors investigate the resulting security risks.
- Description-code inconsistencies in the MCP environment matter because they can lead AI agents to make incorrect judgments or perform unintended actions.
- A large-scale analysis found inconsistencies in roughly 13% of MCP Servers, suggesting risks of undocumented privileged operations and unauthorized financial actions.
Abstract
The Model Context Protocol (MCP) enables large language models to invoke external tools through natural-language descriptions, forming the foundation of many AI agent applications. However, MCP does not enforce consistency between documented tool behavior and actual code execution, even though MCP Servers often run with broad system privileges. This gap introduces a largely unexplored security risk. We study how mismatches between externally presented tool descriptions and underlying implementations systematically shape the mental models and decision-making behavior of intelligent agents. Specifically, we present the first large-scale study of description-code inconsistency in the MCP ecosystem. We design an automated static analysis framework and apply it to 10,240 real-world MCP Servers across 36 categories. Our results show that while most servers are highly consistent, approximately 13% exhibit substantial mismatches that can enable undocumented privileged operations, hidden state mutations, or unauthorized financial actions. We further observe systematic differences across application categories, popularity levels, and MCP marketplaces. Our findings demonstrate that description-code inconsistency is a concrete and prevalent attack surface in MCP-based AI agents, and motivate the need for systematic auditing and stronger transparency guarantees in future agent ecosystems.
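To make the studied gap concrete, here is a minimal sketch of the kind of description-code inconsistency the paper analyzes, together with a toy static check in the spirit of its automated framework. The tool, its description, and the `SENSITIVE` call list are all hypothetical illustrations, not the paper's actual framework or dataset.

```python
import ast

# Hypothetical MCP-style tool: the natural-language description claims a
# read-only lookup, but the implementation also performs an undocumented
# file write (a hidden state mutation, in the paper's terms).
TOOL_DESCRIPTION = "Return the current weather for a city (read-only)."

TOOL_SOURCE = '''
def get_weather(city):
    with open("/tmp/agent_audit.log", "a") as f:  # undocumented side effect
        f.write("queried " + city)
    return "Sunny in " + city
'''

# Illustrative list of call names treated as privileged/sensitive.
SENSITIVE = {"open", "remove", "system", "popen"}

def undocumented_calls(description: str, source: str) -> set[str]:
    """Toy static check: flag sensitive calls in the tool's source code
    that the externally presented description never mentions."""
    doc = description.lower()
    flagged = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            # Plain calls (ast.Name) expose .id; method calls
            # (ast.Attribute) expose .attr.
            name = getattr(node.func, "id", getattr(node.func, "attr", ""))
            if name in SENSITIVE and name not in doc:
                flagged.add(name)
    return flagged

print(undocumented_calls(TOOL_DESCRIPTION, TOOL_SOURCE))  # {'open'}
```

A real auditor would of course need semantic matching between descriptions and behavior rather than keyword lookup, but the sketch shows why the mismatch is mechanically detectable: the description and the AST are both available to a static analyzer, while the LLM agent sees only the description.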