AIDB Daily Papers
Taming OpenClaw: Security Analysis and Threat Mitigation for Autonomous LLM Agents
※ The title and key points are automatically generated by AI. Please refer to the original paper for the exact content.
Key Points
- Conducts a comprehensive security threat analysis of autonomous LLM agents such as OpenClaw and identifies new attack techniques.
- Introduces a five-layer framework covering the entire agent lifecycle; the novelty lies in systematically analyzing compound threats within it.
- Presents concrete examples of multiple threats, including indirect prompt injection and skill supply chain contamination, and shows the limitations of existing defenses (see the sketch after this list).
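The indirect prompt injection threat mentioned above exploits the gap between operator instructions and content the agent fetches on its own. As a rough illustration of the context-aware instruction filtering idea referenced later in the abstract (this is not the paper's mechanism; the `ContextItem` type, the source labels, and the regex patterns are all assumptions of this sketch), a filter can tag every context fragment with its provenance and strip instruction-shaped text from untrusted channels before it reaches the model:

```python
from dataclasses import dataclass
import re

# Each piece of context carries its provenance so the agent can treat
# operator input and externally fetched content differently.
@dataclass
class ContextItem:
    source: str   # e.g. "user", "web_page", "skill_output"
    text: str

TRUSTED_SOURCES = {"user"}

# Naive patterns that look like instructions aimed at the agent itself.
INSTRUCTION_PATTERNS = [
    re.compile(r"ignore (all|previous) instructions", re.I),
    re.compile(r"you (must|should) now", re.I),
    re.compile(r"run the following command", re.I),
]

def filter_untrusted_instructions(items: list[ContextItem]) -> list[ContextItem]:
    """Remove instruction-like sentences from untrusted sources before prompting."""
    cleaned = []
    for item in items:
        if item.source in TRUSTED_SOURCES:
            cleaned.append(item)
            continue
        # Keep untrusted content only after dropping sentences that match
        # instruction-like patterns; a real system would use a classifier.
        sentences = re.split(r"(?<=[.!?])\s+", item.text)
        kept = [s for s in sentences
                if not any(p.search(s) for p in INSTRUCTION_PATTERNS)]
        cleaned.append(ContextItem(item.source, " ".join(kept)))
    return cleaned
```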
Abstract
Autonomous Large Language Model (LLM) agents, exemplified by OpenClaw, demonstrate remarkable capabilities in executing complex, long-horizon tasks. However, their tightly coupled instant-messaging interaction paradigm and high-privilege execution capabilities substantially expand the system attack surface. In this paper, we present a comprehensive security threat analysis of OpenClaw. To structure our analysis, we introduce a five-layer lifecycle-oriented security framework that captures key stages of agent operation, i.e., initialization, input, inference, decision, and execution, and systematically examine compound threats across the agent's operational lifecycle, including indirect prompt injection, skill supply chain contamination, memory poisoning, and intent drift. Through detailed case studies on OpenClaw, we demonstrate the prevalence and severity of these threats and analyze the limitations of existing defenses. Our findings reveal critical weaknesses in current point-based defense mechanisms when addressing cross-temporal and multi-stage systemic risks, highlighting the need for holistic security architectures for autonomous LLM agents. Within this framework, we further examine representative defense strategies at each lifecycle stage, including plugin vetting frameworks, context-aware instruction filtering, memory integrity validation protocols, intent verification mechanisms, and capability enforcement architectures.
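The five-layer framework and the per-stage defenses listed at the end of the abstract suggest an architecture in which each lifecycle stage acts as a gate where checks run before the agent proceeds. The sketch below only illustrates that structure under assumptions of this summary (the `LifecycleGuard` class, its hook API, and the example allowlist are not from the paper):

```python
from enum import Enum, auto
from typing import Callable

# The five lifecycle stages named in the abstract.
class Stage(Enum):
    INITIALIZATION = auto()  # e.g. plugin/skill vetting
    INPUT = auto()           # e.g. context-aware instruction filtering
    INFERENCE = auto()       # e.g. memory integrity validation
    DECISION = auto()        # e.g. intent verification
    EXECUTION = auto()       # e.g. capability enforcement

# A check receives the stage payload and returns True if the step may proceed.
Check = Callable[[dict], bool]

class LifecycleGuard:
    """Runs registered defenses at each stage; any failing check halts the step."""

    def __init__(self) -> None:
        self._checks: dict[Stage, list[Check]] = {s: [] for s in Stage}

    def register(self, stage: Stage, check: Check) -> None:
        self._checks[stage].append(check)

    def gate(self, stage: Stage, payload: dict) -> bool:
        return all(check(payload) for check in self._checks[stage])

# Usage sketch: enforce a capability allowlist at the execution stage.
guard = LifecycleGuard()
guard.register(Stage.EXECUTION,
               lambda p: p.get("tool") in {"read_file", "http_get"})
assert guard.gate(Stage.EXECUTION, {"tool": "read_file"})
assert not guard.gate(Stage.EXECUTION, {"tool": "shell"})
```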