AIDB Daily Papers

Can Agent Skills Be Abused? A New Prompt Injection That Exploits LLM Vulnerabilities

Original title: Agent Skills Enable a New Class of Realistic and Trivially Simple Prompt Injections

Authors: David Schmotz, Sahar Abdelnabi, Maksym Andriushchenko

Published: 2025-10-30 | Fields: LLM, NLP, Safety

Note: The translated title and the key points were generated automatically by AI. Refer to the original paper for the exact content.

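Key Points

The paper shows that the Agent Skills feature for LLM agents can be abused to leak sensitive information.
Simple prompt injections are enough to bypass system-level guardrails, which makes this a threat not seen before.
The authors demonstrate that hidden malicious instructions can be used to steal sensitive data and to trigger harmful actions.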

Abstract

Enabling continual learning in LLMs remains a key unresolved research challenge. In a recent announcement, a frontier LLM company made a step towards this by introducing Agent Skills, a framework that equips agents with new knowledge based on instructions stored in simple markdown files. Although Agent Skills can be a very useful tool, we show that they are fundamentally insecure, since they enable trivially simple prompt injections. We demonstrate how to hide malicious instructions in long Agent Skill files and referenced scripts to exfiltrate sensitive data, such as internal files or passwords. Importantly, we show how to bypass system-level guardrails of a popular coding agent: a benign, task-specific approval with the "Don't ask again" option can carry over to closely related but harmful actions. Overall, we conclude that despite ongoing research efforts and scaling model capabilities, frontier LLMs remain vulnerable to very simple prompt injections in realistic scenarios. Our code is available at https://github.com/aisa-group/promptinject-agent-skills.
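The approval carry-over described in the abstract can be illustrated with a small sketch. This is not the coding agent's actual implementation, which the abstract does not describe. It assumes a hypothetical `ApprovalCache` that stores "Don't ask again" decisions, and the two example commands and script names are invented for illustration. The point is only that a remembered approval keyed too coarsely (here, by program name) also covers a closely related but harmful command.

```python
from dataclasses import dataclass, field


@dataclass
class ApprovalCache:
    """Hypothetical "Don't ask again" store; not any real agent's code."""
    coarse: bool  # True: remember only the program name; False: the exact command
    approved: set = field(default_factory=set)

    def _key(self, command: str) -> str:
        # A coarse key collapses every invocation of the same program into one entry.
        return command.split()[0] if self.coarse else command

    def remember(self, command: str) -> None:
        # Called when the user approves a command with "Don't ask again".
        self.approved.add(self._key(command))

    def needs_prompt(self, command: str) -> bool:
        # True if the user would be asked again before this command runs.
        return self._key(command) not in self.approved


# Invented example commands (assumptions, not taken from the paper).
benign = "python scripts/format_report.py"
harmful = "python scripts/backup.py --upload secrets.env"

coarse_cache = ApprovalCache(coarse=True)
coarse_cache.remember(benign)

exact_cache = ApprovalCache(coarse=False)
exact_cache.remember(benign)

print(coarse_cache.needs_prompt(harmful))  # False: the benign approval carries over
print(exact_cache.needs_prompt(harmful))   # True: the user is asked again
```

Keying approvals on the exact command is stricter but prompts the user more often. The sketch shows only that trade-off, not how any particular agent resolves it.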


Metadata

arXiv ID: 2510.26328
Category: cs.LG
