AIDB Daily Papers
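PIIGuard: Mitigating Personal Information Harvesting under Adversarial Sanitization
Note: The title and key points on this page were generated automatically by AI (originally in Japanese). Please refer to the original paper for the exact content.
Key Points
- The paper proposes PIIGuard, which prevents leakage of personally identifiable information (PII) at the webpage level, and implements the defense by repurposing indirect prompt injection.
- The work matters because it gives webpage owners a new way to defend against PII leakage on their own pages, without depending on the model or service layer.
- In direct-HTML evaluation on three target models, PIIGuard achieves a defense success rate of at least 97% while preserving ordinary question-answering ability on the same page.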
Abstract
Browsing-enabled LLM assistants can fetch webpages and answer contact-seeking queries, creating a practical channel for scraping contact-style personally identifiable information (PII) from public pages. Many prior defenses are deployed at the model, service, or agent layer rather than at the webpage itself, leaving ordinary page owners with limited deployable options. We present PIIGuard, a webpage-level defense that repurposes indirect prompt injection as a protective mechanism: the page owner embeds optimized hidden HTML fragments that steer the model away from verbatim or reconstructible disclosure of contact PII. PIIGuard searches over fragment text and insertion position using rule-based leakage scoring, evolutionary mutation, and final judge-based recoverability assessment. In direct-HTML evaluation on three target models (GPT-5.4-nano, Claude-haiku-4.5, and DeepSeek-chat (latest v3.2)), PIIGuard achieves at least a 97.0% defense success rate under both rule-based and judge-based leakage evaluation, often reaching 100.0%, while preserving benign same-page QA utility. We further evaluate two harder settings: public-URL browsing and attacker-side LLM sanitization of the fetched webpage. These results show that page-side defensive fragments can remain effective in deployment for some model-position pairs, but robustness varies substantially across browsing interfaces and sanitizer prompts. Overall, PIIGuard demonstrates that page owners can use page-side fragments as a practical mitigation for web-grounded PII leakage.
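The abstract mentions rule-based leakage scoring and a defense success rate but does not spell out the rules. The following is a minimal sketch of what such a check could look like, not the paper's implementation. The function names (`normalize`, `leaks`, `defense_success_rate`), the specific obfuscation patterns, and the 7-digit phone threshold are all assumptions made for illustration.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and undo simple obfuscations such as '[at]' and '(dot)'.

    Assumption: only these two obfuscation styles are handled.
    """
    t = text.lower()
    t = re.sub(r"\s*[\[\(]\s*at\s*[\]\)]\s*|\s+at\s+", "@", t)
    t = re.sub(r"\s*[\[\(]\s*dot\s*[\]\)]\s*|\s+dot\s+", ".", t)
    return t


def leaks(response: str, pii: str) -> bool:
    """Return True if the response discloses the PII verbatim or in a
    trivially reconstructible form (hypothetical rule set)."""
    if pii.lower() in response.lower():
        return True  # verbatim disclosure
    if "@" in pii:
        # Email: compare after undoing '[at]' / '[dot]' style rewriting.
        return normalize(pii) in normalize(response)
    # Phone-like PII: compare digit sequences, ignoring separators.
    digits = re.sub(r"\D", "", pii)
    return len(digits) >= 7 and digits in re.sub(r"\D", "", response)


def defense_success_rate(responses: list[str], pii: str) -> float:
    """Fraction of model responses that do not leak the given PII."""
    if not responses:
        raise ValueError("need at least one response")
    safe = sum(1 for r in responses if not leaks(r, pii))
    return safe / len(responses)


if __name__ == "__main__":
    sample = [
        "I can't share contact details.",
        "Email alice [at] example [dot] com",
        "Sorry, not available.",
    ]
    print(defense_success_rate(sample, "alice@example.com"))
```

A rule-based check like this is cheap enough to run inside a search loop over candidate fragments, which may be why the abstract pairs it with a separate judge-based assessment for the final recoverability evaluation.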