AIDB Daily Papers

Large Language Models That Can "Forget" from the Training Stage: NULLs

Original title: Natively Unlearnable Large Language Models

Authors: Gaurav R. Ghosal, Pratyush Maini, Aditi Raghunathan

Published: 2026-06-11 | Fields: LLM, machine learning, AI, natural language processing, cs.CL, cs.LG

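Note: The Japanese title and key points on this page were generated automatically by AI. Please check the original paper for the accurate content.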

Key points

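The paper proposes NULLs, which build "unlearning" (removing the influence of specific training data) into the model from the training stage.
NULLs is a new model architecture that isolates the information specific to each data source while still learning jointly from information shared across sources.
When individual articles are unlearned, NULLs retains knowledge shared with related articles, resists relearning, and performs on par with a standard Transformer.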

Abstract

Unlearning aims to remove the influence of specific training data sources, but this has proved challenging because the contributions of different sources are entangled within the model. Isolating source contributions to disjoint parameters makes removal easier, though it obstructs joint learning across sources. We propose NULLs (Natively Unlearnable LLMs), a model class that satisfies the two opposing goals of isolating source-specific contributions and learning jointly across sources, by training a set of shared backbone neurons alongside a pool of sparsely activated sinks. During training, information specific to a source naturally concentrates in its sinks while information shared across sources accumulates in the backbone. A source is then unlearned at deployment by disabling its corresponding sinks, with no gradient updates and no access to the retained data. We show that NULLs scales to Wikipedia's ~6M articles, isolating each as an independent source. Unlearning a single article removes knowledge specific to it while preserving facts shared with semantically related articles, closely matching retraining from scratch. We note that unlearning with NULLs is also robust: in a case study of unlearning the Harry Potter books, NULLs resists both adversarial extraction and relearning that reverses post-hoc unlearning. Finally, NULLs preserves general language capabilities, matching a standard transformer on downstream benchmarks. Together, these results suggest that source-level unlearning need not be an afterthought. It can be built natively into LLM training while retaining the benefits of shared representation learning.
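
The backbone-and-sink mechanism described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the class `ToySinkLayer`, its parameters, the per-source sink assignment in `sinks_for`, and the tanh nonlinearity are all assumptions made for illustration, since the abstract does not say how sinks are assigned or activated. The sketch only demonstrates one property: disabling a source's sinks removes their contribution with no gradient update and no access to other data.

```python
import math
import random


class ToySinkLayer:
    """Toy layer with shared backbone units plus a pool of sink units (illustrative only)."""

    def __init__(self, dim, n_backbone, n_sinks, sinks_per_source=2, seed=0):
        rng = random.Random(seed)
        self.n_backbone = n_backbone
        self.n_sinks = n_sinks
        self.sinks_per_source = sinks_per_source
        n_units = n_backbone + n_sinks
        # Random weights stand in for trained parameters.
        self.w_in = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_units)]
        self.w_out = [[rng.gauss(0.0, 1.0) for _ in range(n_units)] for _ in range(dim)]
        self.disabled = set()  # sink indices switched off by unlearning

    def sinks_for(self, source_id):
        # Hypothetical assignment: a fixed pseudo-random subset of the pool per source.
        rng = random.Random(source_id)
        return set(rng.sample(range(self.n_sinks), self.sinks_per_source))

    def unlearn(self, source_id):
        # Unlearning here is only a mask update; no weights change.
        self.disabled |= self.sinks_for(source_id)

    def forward(self, x, source_id=None):
        # Backbone units are always on; only the given source's enabled sinks fire.
        active = set() if source_id is None else self.sinks_for(source_id)
        active -= self.disabled
        hidden = []
        for unit, row in enumerate(self.w_in):
            sink_index = unit - self.n_backbone  # negative for backbone units
            if sink_index >= 0 and sink_index not in active:
                hidden.append(0.0)
            else:
                hidden.append(math.tanh(sum(w * xi for w, xi in zip(row, x))))
        return [sum(w * h for w, h in zip(row, hidden)) for row in self.w_out]


layer = ToySinkLayer(dim=4, n_backbone=8, n_sinks=16)
x = [1.0, -0.5, 0.25, 2.0]
before = layer.forward(x, source_id=3)   # backbone plus the sinks of source 3
backbone_only = layer.forward(x)         # backbone alone
layer.unlearn(3)
after = layer.forward(x, source_id=3)    # sinks of source 3 are now masked
```

After `unlearn(3)`, the output for source 3 matches the backbone-only output, because only the shared units still contribute. In this toy, two sources can be assigned overlapping sinks, so unlearning one may also affect the other; how the actual method handles sink allocation is not stated in the abstract.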

Metadata

arXiv ID: 2606.13873
Categories: cs.LG, cs.CL
