AIDB Daily Papers

A New Method for Estimating LLM Parameter Counts with Knowledge Probes

Original title: Incompressible Knowledge Probes: Estimating Black-Box LLM Parameter Counts via Factual Capacity

Author: Bojie Li

Published: 2026-04-27 | Fields: LLM, machine learning, AI research, deep learning, large language models, cs.AI, cs.LG

Note: The Japanese title and key points on this page were generated automatically by AI. Please check the original paper for the exact content.

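Key Points

The paper proposes estimating the parameter count of a large language model by measuring how much factual knowledge it holds.
The approach evaluates a model's knowledge capacity directly and carries less uncertainty than conventional estimates based on inference cost.
Calibrated on 89 open-weight models, the method reaches $R^2 = 0.917$; for Mixture-of-Experts models, total parameter count predicts knowledge better than active parameter count.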

Abstract

Closed-source frontier labs do not disclose parameter counts, and the standard alternative -- inference economics -- carries $2\times$+ uncertainty from hardware, batching, and serving-stack assumptions external to the model. We exploit a tighter intrinsic bound: storing $F$ facts requires at least $F/$(bits per parameter) weights, so measuring how much a model knows lower-bounds how many parameters it has. We introduce Incompressible Knowledge Probes (IKPs), a benchmark of 1,400 factual questions spanning 7 tiers of obscurity, designed to isolate knowledge that cannot be derived by reasoning or compressed by architectural improvements. We calibrate a log-linear mapping from IKP accuracy to parameter count on 89 open-weight models (135M--1,600B) spanning 19 vendors, achieving $R^2 = 0.917$; leave-one-out cross-validation confirms generalization (median fold error $1.59\times$, $68.5\%$ within $2\times$ and $87.6\%$ within $3\times$). For Mixture-of-Experts models, total parameters predict knowledge ($R^2 = 0.79$) far better than active parameters ($R^2 = 0.51$). We evaluate 188 models from 27 vendors and estimate effective knowledge capacity for all major proprietary frontier models; for heavily safety-tuned models the estimates are lower bounds, since refusal policy can hide tens of percentage points of "refused but known" capacity. The widely-reported saturation of reasoning benchmarks does not imply the end of scaling. Procedural capability compresses under the "Densing Law," but across 96 dated open-weight models the IKP time coefficient is $-0.0010$/month ($95\%$ CI $[-0.0031, +0.0008]$) -- indistinguishable from zero, and rejecting the Densing prediction of $+0.0117$/month at $p < 10^{-15}$. Factual capacity continues to scale log-linearly with parameters across generations and across vendors.
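
The calibration step described in the abstract can be illustrated with a minimal Python sketch. It assumes that "log-linear mapping" means log10(parameter count) is fit as a linear function of IKP accuracy, and that "fold error" means the larger of predicted/true and true/predicted. The function names and the synthetic data are hypothetical and chosen only for illustration; they are not the paper's code, benchmark, or results.

```python
import numpy as np


def fit_log_linear(accuracy, params):
    """Least-squares fit of log10(params) = intercept + slope * accuracy."""
    slope, intercept = np.polyfit(accuracy, np.log10(params), deg=1)
    return slope, intercept


def predict_params(accuracy, slope, intercept):
    """Map accuracy back to an estimated parameter count."""
    return 10.0 ** (intercept + slope * np.asarray(accuracy, dtype=float))


def loo_fold_errors(accuracy, params):
    """Leave-one-out cross-validation.

    For each model, fit on all the others, predict the held-out model,
    and record the fold error max(pred / true, true / pred), which is >= 1.
    """
    accuracy = np.asarray(accuracy, dtype=float)
    params = np.asarray(params, dtype=float)
    errors = []
    for i in range(len(params)):
        keep = np.arange(len(params)) != i
        slope, intercept = fit_log_linear(accuracy[keep], params[keep])
        pred = predict_params(accuracy[i], slope, intercept)
        errors.append(max(pred / params[i], params[i] / pred))
    return np.array(errors)


# Synthetic, illustrative data only (an assumption, NOT taken from the paper):
# 40 hypothetical models between 1e8 and 1e12 parameters whose accuracy
# grows linearly in log10(params), plus Gaussian noise.
rng = np.random.default_rng(0)
params = 10.0 ** rng.uniform(8.0, 12.0, size=40)
accuracy = 0.15 * (np.log10(params) - 8.0) + rng.normal(0.0, 0.03, size=40)

errors = loo_fold_errors(accuracy, params)
print(f"median fold error: {np.median(errors):.2f}x")
print(f"within 2x: {np.mean(errors <= 2.0):.1%}")
print(f"within 3x: {np.mean(errors <= 3.0):.1%}")
```

Reporting the error as a multiplicative factor fits a quantity that spans several orders of magnitude: a $2\times$ miss means the same thing at 1B parameters as at 1,000B, which an absolute error would not capture.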

Metadata

arXiv ID: 2604.24827
Categories: cs.LG, cs.AI