AIDB Daily Papers
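Same Model, Different Service? Measuring Hosted Open-Weight LLM APIs in Practice
Note: The title and key points (originally in Japanese) were generated automatically by AI. Please refer to the original paper for the exact content.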
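Key points
- The study treats open-weight LLMs not as model names but as provider-specific services, and measures how those services actually behave.
- Its novelty lies in analyzing how the service layer (model version, price, latency, throughput, and so on) changes what "the same model" means.
- Demand is concentrated in a few model families while older versions remain in active use, and the range of models that providers list diverges from what is actually used.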
Abstract
Open-weight large language models (LLMs) are usually named as model artifacts, but production users often consume them as hosted API services. This paper argues that the operational unit is a service object: a provider-specific, time-varying endpoint defined by model variant, protocol behavior, context capacity, listed price, latency and throughput distribution, reliability, and task feasibility. Using sampled request logs, provider metadata, compatibility probes, pricing snapshots, and continuous latency measurements collected by AI Ping during Q4 2025, we study how this service layer changes the meaning of "the same model." Three empirical patterns emerge. First, observed demand is concentrated but persistent across versions: in the displayed family aggregate, the largest family carries 32.0% of relative demand and the top five carry 87.4%, with a Gini coefficient of 0.693, while older variants remain active after newer releases. Second, supply and use separate: provider listing breadth does not imply realized adoption, and listed prices are more anchored than latency, throughput, context length, protocol support, and error semantics. Third, task mix matters: applications induce different token-length regimes, so provider choice is a constrained decision over provider-model-task-time tuples rather than a lookup by model name. In two representative counterfactuals under observed feasibility constraints, routing lowers Qwen3-32B cost by 37.8% and raises DeepSeek-V3.2 average throughput by about 90% relative to direct official access. The results support a measurement view of hosted open-weight LLMs as heterogeneous services, not static catalog entries. We open-source the measurement methodology and reproduction artifacts at https://github.com/haoruilee/llm_api_measurement_study to support result reproduction.
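The abstract's third pattern, provider choice as a constrained decision rather than a lookup by model name, can be illustrated with a minimal sketch. This is not the paper's routing method: the `ServiceObject` fields, the `route` function, the feasibility rules, and every number in `SERVICES` are hypothetical and chosen only to show the shape of the decision.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ServiceObject:
    """One provider-specific endpoint for a model (hypothetical fields)."""
    provider: str
    model: str
    context_tokens: int     # maximum context length the endpoint accepts
    price_per_mtok: float   # listed price per million tokens (illustrative unit)
    p50_latency_s: float    # median latency in seconds
    throughput_tps: float   # average output tokens per second


def is_feasible(s: ServiceObject, model: str,
                prompt_tokens: int, max_latency_s: float) -> bool:
    """A service is feasible if it serves the model, fits the prompt,
    and meets the latency budget (assumed constraints, not the paper's)."""
    return (s.model == model
            and s.context_tokens >= prompt_tokens
            and s.p50_latency_s <= max_latency_s)


def route(services: List[ServiceObject], model: str, prompt_tokens: int,
          max_latency_s: float, objective: str = "cost") -> Optional[ServiceObject]:
    """Pick the best feasible service, or None if no service is feasible."""
    candidates = [s for s in services
                  if is_feasible(s, model, prompt_tokens, max_latency_s)]
    if not candidates:
        return None
    if objective == "cost":
        return min(candidates, key=lambda s: s.price_per_mtok)
    if objective == "throughput":
        return max(candidates, key=lambda s: s.throughput_tps)
    raise ValueError(f"unknown objective: {objective}")


# Made-up snapshot of three endpoints that all advertise the same model name.
SERVICES = [
    ServiceObject("official", "model-a", 32000, 1.00, 0.8, 40.0),
    ServiceObject("provider-x", "model-a", 8000, 0.50, 0.6, 90.0),
    ServiceObject("provider-y", "model-a", 32000, 0.70, 1.5, 60.0),
]

if __name__ == "__main__":
    short_task = route(SERVICES, "model-a", prompt_tokens=4000, max_latency_s=1.0)
    long_task = route(SERVICES, "model-a", prompt_tokens=16000, max_latency_s=2.0)
    print(short_task.provider if short_task else None)
    print(long_task.provider if long_task else None)
```

In this toy snapshot the cheapest feasible endpoint changes with the task: a short prompt under a tight latency budget and a long prompt under a looser one select different providers, even though the model name is the same. A real router would also need the time dimension (measurements drift), reliability, and protocol compatibility, which this sketch omits.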