AIDB Daily Papers

In Harmony with GPT-OSS: The Road to Reproducing OpenAI's Scores

Original title: In harmony with gpt-oss

Author: Borislav Mavrin

Published: 2026-04-01 | Fields: LLM, fine-tuning, benchmarks, inference, machine learning, AI, open source, prompts, models, repositories

Note: The Japanese title and key points on this page were generated automatically by AI; see the original paper for the exact content.

Key points

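To reproduce the published gpt-oss-20b scores, the author reverse-engineered the model's tools, since the original paper discloses neither the tools nor the agent harness.
Even when prompted without tool definitions, the model calls tools from its training distribution with high probability, which points to a strong prior rather than a hallucination.
A native harmony agent harness was then built, and OpenAI's published scores were reproduced independently.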

Abstract

No one has independently reproduced OpenAI's published scores for gpt-oss-20b with tools, because the original paper discloses neither the tools nor the agent harness. We reverse-engineered the model's in-distribution tools: when prompted without tool definitions, gpt-oss still calls tools from its training distribution with high statistical confidence -- a strong prior, not a hallucination. We then built a native harmony agent harness (https://github.com/borislavmavrin/harmonyagent.git) that encodes messages in the model's native format, bypassing the lossy Chat Completions conversion. Together, these yield the first independent reproduction of OpenAI's published scores: 60.4% on SWE Verified HIGH (published 60.7%), 53.3% MEDIUM (53.2%), and 91.7% on AIME25 with tools (90.4%).
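To make the closeness of the reproduction concrete, the figures quoted in the abstract can be compared directly. The following is a minimal sketch, not code from the paper or the harmonyagent repository: the names `SCORES` and `score_gaps` are hypothetical, and the "SWE Verified MEDIUM" label assumes the 53.3% figure refers to the same benchmark as the HIGH result.

```python
# Scores quoted in the abstract, in percent: (reproduced, published by OpenAI).
# The dictionary name and labels are illustrative assumptions.
SCORES = {
    "SWE Verified HIGH": (60.4, 60.7),
    "SWE Verified MEDIUM": (53.3, 53.2),
    "AIME25 with tools": (91.7, 90.4),
}


def score_gaps(scores):
    """Return reproduced minus published, in percentage points, per benchmark."""
    return {name: round(repro - pub, 1) for name, (repro, pub) in scores.items()}


gaps = score_gaps(SCORES)
for name, gap in gaps.items():
    print(f"{name}: {gap:+.1f} points")

# Largest deviation from the published numbers, ignoring sign.
largest = max(abs(gap) for gap in gaps.values())
print(f"largest absolute gap: {largest:.1f} points")
```

Under these numbers, the two SWE Verified results fall within 0.3 points of the published scores, and the AIME25 result exceeds the published score by 1.3 points.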


Metadata

arXiv ID: 2604.00362
Categories: cs.AI, cs.LG
