AIDB Daily Papers

Logging Like Humans for LLMs: Rethinking Logging via Runtime Feedback

Original title: Logging Like Humans for LLMs: Rethinking Logging via Execution and Runtime Feedback

Authors: Xin Wang, Yang Feng, Jiaoxiao Qian, Yang Zhang, Zhenhao Li, Zishuo Ding

Published: 2026-03-31 | Fields: LLM, AI, software, tools, code, programming, testing, debugging, errors, natural language processing

Note: The Japanese title and key points were generated automatically by AI. Consult the original paper for the exact content.

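Key Points

Proposes ReLog, a framework that uses LLMs to iteratively generate and refine logging statements based on runtime feedback.
The novelty lies in evaluating logs by performance on downstream debugging tasks rather than by similarity to human-written logs, as prior work does.
ReLog outperforms all baselines in defect localization and repair, reaching an F1 score of 0.520 and repairing 97 defects in the direct setting, and an F1 score of 0.408 in the indirect setting.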

Abstract

Logging statements are essential for software debugging and maintenance. However, existing approaches to automatic logging generation rely on static analysis and produce statements in a single pass without considering runtime behavior. They are also typically evaluated by similarity to developer-written logs, assuming these logs form an adequate gold standard. This assumption is increasingly limiting in the LLM era, where logs are consumed not only by developers but also by LLMs for downstream tasks. As a result, optimizing logs for human similarity does not necessarily reflect their practical utility. To address these limitations, we introduce ReLog, an iterative logging generation framework guided by runtime feedback. ReLog leverages LLMs to generate, execute, evaluate, and refine logging statements so that runtime logs better support downstream tasks. Instead of comparing against developer-written logs, we evaluate ReLog through downstream debugging tasks, including defect localization and repair. We construct a benchmark based on Defects4J under both direct and indirect debugging settings. Results show that ReLog consistently outperforms all baselines, achieving an F1 score of 0.520 and repairing 97 defects in the direct setting, and the best F1 score of 0.408 in the indirect setting where source code is unavailable. Additional experiments across multiple LLMs demonstrate the generality of the framework, while ablations confirm the importance of iterative refinement and compilation repair. Overall, our work reframes logging as a runtime-guided, task-oriented process and advocates evaluating logs by their downstream utility rather than textual similarity.
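The generate, execute, evaluate, and refine cycle described in the abstract can be pictured as a small control loop. The sketch below is only an illustration under stated assumptions and is not the authors' implementation: the function `refine_logging`, the `RunResult` type, the four callables (`generate`, `run`, `repair`, `score`), and the `max_rounds` and `target` parameters are all hypothetical names introduced here. In a real system the callables would wrap an LLM, a build and test harness, and a downstream-task scorer.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class RunResult:
    """Hypothetical outcome of building and running instrumented code."""
    compiled: bool
    output: str  # compiler errors if the build failed, otherwise runtime logs


def refine_logging(
    source: str,
    generate: Callable[[str, str], str],  # (source, feedback) -> instrumented code
    run: Callable[[str], RunResult],      # build and execute the candidate
    repair: Callable[[str, str], str],    # (code, compiler errors) -> repaired code
    score: Callable[[str], float],        # runtime logs -> utility estimate
    max_rounds: int = 3,
    target: float = 1.0,
) -> Tuple[Optional[str], float]:
    """Iteratively add logging, keeping the best-scoring candidate seen so far."""
    best_code: Optional[str] = None
    best_score = float("-inf")
    feedback = ""  # the first round has no runtime feedback yet

    for _ in range(max_rounds):
        candidate = generate(source, feedback)
        result = run(candidate)

        if not result.compiled:
            # One compilation-repair attempt before giving up on this round.
            candidate = repair(candidate, result.output)
            result = run(candidate)
            if not result.compiled:
                feedback = "compile error: " + result.output
                continue

        current = score(result.output)
        if current > best_score:
            best_code, best_score = candidate, current
        if current >= target:
            break  # the logs are judged useful enough for the downstream task

        # Feed the observed runtime logs into the next generation round.
        feedback = result.output

    return best_code, best_score
```

The loop returns the best candidate rather than the last one, so a later round that scores worse does not discard earlier progress. Whether ReLog makes the same choice is not stated in the abstract.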

Metadata

arXiv ID: 2603.29122
Category: cs.SE
