AIDB Daily Papers
Think in Sentences! Improving Language Model Capabilities with Explicit Sentence Boundaries
Note: The title and key points are automatically generated by AI. Please refer to the original paper for accurate details.
Key Points
- Proposes a method that inserts delimiters at sentence boundaries in large language model inputs, encouraging the model to process text with awareness of sentence structure.
- Novel in focusing on the sentence-level structure of natural language, which prior work overlooked, and in pursuing LLM performance gains through a cognitively inspired approach.
- Experiments on models from 7B to 600B parameters confirm gains of up to 7.7% on GSM8k and up to 12.5% on DROP, and the models' internal representations are shown to acquire sentence awareness.
Abstract
Researchers have explored different ways to improve large language models' (LLMs') capabilities via dummy token insertion in contexts. However, existing works focus solely on the dummy tokens themselves and fail to leverage the inherent sentence-level structure of natural language. This is a critical oversight, as LLMs acquire linguistic capabilities through exposure to human-generated texts, which are inherently structured at the sentence level. Motivated by this gap, we propose an approach that inserts delimiters at sentence boundaries in LLM inputs, which not only integrates dummy tokens into the context but also encourages sentence-by-sentence processing behavior in LLMs during reasoning. We experiment with two concrete methods, (1) in-context learning and (2) supervised fine-tuning, on models ranging from 7B parameters to the 600B Deepseek-V3. Our results demonstrate consistent improvements across various tasks, with notable gains of up to 7.7% on GSM8k and 12.5% on DROP. Furthermore, the fine-tuned LLMs exhibit sentence awareness, as evidenced by their internal representations. Our work establishes a simple yet effective technique for enhancing LLMs' capabilities, offering a promising direction for cognitively inspired LLM enhancement paradigms.
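As a rough illustration of the preprocessing step the abstract describes, the sketch below inserts a delimiter token at each sentence boundary before the text is fed to an LLM. The `<SENT>` token, the regex-based splitter, and the GSM8k-style example question are all assumptions made here for illustration; the paper's actual delimiter and segmentation method may differ.

```python
import re

# Hypothetical delimiter token; the paper's actual choice may differ.
DELIMITER = "<SENT>"

def insert_sentence_delimiters(text: str, delimiter: str = DELIMITER) -> str:
    """Append a delimiter token after each sentence in the input text."""
    # Naive splitter: break after ., !, or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(s + " " + delimiter for s in sentences if s)

# GSM8k-style example question (illustrative only).
question = (
    "Natalia sold clips to 48 of her friends in April. "
    "She sold half as many clips in May. "
    "How many clips did Natalia sell altogether?"
)
print(insert_sentence_delimiters(question))
# Natalia sold clips to 48 of her friends in April. <SENT> She sold half
# as many clips in May. <SENT> How many clips did Natalia sell altogether? <SENT>
```

In practice, a dedicated sentence segmenter (e.g., spaCy or NLTK) would be more robust than a regex, and for the supervised fine-tuning variant the same transformation would presumably be applied to the training texts as well.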