AIDB Daily Papers
Social Meaning in Large Language Models: Structure, Magnitude, and Pragmatic Prompting
Note: The title and key points below are AI-generated summaries. Please consult the original paper for accurate details.
Key Points
- A study that quantitatively evaluates the human-like patterns of pragmatic and social reasoning exhibited by LLMs.
- Introduces two new metrics, the Effect Size Ratio (ESR) and the Calibration Deviation Score (CDS), to distinguish structural fidelity from magnitude calibration.
- Finds that prompting models to infer the speaker's knowledge and motives reduces magnitude deviation and contributes to better-calibrated model judgments.
Abstract
Large language models (LLMs) increasingly exhibit human-like patterns of pragmatic and social reasoning. This paper addresses two related questions: do LLMs approximate human social meaning not only qualitatively but also quantitatively, and can prompting strategies informed by pragmatic theory improve this approximation? To address the first, we introduce two calibration-focused metrics distinguishing structural fidelity from magnitude calibration: the Effect Size Ratio (ESR) and the Calibration Deviation Score (CDS). To address the second, we derive prompting conditions from two pragmatic assumptions: that social meaning arises from reasoning over linguistic alternatives, and that listeners infer speaker knowledge states and communicative motives. Applied to a case study on numerical (im)precision across three frontier LLMs, we find that all models reliably reproduce the qualitative structure of human social inferences but differ substantially in magnitude calibration. Prompting models to reason about speaker knowledge and motives most consistently reduces magnitude deviation, while prompting for alternative-awareness tends to amplify exaggeration. Combining both components is the only intervention that improves all calibration-sensitive metrics across all models, though fine-grained magnitude calibration remains only partially resolved. LLMs thus capture inferential structure while variably distorting inferential strength, and pragmatic theory provides a useful but incomplete handle for improving that approximation.
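The abstract names the two metrics but does not define them here. As a loose illustration only (the exact definitions are in the paper; the `cohens_d` effect-size choice, the toy data, and the aggregation below are all assumptions), the structure-versus-magnitude distinction could be operationalized along these lines:

```python
import statistics

def cohens_d(group_a, group_b):
    """Standardized mean difference between two conditions (pooled SD)."""
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    pooled_sd = ((statistics.variance(group_a) + statistics.variance(group_b)) / 2) ** 0.5
    return mean_diff / pooled_sd

def effect_size_ratio(model_d, human_d):
    """Hypothetical ESR: > 1 suggests the model exaggerates the human
    effect, < 1 suggests it understates it; ~1 is well calibrated."""
    return model_d / human_d

def calibration_deviation_score(model_ds, human_ds):
    """Hypothetical CDS: mean absolute gap between model and human
    effect sizes across a set of inference contrasts (lower is better)."""
    return sum(abs(m - h) for m, h in zip(model_ds, human_ds)) / len(model_ds)

# Toy ratings for a precise vs. imprecise number ("exactly 100" vs. "about 100")
human_d = cohens_d([6.0, 6.5, 7.0], [4.0, 4.5, 5.0])   # human contrast
model_d = cohens_d([7.0, 7.5, 8.0], [3.0, 3.5, 4.0])   # same direction, larger gap
print(effect_size_ratio(model_d, human_d))              # structure preserved, magnitude inflated
print(calibration_deviation_score([model_d], [human_d]))
```

On this sketch, a model can reproduce the qualitative structure (same sign of the contrast) while still scoring poorly on magnitude calibration (ESR far from 1, high CDS), which is the dissociation the paper reports.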