AIDB Daily Papers

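Brain-LLM Alignment Depends on Training Data, Not Language Type

Original title: Brain-LLM Alignment Tracks Training Data, Not Typology

Authors: Dongxin Guo, Jikun Wu, Siu Ming Yiu

Published: 2026-05-21 | Fields: LLM, natural language processing, cs.CL, cs.AI, q-bio.NC, neuroscience

Note: The Japanese title and key points on this page were generated automatically by AI. Refer to the original paper for the exact content.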

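Key points

The study tests whether alignment between the brain and large language models (LLMs) also holds for languages other than English.
Alignment was shown to depend on the language composition of the training data rather than on an inherent advantage of English.
Typological distance between languages and syntax-processing brain regions were found to influence the variation in alignment.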

Abstract

Brain-LLM alignment is well established in English, yet the brain's language network is neuroanatomically universal across languages. Does alignment also generalize cross-linguistically, and what governs the variation? We test this using fMRI data from 112 participants across English, Chinese, and French (the Le Petit Prince corpus) and seven LLMs spanning English-dominant, Chinese-dominant, and multilingual architectures. Our central finding is that training-language dominance, not an inherent property of English, drives the alignment pattern: a Chinese-dominant model (Baichuan2-7B), architecture-matched to LLaMA-2-7B, reverses the gradient entirely, aligning best with Chinese brains and worst with English. Beyond training dominance, formal typological distance independently covaries with alignment degradation, syntax-associated brain regions (IFG) show $2.3\times$ steeper typological gradients than lexico-semantic regions (PTL), and tokenization fertility accounts for $\sim$60% of a cross-linguistic shift in optimal encoding layer. These results reveal that the apparent "English advantage" in brain-LLM alignment is an artifact of training data composition, while the remaining variation reflects genuine typological structure concentrated in syntactic processing.
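
The abstract mentions tokenization fertility without defining it. A common reading is the mean number of subword tokens a tokenizer produces per word; whether the paper uses exactly this definition is an assumption here. The sketch below illustrates that reading only. The `toy_subword_tokenize` function is a hypothetical stand-in (fixed-size character chunks), not any of the tokenizers used by the models in the paper, and the sample sentence is invented for illustration.

```python
from typing import Callable, List


def toy_subword_tokenize(word: str, max_len: int = 3) -> List[str]:
    """Hypothetical stand-in for a real subword tokenizer.

    Splits a word into fixed-size character chunks. Real tokenizers
    (BPE, SentencePiece, ...) behave differently; this is illustration only.
    """
    return [word[i:i + max_len] for i in range(0, len(word), max_len)]


def fertility(words: List[str], tokenize: Callable[[str], List[str]]) -> float:
    """Mean number of subword tokens per word (assumed definition)."""
    if not words:
        raise ValueError("need at least one word")
    total_tokens = sum(len(tokenize(w)) for w in words)
    return total_tokens / len(words)


# Invented sample text, split on whitespace into words.
sample_words = "the little prince lived on a small planet".split()
sample_fertility = fertility(sample_words, toy_subword_tokenize)
print(f"fertility = {sample_fertility:.3f}")
```

Under this reading, a higher fertility for a given language means each word is spread over more tokens, which is one plausible reason the best-matching model layer could differ between languages. Passing the tokenizer as a callable keeps the metric independent of any particular library.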

Metadata

arXiv ID: 2605.23032
Categories: cs.CL, cs.AI, q-bio.NC
