AIDB Daily Papers

Neural Activation Patterns by Large Language Model Architecture: A Comprehensive Analysis of Cognitive Task Performance

Original title: Neural Activation Patterns Across Language Model Architectures: A Comprehensive Analysis of Cognitive Task Performance

Authors: Mahdi Naser-Moghadasi, Faezeh Ghaderi

Published: 2026-05-14 | Fields: LLM, NLP, machine learning, AI, cs.CL, cs.LG

Note: The translated title and the key points were generated automatically by AI. Please refer to the original paper for the exact content.

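Key Points

The paper comprehensively analyzes neural activation patterns across six distinct large language model architectures.
It reveals fundamental differences in how encoder and decoder architectures process cognitive tasks, and finds that mathematical reasoning produces the highest attention entropy.
Decoder models show significantly higher sparsity than encoder models, a result with implications for model selection and optimization.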

Abstract

This paper presents a comprehensive analysis of neural activation patterns across six distinct large language model (LLM) architectures, examining their performance on twelve cognitive task categories. Through systematic measurement of final activation values, attention entropy, and sparsity patterns, we reveal fundamental differences in how encoder and decoder architectures process diverse cognitive tasks. Our analysis of 144 task-model combinations demonstrates that mathematical reasoning consistently produces the highest attention entropy across all architectures, while decoder models exhibit significantly higher sparsity patterns compared to encoder models. The findings provide critical insights into the computational characteristics of modern language models and their task-specific neural behaviors, with implications for model selection and optimization in big data applications.
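
The abstract names attention entropy and sparsity as measured quantities but does not say how they are computed. The sketch below shows one common way to define them, purely as an illustration: Shannon entropy of a normalized attention row, and the fraction of near-zero activations. The function names, the use of natural logarithms, and the `threshold` value are all assumptions made here, not details taken from the paper.

```python
import math


def attention_entropy(weights):
    """Shannon entropy (in nats) of one attention distribution.

    ASSUMPTION: entropy is taken over the attention weights a single query
    assigns to its keys. The paper's exact definition may differ.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("attention weights must have a positive sum")
    probs = [w / total for w in weights]  # renormalize so the row sums to 1
    # Zero-probability entries are skipped (0 * log 0 is treated as 0).
    return -sum(p * math.log(p) for p in probs if p > 0)


def mean_attention_entropy(attention_rows):
    """Average entropy over several attention rows (one row per query)."""
    return sum(attention_entropy(r) for r in attention_rows) / len(attention_rows)


def sparsity(activations, threshold=1e-6):
    """Fraction of activations whose magnitude is at or below `threshold`.

    ASSUMPTION: `threshold` is a hypothetical cutoff chosen for illustration.
    """
    near_zero = sum(1 for a in activations if abs(a) <= threshold)
    return near_zero / len(activations)


# Toy inputs: a uniform row spreads attention evenly (maximal entropy),
# a one-hot row concentrates it on a single key (zero entropy).
uniform = [0.25, 0.25, 0.25, 0.25]
peaked = [1.0, 0.0, 0.0, 0.0]
print(attention_entropy(uniform))
print(attention_entropy(peaked))
print(mean_attention_entropy([uniform, peaked]))
print(sparsity([0.0, 0.5, 0.0, -1.2]))
```

Under these definitions, higher entropy means attention is spread over more tokens, and higher sparsity means a larger share of units is effectively inactive. In practice the inputs would come from a model's per-head attention matrices and hidden states rather than from hand-written lists.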

Metadata

arXiv ID: 2605.15436
Categories: cs.CL, cs.LG
