AIDB Daily Papers

Multimodal Hidden Markov Models for Tracking Persistent Emotional States

Original title: Multimodal Hidden Markov Models for Persistent Emotional State Tracking

Authors: Anamika Ragu, Aneesh Jonelagadda

Published: 2026-05-13 | Fields: LLM, NLP, multimodal AI, cs.AI, emotion recognition, hidden Markov models

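Note: The Japanese title and key points on the source page were generated automatically by AI. Refer to the original paper for the exact content.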

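Key Points

Proposes a framework that uses hidden Markov models to track how emotion evolves over a conversation from multimodal video, audio, and text input.
Unlike conventional utterance-level emotion recognition, it captures the persistent emotional phases of a conversation, which makes the analysis more interpretable.
Evaluation on a clinical dataset suggests that emotional phases can be recovered from the multimodal signals and used to improve the quality of LLM responses.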

Abstract

Tracking an interpretable emotional arc of a conversation via the sentiment of individual utterances processed as a whole is central to both understanding and guiding communication in applied, especially clinical, conversational contexts. Existing approaches to emotion recognition operate at the utterance level, obscuring the persistent phases that characterize real conversational dynamics. We propose a lightweight framework that models conversational emotion as a sequence of latent emotional regimes using sticky factorial HDP-HMMs over multimodal valence-arousal representations derived from simultaneous video, audio and textual input. We evaluate the quality of regime prediction using LLM-as-a-Judge, geometric, and temporal consistency metrics, demonstrating that the sticky HDP-HMM produces more interpretable regime sequences than the baseline Gaussian HMM at a fraction of the computational cost of LLM-based dialogue state tracking methods. In addition, Question-Answer experiments in a clinical dataset suggest that meaningful emotional phases can reliably be recovered from multimodal valence-arousal trajectories and used to improve the quality of LLM responses in unstable affective regimes via context augmentation. This framework thus opens a path toward interpretable, lightweight, and actionable analysis of conversational emotion dynamics at scale.
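
To make the idea of "sticky" regimes concrete, the following is a minimal sketch and not the authors' implementation. It assumes a fixed number of states, hand-set Gaussian means, and a hypothetical self-transition bias `kappa`, whereas the sticky factorial HDP-HMM in the abstract infers the number of regimes and learns its parameters from data. The toy valence-arousal trajectory and all function names are illustrative assumptions. The sketch only shows how extra self-transition mass suppresses single-frame flips when decoding with Viterbi.

```python
import numpy as np

def sticky_transition_matrix(n_states, kappa):
    """Row-stochastic matrix with extra mass kappa on self-transitions.

    kappa = 0 gives uniform transitions; larger kappa makes regimes persist.
    """
    A = np.ones((n_states, n_states)) + kappa * np.eye(n_states)
    return A / A.sum(axis=1, keepdims=True)

def gaussian_log_likelihood(obs, means, sigma):
    """Log density of each observation under each state's isotropic Gaussian."""
    diff = obs[:, None, :] - means[None, :, :]      # shape (T, K, D)
    sq_dist = np.sum(diff ** 2, axis=2)             # shape (T, K)
    d = obs.shape[1]
    return (-0.5 * sq_dist / sigma ** 2
            - d * np.log(sigma) - 0.5 * d * np.log(2 * np.pi))

def viterbi(log_lik, A, pi):
    """Most likely state sequence given per-frame log-likelihoods."""
    T, K = log_lik.shape
    log_A = np.log(A)
    delta = np.log(pi) + log_lik[0]
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_A             # scores[i, j]: from i to j
        back[t] = np.argmax(scores, axis=0)
        delta = scores[back[t], np.arange(K)] + log_lik[t]
    path = np.zeros(T, dtype=int)
    path[-1] = np.argmax(delta)
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path

def count_switches(path):
    """Number of times the decoded regime changes between adjacent frames."""
    return int(np.sum(path[1:] != path[:-1]))

# Hypothetical (valence, arousal) centers for two regimes: calm and agitated.
means = np.array([[0.5, 0.2], [-0.5, 0.8]])
sigma = 0.5

# Toy trajectory: calm frames, one ambiguous frame, calm again, then agitated.
obs = np.array([[0.5, 0.2]] * 5 + [[-0.1, 0.55]] + [[0.5, 0.2]] * 5
               + [[-0.5, 0.8]] * 6)

log_lik = gaussian_log_likelihood(obs, means, sigma)
pi = np.array([0.5, 0.5])

plain_path = viterbi(log_lik, sticky_transition_matrix(2, kappa=0.0), pi)
sticky_path = viterbi(log_lik, sticky_transition_matrix(2, kappa=20.0), pi)

print("uniform transitions:", plain_path.tolist(), count_switches(plain_path))
print("sticky transitions: ", sticky_path.tolist(), count_switches(sticky_path))
```

With uniform transitions the decoder follows each frame's nearest center, so the single ambiguous frame produces a spurious regime change. With `kappa=20.0` the decoder keeps the calm regime through that frame and switches only at the sustained shift. The value of `kappa` here is arbitrary and chosen for illustration.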

Metadata

arXiv ID: 2605.12838
Category: cs.AI
