AIDB Daily Papers

SalsaAgent: A Multimodal Embodied Language Model for Interactive Dance Generation

Original title: SalsaAgent: A multimodal embodied language model for interactive dance generation

Authors: Payam Jome Yazdian, Zoe Stanley, Angelica Lim

Published: 2026-05-28 | Fields: AI, cs.AI, cs.CV, AI agents, AI assistance, AI evaluation

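Note: The Japanese-language title and key points of this entry were generated automatically by AI. Consult the original paper for the exact content.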

Key Points

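The authors extend human-robot interaction to dance generation and develop SalsaAgent, which reacts to both music and a human partner's movement.
The work matters because it adds new tokens representing body motion and pairwise relations to an LLM, and treats interaction as an exchange of nonverbal motion tokens.
SalsaAgent substantially outperforms existing methods on coordination with the music and the partner and on spatial behavior.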

Abstract

Interaction between humanoids involves bidirectional and nonverbal reactivity, coordination and synchrony. Toward socially aware robots and interactive virtual agents, we present SalsaAgent, a language model that generates expressive, full-body salsa dance motions in reaction to a human leader and against a contextual music backdrop. We formulate interaction as nonverbal motion token passing, extending the vocabulary of a large language model (LLM) to process discrete motion tokens, pairwise relation tokens, and audio. Our contributions include new tokens for full-body and motion relations, LLM fine-tuning using automatically derived text descriptions of skeleton dynamics for token grounding, and a two-stage token-to-diffusion pipeline. Subjective and objective evaluations demonstrate the effectiveness of our approach in terms of motion quality, music and partner coordination, and consistent two-person spatial behavior, with significant improvements over baselines.
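The idea of "motion token passing" over an extended vocabulary can be illustrated with a small sketch. This is not the authors' implementation: the toy `Vocabulary` class, the token names (`<motion_i>`, `<rel_i>`, `<leader>`, `<follower>`), the codebook sizes, and the turn layout are all hypothetical, chosen only to show how discrete motion and relation tokens might sit alongside text tokens in one id space. Audio input is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Vocabulary:
    """Toy token-to-id map standing in for an LLM tokenizer's vocabulary."""
    token_to_id: dict = field(default_factory=dict)

    def add(self, tokens):
        """Append unseen tokens at the end of the id space; return how many were added."""
        added = 0
        for tok in tokens:
            if tok not in self.token_to_id:
                self.token_to_id[tok] = len(self.token_to_id)
                added += 1
        return added

    def encode(self, tokens):
        """Map a list of token strings to their integer ids."""
        return [self.token_to_id[t] for t in tokens]


# Hypothetical sizes; the abstract does not state the real codebook sizes.
NUM_MOTION_CODES = 8
NUM_RELATION_CODES = 4

vocab = Vocabulary()
vocab.add(["<bos>", "<eos>", "<pad>"])  # stand-in for the pre-existing text vocabulary
base_size = len(vocab.token_to_id)

# Extend the vocabulary with motion, relation, and role tokens (names are assumptions).
motion_tokens = [f"<motion_{i}>" for i in range(NUM_MOTION_CODES)]
relation_tokens = [f"<rel_{i}>" for i in range(NUM_RELATION_CODES)]
role_tokens = ["<leader>", "<follower>"]
num_added = vocab.add(motion_tokens + relation_tokens + role_tokens)


def build_turn(leader_codes, relation_codes, follower_codes):
    """Lay out one exchange: leader motion, pairwise relation, then follower motion."""
    tokens = ["<bos>", "<leader>"]
    tokens += [f"<motion_{c}>" for c in leader_codes]
    tokens += [f"<rel_{c}>" for c in relation_codes]
    tokens.append("<follower>")
    tokens += [f"<motion_{c}>" for c in follower_codes]
    tokens.append("<eos>")
    return tokens


turn = build_turn(leader_codes=[1, 2], relation_codes=[0], follower_codes=[3])
turn_ids = vocab.encode(turn)
```

In a real model, the embedding matrix would also need to grow by `num_added` rows, and a language model would predict the tokens after `<follower>` rather than receive them as input.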

Metadata

arXiv ID: 2605.29219
Category: cs.CV
