AIDB Daily Papers
From Gaze to Guidance: Interpreting and Adapting to Users' Cognitive Needs with a Multimodal Gaze-Aware AI Assistant
Note: the title and key points below are auto-generated by AI. Please consult the original paper for accurate details.
Key Points
- Presents a gaze-aware AI assistant that identifies users' points of difficulty and provides follow-up support.
- Novel in inferring cognitive needs from gaze data, enabling more accurate and personalized assistance.
- The gaze-aware assistant improved users' recall of information and made interactions more efficient.
Abstract
Current LLM assistants are powerful at answering questions, but they have limited access to the behavioral context that reveals when and where a user is struggling. We present a gaze-grounded multimodal LLM assistant that uses egocentric video with gaze overlays to identify likely points of difficulty and target follow-up retrospective assistance. We instantiate this vision in a controlled study (n=36) comparing the gaze-aware AI assistant to a text-only LLM assistant. Compared to a conventional LLM assistant, the gaze-aware assistant was rated as significantly more accurate and personalized in its assessments of users' reading behavior and significantly improved people's ability to recall information. Users spoke significantly fewer words with the gaze-aware assistant, indicating more efficient interactions. Qualitative results underscored both perceived benefits in comprehension and challenges when interpretations of gaze behaviors were inaccurate. Our findings suggest that gaze-aware LLM assistants can reason about cognitive needs to improve cognitive outcomes of users.