AIDB Daily Papers

Self-Evolving Visual Questioner: AI generates its own questions and learns from them without external supervision

Original title: Self-Evolving Visual Questioner

Authors: Yijun Liang, Hengguang Zhou, Ming Li, Lichen Li, Cho-Jui Hsieh, Tianyi Zhou

Published: 2026-06-11 | Fields: LLM AI cs.CV cs.LG AI assistance AI evaluation

Note: The Japanese title and key points were generated automatically by AI. Please check the original paper for the exact content.

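Key points

Proposes a framework in which a vision-language model (VLM) generates its own questions and keeps learning from them, with no external supervision.
Existing visual questioners depend on high-quality training data; the novelty of this work is that it reduces data collection cost while generating more difficult questions.
Experiments show that the proposed method substantially improves question quality and difficulty, is more effective than static data under the same training budget, and also performs well as an answerer.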

Abstract

Vision-language models (VLMs) are typically trained as passive answerers, while their ability to actively ask diverse, non-trivial, visual-centric and grounded questions remains underexplored. Existing visual questioners' performance is bottlenecked by the availability of high-quality training data or the cost of curating them. We show that a VLM can continuously improve itself as a visual questioner without any external supervision. We propose a self-evolving framework that uses a VLM itself as both a proposer and a filter to produce harder, more informative, and visual-centric questions, while maintaining their exploration diversity to avoid training collapse. These questions are then used to train the VLM in both questioner and answerer modes. To evaluate the questioner, we introduce an agentic protocol that assesses questions along perception, reasoning, and diversity dimensions. Experiments across various backbone VLMs show that our method substantially enhances the quality and substantially expands the difficulty boundary of autonomous question generation. Under the same budget, our self-supervision is more effective than training on the static source data. Moreover, the self-evolving questioner remains a competitive or even better answerer.
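
The propose, filter, and train loop described in the abstract can be pictured with a small sketch. This is not the authors' implementation: the abstract does not specify how difficulty is scored or how diversity is enforced, so the names `Candidate`, `token_overlap`, `select_questions`, and `self_evolve_round`, the thresholds, and the word-overlap diversity check are all hypothetical stand-ins chosen for illustration. The `propose`, `score`, and `train` callables stand in for calls to the same VLM acting as proposer, filter, and learner.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    image_id: str
    question: str
    difficulty: float  # hypothetical filter score in [0, 1]; higher means harder


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap between the lowercase word sets of two questions."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def select_questions(
    candidates: List[Candidate],
    min_difficulty: float = 0.5,  # assumed threshold, not from the paper
    max_overlap: float = 0.6,     # assumed threshold, not from the paper
) -> List[Candidate]:
    """Keep hard questions, skipping near-duplicates of ones already kept."""
    kept: List[Candidate] = []
    # Visit harder candidates first so they win over similar, easier ones.
    for cand in sorted(candidates, key=lambda c: c.difficulty, reverse=True):
        if cand.difficulty < min_difficulty:
            continue  # too easy to be informative
        if any(token_overlap(cand.question, k.question) > max_overlap for k in kept):
            continue  # too similar to a kept question; preserves diversity
        kept.append(cand)
    return kept


def self_evolve_round(
    images: List[str],
    propose: Callable[[str], List[str]],   # model as proposer (stand-in)
    score: Callable[[str, str], float],    # model as filter (stand-in)
    train: Callable[[List[Candidate]], None],  # update step (stand-in)
) -> List[Candidate]:
    """One round: propose questions, filter them, train on the survivors."""
    candidates = [
        Candidate(img, q, score(img, q)) for img in images for q in propose(img)
    ]
    selected = select_questions(candidates)
    train(selected)
    return selected
```

Calling `self_evolve_round` repeatedly with the updated model behind `propose` and `score` would give the iterative self-evolution the abstract describes; the word-overlap check here is only a crude placeholder for whatever diversity mechanism the paper actually uses.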

Metadata

arXiv ID: 2606.13929
Categories: cs.CV, cs.LG