AIDB Daily Papers

AgentDS: Measuring the Future of Human-AI Collaboration in Domain-Specific Data Science

Original title: AgentDS Technical Report: Benchmarking the Future of Human-AI Collaboration in Domain-Specific Data Science

Authors: An Luo, Jin Du, Xun Xian, Robert Specht, Fangqiao Tian, Ganghua Wang, Xuan Bi, Charles Fleming, Ashish Kundu, Jayanth Srinivasa, Mingyi Hong, Rui Zhang, Tianxi Li, Galin Jones, Jie Ding

Published: 2026-03-19 | Fields: medical AI, financial AI, datasets, benchmarks, humans, machine learning, AI evaluation, language, analysis, food, collaboration

Note: The translated title and key points were generated automatically by AI. Please consult the original paper for the exact content.

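Key points

The paper introduces AgentDS, a benchmark that evaluates the performance of AI agents and of human-AI collaboration in domain-specific data science.
Despite advances in large language models, AI agents fall short of human experts on reasoning that requires domain knowledge.
In the AgentDS competition, AI-only baselines scored near or below the median of participants, while the strongest solutions came from human-AI collaboration.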

Abstract

Data science plays a critical role in transforming complex data into actionable insights across numerous domains. Recent developments in large language models (LLMs) and artificial intelligence (AI) agents have significantly automated data science workflow. However, it remains unclear to what extent AI agents can match the performance of human experts on domain-specific data science tasks, and in which aspects human expertise continues to provide advantages. We introduce AgentDS, a benchmark and competition designed to evaluate both AI agents and human-AI collaboration performance in domain-specific data science. AgentDS consists of 17 challenges across six industries: commerce, food production, healthcare, insurance, manufacturing, and retail banking. We conducted an open competition involving 29 teams and 80 participants, enabling systematic comparison between human-AI collaborative approaches and AI-only baselines. Our results show that current AI agents struggle with domain-specific reasoning. AI-only baselines perform near or below the median of competition participants, while the strongest solutions arise from human-AI collaboration. These findings challenge the narrative of complete automation by AI and underscore the enduring importance of human expertise in data science, while illuminating directions for the next generation of AI. Visit the AgentDS website here: https://agentds.org/ and open source datasets here: https://huggingface.co/datasets/lainmn/AgentDS .

Metadata

arXiv ID: 2603.19005
Categories: cs.LG, cs.AI, stat.ME
