AIDB Daily Papers
Can AI Scientist Agents Learn from Feedback in the Experimental Loop? Evidence from Iterative Perturbation Discovery
Note: the title and key points were generated automatically by AI. Please consult the original paper for accurate details.
Key Points
- Evaluated whether LLM agents can update their hypotheses based on experimental feedback in Cell Painting high-content screening.
- Access to experimental feedback improved discovery rates by 53.4% on average over relying on pretraining knowledge alone, suggesting genuine feedback-driven learning.
- Upgrading model capability (Claude Sonnet 4.5 to 4.6) substantially reduced gene hallucination rates, making the in-context learning effect pronounced.
Abstract
Recent work has questioned whether large language models (LLMs) can perform genuine in-context learning (ICL) for scientific experimental design, with prior studies suggesting that LLM-based agents exhibit no sensitivity to experimental feedback. We shed new light on this question by carrying out 800 independently replicated experiments on iterative perturbation discovery in Cell Painting high-content screening. We compare an LLM agent that iteratively updates its hypotheses using experimental feedback to a zero-shot baseline that relies solely on pretraining knowledge retrieval. Access to feedback yields a $+53.4\%$ increase in discoveries per feature on average ($p = 0.003$). To test whether this improvement arises from genuine feedback-driven learning rather than prompt-induced recall of pretraining knowledge, we introduce a random feedback control in which hit/miss labels are permuted. Under this control, the performance gain disappears, indicating that the observed improvement depends on the structure of the feedback signal ($+13.0$ hits, $p = 0.003$). We further examine how model capability affects feedback utilization. Upgrading from Claude Sonnet 4.5 to 4.6 reduces gene hallucination rates from $\sim 33\%$--$45\%$ to $\sim 3\%$--$9\%$, converting a non-significant ICL effect ($+0.8$, $p = 0.32$) into a large and highly significant improvement ($+11.0$, $p = 0.003$) for the best ICL strategy. These results suggest that effective in-context learning from experimental feedback emerges only once models reach a sufficient capability threshold.
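The random feedback control described in the abstract can be illustrated with a minimal sketch: permute the hit/miss labels across perturbations so the marginal hit rate is preserved but the association between each gene and its outcome is destroyed. The gene names below are hypothetical placeholders, not taken from the paper.

```python
import random

def permuted_feedback(feedback):
    """Random-feedback control: shuffle hit/miss labels across trials.

    The returned list keeps the same genes and the same total number of
    hits, but the pairing of gene -> outcome is randomized, so any agent
    improvement that survives this control cannot come from the structure
    of the feedback signal.
    """
    labels = [hit for _, hit in feedback]
    random.shuffle(labels)
    return [(gene, hit) for (gene, _), hit in zip(feedback, labels)]

# Hypothetical example: True = hit, False = miss
feedback = [("TP53", True), ("BRCA1", False), ("MYC", True), ("EGFR", False)]
control = permuted_feedback(feedback)
```

Under this control the agent still sees realistic-looking feedback, which is why a vanished performance gain isolates genuine feedback-driven learning from prompt-induced recall.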