The line between "AI writes about science" and "AI does science" is getting blurry fast. A new class of LLM agents goes beyond summarizing papers or writing code: these agents design experiments, control lab equipment, and iterate on results with minimal human intervention.
AI-Mandel: The AI Physicist
Built by Sören Arlt, Xuemei Gu, and Mario Krenn[1], AI-Mandel is an LLM agent that generates and implements ideas in quantum physics. Here's the workflow:
- Reads literature — Formulates novel research ideas from existing papers
- Uses PyTheus — A graph-based quantum experiment design framework that translates concepts into concrete lab setups
- Generates Python code — Produces experiment-ready programs
The output holds up, too: two of its generated ideas have already led to independent follow-up papers by human researchers.
Specific ideas it came up with include new variations of quantum teleportation, quantum network primitives using indefinite causal orders, and novel geometric phases based on closed loops of quantum information transfer.
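To give a feel for what "graph-based" means here: in the graph picture of quantum optics experiments that PyTheus builds on, vertices stand for photon paths, edges stand for photon-pair sources, and each perfect matching of the graph corresponds to one way all detectors can click at once. The sketch below is plain Python and does not use the PyTheus API. The function name and the four-vertex example graph are made up for illustration, and edge weights and colors (which carry amplitudes and modes) are left out.

```python
from typing import FrozenSet, List, Tuple

Edge = Tuple[int, int]


def perfect_matchings(vertices: List[int], edges: List[Edge]) -> List[FrozenSet[Edge]]:
    """Enumerate every perfect matching of a small undirected graph."""
    if not vertices:
        # Nothing left to pair up: exactly one (empty) matching.
        return [frozenset()]
    first, rest = vertices[0], vertices[1:]
    found = []
    for a, b in edges:
        if first not in (a, b):
            continue
        partner = b if a == first else a
        if partner not in rest:
            continue
        # Pair `first` with `partner`, then match the remaining vertices.
        remaining = [v for v in rest if v != partner]
        for sub in perfect_matchings(remaining, edges):
            found.append(frozenset({(a, b)}) | sub)
    return found


# Hypothetical setup: four photon paths connected in a square by four pair sources.
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
matchings = perfect_matchings([0, 1, 2, 3], square)
for matching in matchings:
    print(sorted(matching))
```

The square has two perfect matchings, so a setup like this would contribute two terms to the four-photon state. An agent that proposes a new experiment is, in this picture, proposing a graph.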
Coscientist: The Chemistry Robot
From Carnegie Mellon University[2], Coscientist (powered by GPT-4) operates in actual chemistry labs. It has:
- Executed palladium-catalyzed cross-coupling reactions (Suzuki–Miyaura and Sonogashira)
- Designed experiments faster and more accurately than humans working alone
- A modular architecture with a central "planner" that commands GOOGLE, PYTHON, DOCUMENTATION, and EXPERIMENT modules
It bridges LLM reasoning with physical lab equipment — not just simulation.
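The planner-and-modules layout can be sketched as a simple dispatch table. Only the four module names (GOOGLE, PYTHON, DOCUMENTATION, EXPERIMENT) come from the description above. The handler functions, the `run_plan` helper, and the example plan are hypothetical stand-ins, and no LLM or lab hardware is involved.

```python
from typing import Callable, Dict, List, Tuple


# Stub handlers: each stands in for a real module and simply echoes its input.
def run_google(query: str) -> str:
    return f"[search results for: {query}]"


def run_python(code: str) -> str:
    return f"[output of running: {code}]"


def run_documentation(topic: str) -> str:
    return f"[docs excerpt about: {topic}]"


def run_experiment(protocol: str) -> str:
    return f"[hardware ran: {protocol}]"


MODULES: Dict[str, Callable[[str], str]] = {
    "GOOGLE": run_google,
    "PYTHON": run_python,
    "DOCUMENTATION": run_documentation,
    "EXPERIMENT": run_experiment,
}


def run_plan(plan: List[Tuple[str, str]]) -> List[str]:
    """Dispatch each (module, argument) step and collect the observations."""
    observations = []
    for module, argument in plan:
        handler = MODULES.get(module)
        if handler is None:
            raise ValueError(f"unknown module: {module}")
        observations.append(handler(argument))
    return observations


# In a real agent the planner LLM would emit these commands one at a time,
# reading each observation before choosing the next step.
plan = [
    ("GOOGLE", "Suzuki-Miyaura coupling conditions"),
    ("PYTHON", "compute reagent volumes"),
    ("DOCUMENTATION", "liquid handler API"),
    ("EXPERIMENT", "dispense reagents and heat"),
]
log = run_plan(plan)
for line in log:
    print(line)
```

Keeping the modules behind one dispatch point means the planner only ever sees text in and text out, whether the step was a web search or a physical run.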
The AI Scientist: Full Pipeline Automation
Sakana AI's The AI Scientist[3] (August 2024) aims for the full research lifecycle:
- Brainstorms novel research ideas
- Writes and executes code
- Runs experiments
- Generates complete scientific manuscripts
- Performs automated peer review on its own output
The authors report a compute cost of roughly $15 per generated paper. A paper produced by the follow-up system, The AI Scientist-v2, later passed peer review at an ICLR 2025 workshop.
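Strung together, the stages listed above form a loop that runs until the budget is gone. The sketch below is a toy version of that loop: every stage is a stub, the review score is a placeholder rule instead of an LLM reviewer, and the only number taken from the text is the rough $15-per-paper cost. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import List

COST_PER_PAPER_USD = 15.0  # approximate per-paper cost quoted in the text


@dataclass
class Paper:
    idea: str
    code_output: str
    manuscript: str
    review_score: float


def brainstorm(index: int) -> str:
    return f"idea-{index}"


def write_and_run_code(idea: str) -> str:
    return f"results for {idea}"


def write_manuscript(idea: str, output: str) -> str:
    return f"Title: {idea}\nFindings: {output}"


def review(manuscript: str, index: int) -> float:
    # Placeholder scoring rule that cycles through 5, 6, 7.
    return 5.0 + (index % 3)


def run_pipeline(budget_usd: float) -> List[Paper]:
    """Draft as many papers as the budget allows, reviewing each one."""
    papers = []
    index = 0
    while (index + 1) * COST_PER_PAPER_USD <= budget_usd:
        idea = brainstorm(index)
        output = write_and_run_code(idea)
        manuscript = write_manuscript(idea, output)
        papers.append(Paper(idea, output, manuscript, review(manuscript, index)))
        index += 1
    return papers


papers = run_pipeline(budget_usd=100.0)
accepted = [p for p in papers if p.review_score >= 6.0]
print(len(papers), "drafted,", len(accepted), "above the review threshold")
```

With a $100 budget this drafts six papers, and the stub reviewer passes four of them. The point of the review step is that drafting is cheap, so filtering becomes the part that matters.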
Agent Laboratory: The Research Team
Agent Laboratory[4] uses multiple specialized LLM agents working together:
- PhD agents — Literature review via arXiv
- MLE-solver — Technical implementation
- Paper-solver agents — Report writing
There's also AgentRxiv[5] (March 2025), where autonomous agents upload, retrieve, and build on each other's research — cumulative scientific discovery without human intermediaries.
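The upload, retrieve, and build-on cycle can be pictured as a shared store that every agent lab reads from and writes to. This is a minimal sketch under stated assumptions, not AgentRxiv's actual interface: the `PreprintServer` class, its methods, and the keyword-overlap retrieval are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Preprint:
    paper_id: int
    author: str
    title: str
    builds_on: List[int] = field(default_factory=list)


class PreprintServer:
    """Toy shared store that agents can upload to and search."""

    def __init__(self) -> None:
        self._papers: List[Preprint] = []

    def upload(self, author: str, title: str,
               builds_on: Optional[List[int]] = None) -> Preprint:
        paper = Preprint(len(self._papers), author, title, list(builds_on or []))
        self._papers.append(paper)
        return paper

    def retrieve(self, query: str) -> List[Preprint]:
        # Naive retrieval: any shared word between query and title counts as a hit.
        words = set(query.lower().split())
        return [p for p in self._papers if words & set(p.title.lower().split())]


server = PreprintServer()
first = server.upload("lab-A", "Prompting tricks for math reasoning")

# A second agent searches prior work before starting, then cites what it found.
hits = server.retrieve("math reasoning")
second = server.upload(
    "lab-B",
    "Better math reasoning via self-checks",
    builds_on=[p.paper_id for p in hits],
)
print(second.builds_on)
```

The `builds_on` list is what makes the process cumulative: each new upload records which earlier results it started from.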
The Three Levels of Autonomy
Researchers classify LLM scientific agents into three tiers:
- Level 1 (Tool LLMs): Assist with single tasks, such as summarizing a paper or writing a code snippet. A human directs everything.
- Level 2 (Analyst LLMs): Chain subtasks, perform statistical analysis, and synthesize multiple documents, with reduced human intervention.
- Level 3 (Scientist LLMs): Handle hypothesis generation, experimental design, execution, and paper drafting end to end, with minimal human oversight.
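One way to make the three tiers concrete is to treat them as an ordered enum and sort a system into a tier by what it can do. The capability names and the classification rule below are assumptions made for illustration and are not part of any published taxonomy.

```python
from enum import IntEnum
from typing import Set


class Autonomy(IntEnum):
    TOOL = 1
    ANALYST = 2
    SCIENTIST = 3


# Hypothetical capability labels, loosely following the tier descriptions.
SCIENTIST_SKILLS = {
    "hypothesis_generation",
    "experimental_design",
    "execution",
    "paper_drafting",
}
ANALYST_SKILLS = {
    "chain_subtasks",
    "statistical_analysis",
    "multi_document_synthesis",
}


def classify(skills: Set[str]) -> Autonomy:
    """Assign a tier using a simple, assumed rule."""
    if SCIENTIST_SKILLS <= skills:
        # Covers the whole research loop.
        return Autonomy.SCIENTIST
    if skills & ANALYST_SKILLS:
        # Can do at least one multi-step analysis task.
        return Autonomy.ANALYST
    return Autonomy.TOOL


print(classify({"summarize"}).name)
print(classify({"summarize", "statistical_analysis"}).name)
print(classify(SCIENTIST_SKILLS | {"chain_subtasks"}).name)
```

Because `IntEnum` members are ordered, comparisons such as `classify(skills) >= Autonomy.ANALYST` work directly, which is handy when deciding how much human review a system's output should get.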
Why This Matters
We're not at "AI replaces scientists" yet. But we're at "AI is a genuine co-investigator" — one that works 24/7, reads everything, never forgets a paper, and can iterate through a thousand failed ideas before breakfast. The bottleneck is shifting from "who has the idea" to "who validates it."
References
- [1] Arlt, S., Gu, X., & Krenn, M. (2025). AI-Mandel: Towards autonomous quantum physics research using LLM agents. [arXiv:2511.11752]
- [2] Boiko, D. A., et al. (2023). Autonomous chemical research with large language models. Nature 624, 570–578. [DOI]
- [3] Lu, C., et al. (2024). The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery. [arXiv:2408.06292]
- [4] Schmidgall, S., et al. (2025). Agent Laboratory: Using LLM Agents as Research Assistants. [arXiv:2501.04227]
- [5] Schmidgall, S., & Moor, M. (2025). AgentRxiv: Towards Collaborative Autonomous Research. [arXiv:2503.18102]
Further Reading
- PyTheus — Graph-based quantum experiment design
- The AI Scientist — Sakana AI
- Agent Laboratory — GitHub
- Coscientist — Carnegie Mellon