Build Your Own autoresearch — Applying Autonomous Experimentation to Any Domain
Apply the autoresearch pattern to text classification, image classification, and RAG pipelines. Includes a universal experiment runner and program.md template.

Karpathy's autoresearch is an autonomous experimentation system built for LLM pretraining. In Part 1 we covered the overall architecture, and in Part 2 we dug into the agent's experimentation strategy and result analysis. If you've read this far, one question is probably on your mind:
"Can I use this for my own problem?"
In this post, we extract the core patterns from autoresearch and apply them to three domains: text classification, image classification, and RAG pipelines. At the end, we provide a general-purpose experiment runner and a program.md template you can adapt immediately.
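Before getting into the individual domains, it may help to see the rough shape of the loop in code. The sketch below is a minimal, domain-agnostic version, assuming the pattern boils down to three steps: propose a change, run one experiment, and keep the change only if the metric improves. The names `run_loop`, `propose`, and `evaluate` are hypothetical and are not taken from the autoresearch codebase. A lower score is assumed to be better, as with a validation loss.

```python
import random
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]
Record = Tuple[Config, float, bool]  # (candidate config, score, kept?)


def run_loop(
    baseline: Config,
    propose: Callable[[Config, random.Random], Config],
    evaluate: Callable[[Config], float],
    n_experiments: int,
    seed: int = 0,
) -> Tuple[Config, float, List[Record]]:
    """Greedy keep-or-revert loop. Hypothetical sketch; lower score is better."""
    rng = random.Random(seed)
    best_config = dict(baseline)
    best_score = evaluate(best_config)  # measure the baseline first
    history: List[Record] = []
    for _ in range(n_experiments):
        candidate = propose(best_config, rng)  # one change on top of the current best
        score = evaluate(candidate)            # one experiment, e.g. a short training run
        kept = score < best_score
        if kept:
            best_config, best_score = candidate, score
        # If not kept, the candidate is simply dropped, which acts as the "revert".
        history.append((candidate, score, kept))
    return best_config, best_score, history


def propose_lr(config: Config, rng: random.Random) -> Config:
    """Toy proposal (assumption): halve or double the learning rate."""
    new = dict(config)
    new["lr"] = config["lr"] * rng.choice([0.5, 2.0])
    return new


def toy_evaluate(config: Config) -> float:
    """Stand-in for a real experiment: pretends the ideal learning rate is 0.01."""
    return abs(config["lr"] - 0.01)


if __name__ == "__main__":
    best, score, history = run_loop({"lr": 0.08}, propose_lr, toy_evaluate, 20)
    print("best config:", best)
    print("best score:", score)
    print("experiments kept:", sum(kept for _, _, kept in history))
```

In a real setup, `evaluate` would launch a training or evaluation job under a fixed budget, and `propose` would be the agent editing code or configuration. The loop itself stays the same across domains; only those two functions change.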
Series: Part 1: Architecture | Part 2: Experiment Strategy | Part 3 (this post)