Build Your Own autoresearch — Applying Autonomous Experimentation to Any Domain
Apply the autoresearch pattern to text classification, image classification, and RAG pipelines. Includes a universal experiment runner and program.md template.

Karpathy's autoresearch is an autonomous experimentation system built for LLM pretraining. In Part 1 we covered the overall architecture, and in Part 2 we dug into the agent's experimentation strategy and result analysis. If you've read this far, one question is probably on your mind:
"Can I use this for my own problem?"
In this post, we extract the core patterns from autoresearch and apply them to three domains: text classification, image classification, and RAG pipelines. At the end, we provide a general-purpose experiment runner and a program.md template you can adapt immediately.
Series: Part 1: Architecture | Part 2: Experiment Strategy | Part 3 (this post)
Extracting the Core Pattern from autoresearch
The structure running through all of autoresearch is surprisingly simple. Three files, a five-step loop, and a handful of design principles. Extract these, and you can apply the pattern to any ML task.
The 3-File Architecture
Here's autoresearch's file structure broken down by role:
| File | Role | Modified by |
|---|---|---|
| prepare.py | Fixed infrastructure (data, evaluation, utilities) | Human (once) |
| train.py | Experimentation target (model, hyperparameters, training loop) | Agent (every experiment) |
| program.md | Agent protocol (experiment rules, evaluation criteria) | Human (meta-optimization) |
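To make the role split in the table concrete, here is a minimal single-file sketch on a toy regression problem. It is not code from autoresearch: the function names (`load_data`, `evaluate`, `train`, `run_experiments`), the toy dataset, and the keep-the-best rule are all assumptions chosen for illustration. The point is only that the metric stays fixed while the training code and its configuration are the part that changes between experiments.

```python
from typing import Callable, Dict, List, Tuple

Data = List[Tuple[float, float]]

# --- Role of prepare.py: fixed infrastructure, written once by a human ---
def load_data() -> Data:
    """Toy dataset (hypothetical): points on the line y = 3x."""
    return [(float(x), 3.0 * x) for x in range(1, 6)]

def evaluate(predict: Callable[[float], float], data: Data) -> float:
    """Fixed metric: mean squared error, lower is better."""
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)

# --- Role of train.py: the only part an agent would edit per experiment ---
def train(config: Dict[str, float], data: Data) -> Callable[[float], float]:
    """Fit y = w * x with plain gradient descent on the MSE."""
    w = 0.0
    for _ in range(int(config["steps"])):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= config["lr"] * grad
    return lambda x: w * x

# --- Role of program.md: the protocol, reduced here to one rule ---
# Assumed rule: try each candidate, keep the one with the best fixed metric.
def run_experiments(candidates: List[Dict[str, float]]):
    data = load_data()
    best_config, best_score = None, float("inf")
    for config in candidates:
        score = evaluate(train(config, data), data)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score

if __name__ == "__main__":
    candidates = [
        {"lr": 0.001, "steps": 50},  # too small: underfits in 50 steps
        {"lr": 0.01, "steps": 50},   # converges
        {"lr": 0.1, "steps": 50},    # too large: diverges
    ]
    best_config, best_score = run_experiments(candidates)
    print(best_config, best_score)
```

Because `evaluate` never changes, scores from different experiments stay comparable, which is what lets an agent rewrite the training code freely without a human re-validating the metric each time.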