Running autoresearch Hands-On — Overnight Experiments on a Single GPU
From environment setup to agent execution and overnight results analysis, plus tuning tips for smaller GPUs.

In Part 1, we looked at how Karpathy's autoresearch is structured. Here's the three-line summary:
- A single `train.py` contains the GPT model + optimizer + training loop.
- An AI agent (Claude Code, etc.) modifies this file, trains for 5 minutes, and keeps the change if val_bpb improves; otherwise it discards the change.
- `program.md` defines the agent's behavior rules. Humans only edit this markdown file.
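The accept/reject loop above can be sketched in a few lines. This is a minimal illustration, not the actual autoresearch implementation: `propose_edit` and `run_training` are hypothetical stand-ins for the agent's edit to `train.py` and the 5-minute training run.

```python
import random

def propose_edit(source: str) -> str:
    """Hypothetical stand-in: return a modified copy of train.py's source."""
    return source + f"\n# tweak {random.random():.3f}"

def run_training(source: str) -> float:
    """Hypothetical stand-in: train briefly and return val_bpb (lower is better)."""
    return random.uniform(0.9, 1.1)

def experiment_loop(source: str, steps: int = 5) -> tuple[str, float]:
    """Keep a candidate edit only if it improves val_bpb; otherwise discard it."""
    best_bpb = run_training(source)  # baseline val_bpb for the current train.py
    for _ in range(steps):
        candidate = propose_edit(source)
        bpb = run_training(candidate)
        if bpb < best_bpb:  # improvement: keep the change
            source, best_bpb = candidate, bpb
        # otherwise the candidate is simply dropped
    return source, best_bpb
```

The key design point is that the file itself is the state: there is no branching or merge logic, just "replace if better."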