
Why Karpathy's CLAUDE.md Got 48K Stars — And How to Write Your Own

One markdown file raised AI coding accuracy from 65% to 94%. An analysis of Karpathy's 4 rules, and a practical guide to writing your own.



What 48,000 Stars Really Mean

In January 2026, Andrej Karpathy tweeted:

"The problem with LLM coding agents is no longer syntax errors. It's judgment."

As Karpathy's workflow shifted from "80% manual coding / 20% agent" to "80% agent / 20% corrections," he observed recurring failure patterns in LLMs. He distilled these observations into a single CLAUDE.md file, which became the andrej-karpathy-skills repository.

The result: over 48,900 GitHub stars, 7,900 of them in the first 24 hours. Why did a single markdown file generate this response?

4 Failure Patterns When LLMs Code

The patterns Karpathy observed are ones "anyone who's used Claude Code has experienced":

1. Silent Assumptions

Tell an AI "refactor this function," and it charges ahead with its own interpretation without asking questions. It won't ask whether your database schema is PostgreSQL or MySQL—it just picks what it thinks is right and proceeds.

❌ What AI does: "I'll refactor this using the repository pattern..."
✅ What AI should do: "I see two possible approaches. Which do you prefer?"

2. Overengineering

Ask for "a login form" and you get OAuth 2.0 + JWT + refresh tokens + 2FA + rate limiting. What should be 100 lines becomes 1,000 lines.

❌ Request: "add validation"
   Result: 500 lines of custom validation framework

✅ Expected: 10 lines of if statements for 3 fields
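To make "simple" concrete, here is a sketch of what that expected output could look like for a hypothetical three-field signup form (the field names and limits are invented for illustration, not from Karpathy's file):

```typescript
// Hypothetical three-field form: plain if statements, no framework.
interface SignupForm {
  email: string;
  password: string;
  age: number;
}

function validate(form: SignupForm): string[] {
  const errors: string[] = [];
  if (!form.email.includes("@")) errors.push("email: must contain @");
  if (form.password.length < 8) errors.push("password: min 8 characters");
  if (form.age < 13 || form.age > 120) errors.push("age: must be 13-120");
  return errors;
}
```

No validation library, no schema abstraction, no rules for fields that don't exist yet. That is the whole point of the rule.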

3. Scope Creep

You ask it to fix one bug, and it renames variables in adjacent code, rewrites comments, and reformats files. The diff shows 15 changed files when the actual fix was 3 lines.

❌ git diff: 200 lines changed, 3 meaningful changes
✅ git diff: exactly 3 lines changed

4. Lack of Judgment

It's syntactically correct, but there's no judgment about "is this actually good code?" Code that a senior developer would immediately flag as too complex, the AI submits with confidence.

Karpathy's 4 Rules

Four rules address these failure patterns—that's the entire CLAUDE.md:

Rule 1: Think Before Coding

Don't assume. Don't hide confusion. Expose tradeoffs.

Before implementing:

  • Explicitly state your assumptions
  • If multiple interpretations exist, present options—don't quietly pick one
  • If there's a simpler approach, say so
  • If something's unclear, stop and ask

Rule 2: Simplicity First

Minimum code to solve the problem. Nothing speculative.
  • No features that weren't requested
  • No abstractions for single-use code
  • No unrequested "flexibility" or "configurability"
  • No error handling for impossible scenarios
  • If 200 lines can become 50 lines, rewrite it

Ask yourself: "Would a senior engineer look at this and say 'this is too complex'?" If yes, simplify.

Rule 3: Surgical Changes

Touch only what's necessary. Clean up only your own mess.

When editing existing code:

  • Don't "improve" adjacent code, comments, or formatting
  • Don't refactor what isn't broken
  • Follow existing style even if you'd prefer otherwise
  • If you find unrelated dead code, mention it instead of deleting

Clean up only what your changes made unused:

  • Remove unused imports/variables/functions created by your changes
  • Don't remove pre-existing dead code unless requested

Test: Every changed line should trace directly back to the user's request.

Rule 4: Goal-Driven Execution

Define success criteria. Iterate until verified.

Transform tasks into verifiable goals:

  • "add validation" → "write tests for invalid inputs and make them pass"
  • "fix bug" → "write a test that reproduces the bug, then make it pass"
  • "refactor X" → "verify tests pass before and after refactoring"

For multi-step tasks, present a simple plan:

1. [step] → verify: [how to confirm]
2. [step] → verify: [how to confirm]
3. [step] → verify: [how to confirm]

Strong success criteria enable the agent to iterate independently. Weak criteria ("make it work") require constant checking.

How to Write Your Own CLAUDE.md — A Practical Guide

You could use karpathy-skills as-is, but adding project-specific rules makes it far more effective. Here are battle-tested patterns.

Basic Structure

```markdown
# Project Rules

## Coding Principles
(karpathy-skills' 4 rules)

## Project Rules
(rules specific to this project)

## Never Do
(things you absolutely must not do)
```

Tips for Writing Project Rules

1. Turn repeated corrections into rules

If you've given the AI the same feedback 3+ times, it should become a rule.

```markdown
## Project Rules
- Korean content must always use 합니다/입니다 formal style
- Slug convention: Korean posts no suffix, English posts get -en suffix
- Only commit when user explicitly requests it
```
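Rules like the slug convention are even more effective when they can be checked mechanically. Here is a hypothetical helper sketching that check (the function name and locale codes are invented for illustration):

```typescript
// Hypothetical check for the slug rule above:
// Korean posts get no suffix; English posts get a "-en" suffix.
function slugMatchesLocale(slug: string, locale: "ko" | "en"): boolean {
  const hasEnglishSuffix = slug.endsWith("-en");
  return locale === "en" ? hasEnglishSuffix : !hasEnglishSuffix;
}
```

A rule the agent can verify with one function call is a rule it will violate far less often than one stated only in prose.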

2. "Don't do" rules work better than "do" rules

LLMs have a strong tendency to add new things. Telling them what not to do is more effective.

```markdown
## Never Do
- Never add docstrings or type annotations to existing code
- Prefer editing existing files over creating new ones
- Never git push unless explicitly requested
- Never add error handling based on speculation
```

3. Include concrete examples

Abstract rules get ignored. Give specific examples.

```markdown
## Style Guide
- Component filenames: PascalCase (e.g., UserProfile.tsx)
- Utility functions: camelCase (e.g., formatDate.ts)
- API routes: kebab-case (e.g., /api/user-profile)

### Wrong:
    // ❌ Don't do this
    export default function userProfile() { ... }

### Right:
    // ✅ Do this
    export default function UserProfile() { ... }
```

4. Keep it concise

CLAUDE.md is loaded in full at the start of every session, so anything beyond roughly 200 lines is wasted tokens. Keep only the essentials.

| Section | Recommended Weight |
| --- | --- |
| Coding Principles | 30% (karpathy rules or variants) |
| Project Rules | 40% (project-specific rules) |
| Never Do | 20% (prohibitions) |
| Examples | 10% (1-2 key examples only) |
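The 200-line budget is easy to blow past silently, so one option is a tiny check you can run in CI. This is a sketch under the article's own assumptions (the ~200-line threshold comes from this post, not from any official limit):

```typescript
// Sketch: keep CLAUDE.md under the ~200-line budget suggested above.
function countNonEmptyLines(text: string): number {
  return text.split("\n").filter((line) => line.trim() !== "").length;
}

const LIMIT = 200; // the article's practical ceiling, not an official cap

function withinBudget(claudeMdContents: string): boolean {
  return countNonEmptyLines(claudeMdContents) <= LIMIT;
}
```

To wire it into CI, read the file with `fs.readFileSync("CLAUDE.md", "utf8")` and fail the build when withinBudget returns false.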

Real-World CLAUDE.md Example

```markdown
# CLAUDE.md

## Coding Principles
- Think before coding. State assumptions explicitly. If uncertain, ask.
- Simplicity first. No features beyond what was asked.
- Surgical changes. Don't "improve" adjacent code.
- Goal-driven. Transform tasks into verifiable goals.

## Project Rules
- This is a Next.js 14 + Sanity CMS blog
- Korean posts: slug without suffix (e.g., `my-post`)
- English posts: slug with `-en` suffix (e.g., `my-post-en`)
- All Korean prose MUST use 합니다/입니다 formal style
- Upload scripts: always use `createOrReplace`, never `create`
- Thumbnails: matplotlib scripts in `scripts/`, output to `public/thumbnails/`

## Never Do
- Never commit without explicit user request
- Never push to remote without explicit user request
- Never add docstrings, type annotations, or comments to unchanged code
- Never create new files when editing existing ones would work
- Never use git --force or destructive operations without asking

## Tech Stack
- Frontend: Next.js 14 (App Router)
- CMS: Sanity v3 (project: 11k7hfqk)
- Auth: NextAuth.js
- Styling: Tailwind CSS
- Deploy: Vercel (auto-deploy from main)
```

Beyond CLAUDE.md

CLAUDE.md is powerful but has limits:

  • Manual—you have to write and maintain it yourself
  • Static—it can't capture the evolving context of an ongoing project
  • 200 lines is the practical ceiling

To go beyond these limits:

  1. claude-mem (59K stars)—plugin that auto-captures and injects cross-session context
  2. Claude Code's built-in memory—automatically learns patterns in ~/.claude/projects/<project>/memory/
  3. Custom knowledge bases—structure rules/decisions/patterns/parking-lot items with Obsidian + slash commands

CLAUDE.md is a starting point. Begin here, then evolve your memory system as your AI collaboration deepens.

Why a Single Markdown File Matters

48,000 stars aren't a response to this file's technical complexity. They're a response to the problem it solves.

AI coding agents are already impressive. They get syntax right, understand structure, and even write tests. But they lack judgment. They don't ask themselves, "is this too complex?" They don't check, "do I really need to touch this?"

Karpathy's CLAUDE.md injects that judgment externally. And that alone enables the leap from 65% to 94%.

If there's a mistake your AI repeatedly makes in your project, add it to CLAUDE.md right now. The ROI of a single markdown file is far larger than you'd think.
