SAE and TensorLens: The Age of Feature Interpretability
Individual neurons are uninterpretable. Sparse Autoencoders extract monosemantic features from model internals, and TensorLens analyzes the entire Transformer as a single unified tensor.

In the previous two posts, we covered:
- Logit/Tuned Lens: reading the model's intermediate predictions
- Activation Patching: tracing which activations are causally responsible for the answer
But here we hit a fundamental problem: individual neurons are uninterpretable.
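To make the Sparse Autoencoder idea concrete, here is a minimal sketch of one common formulation: a ReLU encoder into a wider feature space, a linear decoder back to the activation, and a loss that adds an L1 sparsity penalty to the reconstruction error. The dimensions, the random weights, the `l1_coeff` value, and the function names `sae_forward` and `sae_loss` are all assumptions made for illustration, not details taken from this series or from any particular library.

```python
import random

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Encode an activation vector into features, then reconstruct it."""
    pre = [p + b for p, b in zip(matvec(W_enc, x), b_enc)]
    features = [max(0.0, p) for p in pre]  # ReLU keeps features non-negative
    recon = [r + b for r, b in zip(matvec(W_dec, features), b_dec)]
    return features, recon

def sae_loss(x, features, recon, l1_coeff=1e-3):
    """Mean squared reconstruction error plus an L1 penalty on the features."""
    mse = sum((a - b) ** 2 for a, b in zip(x, recon)) / len(x)
    sparsity = sum(abs(f) for f in features)
    return mse + l1_coeff * sparsity

# Hypothetical sizes: a 4-dimensional activation expanded to 16 features.
rng = random.Random(0)
d_model, d_features = 4, 16
W_enc = [[rng.gauss(0, 0.1) for _ in range(d_model)] for _ in range(d_features)]
W_dec = [[rng.gauss(0, 0.1) for _ in range(d_features)] for _ in range(d_model)]
b_enc = [0.0] * d_features
b_dec = [0.0] * d_model

x = [rng.gauss(0, 1) for _ in range(d_model)]  # stand-in for a real activation
features, recon = sae_forward(x, W_enc, b_enc, W_dec, b_dec)
loss = sae_loss(x, features, recon)
```

In practice the weights would be trained on activations collected from a real model, so that each input activates only a few features; the untrained weights here only demonstrate the shapes and the loss.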