Fine-tuning Gemma 4 MoE — Customizing Arena #6 with 3.8B Active Parameters
Apply QLoRA to Gemma 4 26B MoE. Expert layer LoRA strategies, Dense vs MoE comparison, MoE-specific training tips, and Ollama deployment. LoRA Series Part 4.
Series: Part 1: LoRA Theory | Part 2: QLoRA + Custom Data | Part 3: Eval + Deploy | Part 4 (this post)
Parts 1-3 covered LoRA fundamentals through deployment using Qwen 2.5 7B. Part 4 levels up — we apply LoRA to a Gemma 4 MoE model.
Why Gemma 4? Three reasons:
- MoE architecture: 26B total parameters with only 3.8B active per token. Inference cost is in the 4B class, but performance sits at #6 on the Arena leaderboard (a minimal QLoRA setup sketch follows this list)
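
Before going further, here is a minimal sketch of what the QLoRA setup for an MoE checkpoint can look like. This is an assumption-heavy illustration, not the post's final recipe: the model id `google/gemma-4-26b-moe` is a placeholder, and the `target_modules` list assumes Gemma-style attention projection names. The expert-layer strategies discussed later depend on the module names in the actual released checkpoint.

```python
# Minimal QLoRA setup sketch for a Gemma-style MoE checkpoint.
# NOTE: "google/gemma-4-26b-moe" is a placeholder model id, and the
# module names in target_modules are assumptions -- inspect the real
# model's named_modules() before training.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "google/gemma-4-26b-moe"  # placeholder id

# 4-bit NF4 quantization of the frozen base model (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA adapters on the attention projections only; the expert FFNs stay
# frozen here because their module names differ between MoE
# implementations (expert-layer targeting is covered later in the post).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

Targeting only the attention projections keeps the trainable parameter count small and leaves the router untouched; whether to also adapt the expert FFN layers is exactly the trade-off the expert-layer LoRA section examines.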