Summary
DeepSeek's R1 distillation technique, which trains smaller models on reasoning traces generated by the full R1 model, spawned a wave of open-source reasoning models in early 2026. The wave sharply lowered the cost of chain-of-thought capabilities and challenged the assumption that reasoning requires massive compute.
What Happened
Following DeepSeek R1's January 2025 release, the open-source community rapidly adopted its distillation methodology. By early 2026, dozens of reasoning-capable models had been produced by fine-tuning existing open-weight models on R1's reasoning traces. Models as small as 7B parameters demonstrated meaningful chain-of-thought reasoning, a capability previously requiring 70B+ parameter models or proprietary systems like OpenAI's o1.
Why It Matters
The distillation wave demonstrated that reasoning capabilities could be transferred efficiently between models, undermining the moat of companies investing billions in training reasoning models from scratch. It reinforced DeepSeek's broader thesis that clever training methodology matters more than raw compute expenditure.
How to Read the Metadata
- Landmark: Fundamentally alters the trajectory; 2–5 per year.
- Major: Meaningfully shifts the landscape; 2–4 per month.
- Notable: Worth documenting; significance can be upgraded later.
- Confidence: High = primary sources corroborate. Medium = credible secondary only. Low = provisional. Disputed = credible sources disagree.
- Contestation: Uncontested = no formal challenge. Contested = at least one challenge open. Superseded = replaced by a later entry. Unresolved = dispute still open.