Foundation Models
Confidence 0.80 · 2 sources · last confirmed 2026-04-30
Large pretrained models — typically transformer-based — that serve as the substrate for downstream AI applications via prompting, fine-tuning, or retrieval. Foundation models drive most modern generative-ai capability and are the locus of the “frontier” debate (capability, safety, transparency).
Working definition
A foundation model is a model trained on broad data at scale that can be adapted to a wide range of downstream tasks. The term, coined by Stanford’s Center for Research on Foundation Models (CRFM), foregrounds the adaptation role — the model is a foundation other things are built on, not the end product.
Frontier model is a near-synonym foregrounding capability (top of the leaderboard). The two terms diverge in policy/regulatory contexts: “frontier” often implies the regulator-relevant subset of foundation models above a capability threshold.
Key claims
Industry has decisively taken the frontier
- ~90% of notable AI models in 2024 came from industry (vs. 60% in 2023); 91.18% in 2025 per the AI Index 2026 update. Academia remains the leading source of highly cited research, but not of new notable models.
- 2024: U.S. 40 notable models, China 15, Europe 3.
- 2025 (AI Index 2026): U.S. 59, China 35, South Korea 8, Europe 2.
- Top organizations 2025: OpenAI 20, Google 14, Alibaba 11, Anthropic 7, xAI 5, DeepSeek 4, LG AI Research 4, Meta 4, Tsinghua University 4 (only academic institution in the top 9), ByteDance / Moonshot / Nvidia 3 each.
- The U.S.-China performance gap has effectively closed: DeepSeek-R1 briefly matched the top U.S. model in Feb 2025; as of March 2026, Anthropic’s top model leads by just 2.7%.
Compute is scaling fast
- Training compute for notable models doubles every ~5 months (growth rates annualized in the sketch after this list).
- Dataset size for training LLMs doubles every ~8 months.
- Power required for training doubles annually.
- Global AI compute capacity reached 17.1M H100-equivalents by 2025, growing 3.3× per year since 2022 (AI Index 2026 §1.2).
- Nvidia: >60% of total compute; Google + Amazon supply most of the remainder; Huawei holds a small but growing share.
- U.S. hosts 5,427 AI data centers — more than 10× any other country.
- TSMC fabricates almost every leading AI chip; TSMC's U.S. expansion began operating in 2025.
- AI data center power capacity: 29.6 GW — comparable to New York state at peak demand.
- Carbon emissions trajectory:
- AlexNet (2012): 0.01 tons
- GPT-3 (2020): 588 tons
- GPT-4 (2023): 5,184 tons
- Llama 3.1 405B (2024): 8,930 tons
- Grok 4 (2025): 72,816 tons (AI Index 2026)
- (Reference: average American = 18 tons/year.)
- Annual GPT-4o inference water use alone may exceed the drinking water needs of 1.2 million people.
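The growth-rate claims above use different units (doubling times, annual multipliers, absolute totals). A minimal arithmetic sketch putting them on a common footing; all inputs are the figures from the bullets above, and only the conversions are added:

```python
# Convert each doubling time d (months) to an implied annual growth factor 2**(12/d).
for label, d in [("training compute", 5), ("dataset size", 8), ("training power", 12)]:
    print(f"{label}: doubles every ~{d} mo -> x{2 ** (12 / d):.2f}/yr")
# training compute -> x5.28/yr, dataset size -> x2.83/yr, training power -> x2.00/yr

# 3.3x/yr growth since 2022 back-solves to the implied 2022 base:
print(f"implied 2022 capacity: {17.1 / 3.3 ** 3:.2f}M H100-equivalents")  # ~0.48M

# Training-emission markers restated in average-American person-years (18 t/yr):
for model, tons in [("GPT-4", 5_184), ("Llama 3.1 405B", 8_930), ("Grok 4", 72_816)]:
    print(f"{model}: ~{tons / 18:,.0f} person-years")
# GPT-4 ~288, Llama 3.1 405B ~496, Grok 4 ~4,045
```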
Disclosure is dropping (AI Index 2026)
- Training code, dataset sizes, and parameter counts are increasingly withheld for the most resource-intensive systems, including those from OpenAI, Anthropic, and Google.
- Reported parameter counts have stayed near 1 trillion for three years as disclosure dropped, but training compute, which can be estimated independently from hardware (see the sketch after this list), has continued to rise.
- OLMo 3.1 Think 32B, with ~90× fewer parameters than Grok 4, achieves comparable benchmark results via data pruning, deduplication, and curation alone: evidence that data quality and post-training matter as much as scale.
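The hardware-based compute estimate works roughly as sketched below. This is a generic illustration of the standard estimation method (chip count × peak throughput × utilization × wall-clock time), not a formula from the AI Index, and every input in the example is an illustrative assumption rather than a disclosed figure for any model:

```python
def training_flops(num_chips: int, peak_flops_per_chip: float,
                   utilization: float, days: float) -> float:
    """Estimate total training compute from hardware: chips x FLOP/s x MFU x seconds."""
    return num_chips * peak_flops_per_chip * utilization * days * 86_400

# Hypothetical run: 20k H100-class chips (~1e15 dense BF16 FLOP/s each),
# 40% model-FLOPs utilization, 90 days of training -- all assumed values.
print(f"~{training_flops(20_000, 1e15, 0.40, 90):.1e} FLOPs")  # ~6.2e25
```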
Open-weight catching up to closed-weight
- Performance gap between the top closed-weight and top open-weight models on the Chatbot Arena Leaderboard: 8.0% in early Jan 2024 → 1.7% by Feb 2025, with similar narrowing on some other benchmarks.
- The frontier is also tightening overall: top-2 model gap = 0.7%, top-10 gap = 5.4% (down from 4.9% and 11.9%, respectively, the year before).
U.S.-China performance gap narrowing fast
- End-2023 vs. end-2024 gaps on major benchmarks:
- MMLU: 17.5pp → 0.3pp
- MMMU: 13.5pp → 8.1pp
- MATH: 24.3pp → 1.6pp
- HumanEval: 31.6pp → 3.7pp
Transparency improving
- Foundation Model Transparency Index (CRFM): avg score among major developers 37% (Oct 2023) → 58% (May 2024). Substantial progress, with a ~42-point gap to full transparency remaining. See responsible-ai.
Smaller is mighty
- 142× reduction in the parameters needed to clear the same MMLU >60% threshold, in two years: PaLM (540B params, 2022) → Phi-3-mini (3.8B, 2024); arithmetic checked in the sketch below.
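A quick check of the two size-ratio markers on this page; note the second line merely back-solves what the ~90× OLMo claim above implies about Grok 4's undisclosed parameter count, not a reported figure:

```python
print(f"PaLM / Phi-3-mini: {540 / 3.8:.0f}x")                 # 142x (540B vs. 3.8B params)
print(f"implied Grok 4 size: ~{32 * 90 / 1000:.1f}T params")  # from "~90x fewer" at 32B
```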
Notable foundation model series (mentioned in the AI Index 2025)
To be promoted to standalone entity pages when discussed in depth in another source. Currently noted here as a roster:
- OpenAI: GPT-4, GPT-4o, o1, o3 (test-time compute reasoning), Sora (video).
- Google DeepMind: Gemini family (Gemini-1.5-Flash-8B is the 280×-cost-reduction marker), Veo 2 (video).
- Anthropic: Claude 3 family (incl. Sonnet — implicit-bias study).
- Meta: Llama 3.1 405B (the 8,930-ton-CO2 marker), Movie Gen (video).
- Microsoft: Phi-3-mini (the 3.8B-param-MMLU marker).
- Mistral AI: French developer of open-weight models.
- xAI: Grok family.
Debates / contradictions
- “Frontier” vs. “foundation” framing. “Frontier” emphasizes capability gap; “foundation” emphasizes adaptation role. Different policy/regulation implications — frontier-model bills target capability thresholds; foundation-model bills target the broader pretraining-then-adapt pattern.
- Compute-scaling sustainability. Data-commons shrinkage (see responsible-ai) plus rising energy demands (driving nuclear-energy partnerships: Microsoft's Three Mile Island restart, Google's and Amazon's SMR deals) raise structural questions about whether the 5-month compute-doubling trajectory can continue.
- Open-weight closing the gap. As open-weight performance catches up to closed-weight, the policy logic for restricting model release weakens, but so does the commercial moat for closed-weight providers. How 2025–2026 plays out remains an open question.
Related concepts
- generative-ai — the dominant use of foundation models today
- ai-benchmarks — how foundation-model capabilities are evaluated
- responsible-ai — the transparency, safety, and governance overlay
- enterprise-ai-adoption — the deployment context