www.hyperdimensional.co/p/on-recursive-self-improvement-part
1 correction found
we have not seen any models trained on Blackwell-generation chips
Public MLPerf Training v4.1 results included benchmark submissions run on NVIDIA’s Blackwell (B200) accelerators, among them the GPT‑3 pre-training benchmark, so models have in fact been publicly trained on Blackwell hardware.
Full reasoning
The post states that “we have not seen any models trained on Blackwell-generation chips.” However, by November 13, 2024—well before this Substack post was published on February 5, 2026—MLCommons publicly announced MLPerf Training v4.1 results that included preview-category submissions using NVIDIA “Blackwell” B200 hardware.
MLPerf Training is explicitly a training benchmark suite (i.e., it trains models to a target quality metric). MLCommons’ announcement notes that v4.1 includes preview submissions using Blackwell B200 accelerators, and the v4.1 suite includes generative AI training workloads such as GPT‑3.
NVIDIA’s own technical blog post about the same MLPerf Training v4.1 round further states that NVIDIA made its first MLPerf Training submissions using the Blackwell platform, including performance boosts for GPT‑3 pre-training (and Llama 2 70B LoRA fine-tuning). This directly contradicts the claim that we have not seen models trained on Blackwell-generation chips.
So, regardless of whether any frontier proprietary model’s full training run has been publicly disclosed, there have been publicly reported training runs of benchmark models (e.g., the GPT‑3 pre-training benchmark) on Blackwell accelerators. That makes the blanket statement “we have not seen any models trained on Blackwell-generation chips” factually incorrect.
2 sources
- New MLPerf Training v4.1 Benchmarks Highlight Industry’s Focus on New Systems and Generative AI Applications - MLCommons (Nov 13, 2024)
MLCommons announces MLPerf Training v4.1 results, including preview-category submissions using next-generation accelerators, explicitly listing “NVIDIA ‘Blackwell’ B200 accelerator (preview)” and noting the suite’s generative-AI training workloads (including GPT‑3).
- NVIDIA Blackwell Doubles LLM Training Performance in MLPerf Training v4.1 | NVIDIA Technical Blog (Nov 13, 2024)
NVIDIA states it made its first MLPerf Training submissions using the Blackwell platform, highlighting per-GPU performance boosts of “2x for GPT-3 pre-training” and discussing the MLPerf LLM pre-training benchmark based on the GPT-3 model.