x.com/Dorialexander/status/2041817479488389324
1 correction found
Claim: “there is in all likelihood no model deployed with more active parameters than 2020 GPT-3.”
This is contradicted by Meta’s own documentation: Meta’s announcement states that Meta AI on WhatsApp and meta.ai can use Llama 405B, and Meta’s Llama 3 paper describes that model as a dense Transformer with 405B parameters. Since GPT-3 had 175B parameters, a deployed dense 405B-parameter model has more active parameters than 2020 GPT-3.
Full reasoning
OpenAI’s original GPT-3 paper states that GPT-3 was trained as a 175 billion parameter autoregressive language model.
Meta then announced that Meta AI on WhatsApp and meta.ai can use Llama 405B: “You now have the option to use our largest and most advanced open-source model inside of Meta AI on WhatsApp and meta.ai. Llama 405B…” That means the model is not merely released for download; it is deployed inside a live consumer product.
Meta’s Llama 3 technical paper further states: “Our largest model is a dense Transformer with 405B parameters.” For a dense model, all parameters are active during inference, so its active parameter count is 405B.
That gives a direct counterexample to the post’s claim: a deployed model (Llama 3.1 405B, serving Meta AI) has 405B active parameters, well above GPT-3’s 175B. The statement that there is “in all likelihood no model deployed with more active parameters than 2020 GPT-3” is therefore incorrect.
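The arithmetic behind the comparison can be sketched as follows. This is an illustrative toy, not from the original post: the `active_params` helper and its simplified mixture-of-experts handling (assuming all parameters sit in expert layers) are assumptions introduced here to make the dense-vs-sparse distinction concrete.

```python
def active_params(total_params: float, num_experts: int = 1,
                  experts_per_token: int = 1) -> float:
    """Active parameters per token (simplified sketch).

    Dense model (num_experts=1): every parameter participates in each
    forward pass, so active == total. Sparse MoE (simplified, assuming
    all parameters live in expert layers): only the routed experts'
    share of the parameters is active per token.
    """
    return total_params * experts_per_token / num_experts

GPT3 = 175e9          # dense, 175B parameters (OpenAI, 2020)
LLAMA3_405B = 405e9   # dense, 405B parameters (Meta, 2024)

# A deployed dense 405B model activates more parameters per token
# than 2020 GPT-3 did.
assert active_params(LLAMA3_405B) > active_params(GPT3)
```

Because both models are dense, their active counts equal their total counts, so the 405B > 175B comparison carries over directly; the `num_experts` parameter only matters for sparse models, where active counts can be far below total counts.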
3 sources
- Language Models are Few-Shot Learners
Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters...
- Meta AI is Now Multilingual, More Creative and Smarter
You now have the option to use our largest and most advanced open-source model inside of Meta AI on WhatsApp and meta.ai. Llama 405B's improved reasoning capabilities...
- The Llama 3 Herd of Models
Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens.