www.latent.space/p/ainews-the-high-return-activity-of
1 correction found
The Qwen3.5-9B model is noted for its impressive performance, benchmarking around the level of GPT-3’s 120B model.
OpenAI introduced GPT-3 as a family of models topping out at a 175B-parameter flagship; the official paper describes GPT-3 itself as a 175B model and defines no 'GPT-3 120B' variant.
Full reasoning
The comparison is misstated: GPT-3 is not a 120B model in OpenAI’s official paper. The paper explicitly describes GPT-3 as a 175-billion-parameter language model, and the eight model sizes it reports run from 125M up to the 175B flagship, with no 120B variant among them.
So even if the intended comparison was to some other 120B-class model, the specific wording "GPT-3’s 120B model" is inaccurate: it attributes to GPT-3 a parameter count that appears nowhere in OpenAI’s published model description.
2 sources
- [AINews] The high-return activity of raising your aspirations for LLMs
  "The Qwen3.5-9B model is noted for its impressive performance, benchmarking around the level of GPT-3’s 120B model, which is surprising given its smaller size."
- Language Models are Few-Shot Learners
  "We train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model."