All corrections
1
Claim
Statements like Harper's, that LLMs "produce writing not by thinking but by making statistically informed guesses about which lexical item is likely to follow another," are, in fact, false about LLMs like ChatGPT.
Correction

This overstates the effect of instruction-tuning and RLHF. Official docs from OpenAI, Anthropic, and Google still describe modern chatbots as generating output token by token, with post-training refining that behavior rather than replacing it.

Full reasoning

Anthropic's own documentation says Claude's underlying model is an autoregressive language model that is "pretrained to predict the next word," and that fine-tuning and RLHF are used to refine those pretrained models. In other words, post-training changes behavior, but it does not make next-word/token prediction an incorrect description of how the model generates text.

Google's Gemini documentation says generation parameters control how the model "selects the next token from its vocabulary" during response generation. OpenAI's help center likewise says that when you send text to the API, the response is "generated as a sequence of tokens" and that models convert the input into tokens and the predicted tokens back into words.
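The token-by-token generation loop those docs describe can be illustrated with a toy sketch. This is not any vendor's actual API or model; the scoring function, vocabulary, and temperature handling here are illustrative assumptions, showing only the general pattern of scoring candidates, sampling one, and appending it before the next step:

```python
import math
import random

def softmax(logits, temperature=1.0):
    # Turn raw scores into a probability distribution over the vocabulary;
    # lower temperature sharpens the distribution, higher flattens it.
    scaled = [score / temperature for score in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def generate(score_fn, vocab, max_tokens, temperature=1.0, seed=0):
    # Autoregressive loop: at each step, score every vocabulary item
    # given the tokens emitted so far, sample one, and append it.
    rng = random.Random(seed)
    tokens = []
    for _ in range(max_tokens):
        logits = [score_fn(tokens, tok) for tok in vocab]
        probs = softmax(logits, temperature)
        tokens.append(rng.choices(vocab, weights=probs)[0])
    return tokens

# Hypothetical stand-in for a trained model: it strongly prefers
# whichever token follows the previous one in a fixed cycle.
VOCAB = ["the", "cat", "sat", "."]

def toy_score(history, candidate):
    if not history:
        return 5.0 if candidate == "the" else 0.0
    expected = VOCAB[(VOCAB.index(history[-1]) + 1) % len(VOCAB)]
    return 5.0 if candidate == expected else 0.0

print(generate(toy_score, VOCAB, 4, temperature=0.1))
# → ['the', 'cat', 'sat', '.']
```

Instruction tuning and RLHF, in this picture, change what `score_fn` prefers; the surrounding loop, sampling one token at a time, is unchanged, which is the point the vendor documentation makes.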

So Harper's description may be incomplete as a full account of training and post-training, but calling it "false" about ChatGPT-like systems is not accurate. The mainstream vendor documentation describes these systems as still generating text by sequential token prediction, with instruction tuning/RLHF shaping which token sequences they prefer.

3 sources
Model: OPENAI_GPT_5 Prompt: v1.16.0