LessWrong April 15, 2026 at 12:59 AM

about-30-of-humanity-s-last-exam-chemi...

2 corrections found

Claim

finding an error rate of 18% instead

Correction

The revised HLE paper reports an expert disagreement rate of about 18%, not a confirmed 18% error rate.

Full reasoning

The cited HLE revision does not say it found that 18% of questions were wrong. It says a targeted peer review of a biology/chemistry/health subset found an expert disagreement rate of approximately 18%. That is a different claim: disagreement among reviewers is not the same as a verified error rate.

This distinction matters because the same passage explains that under a different review rule—flagging a question whenever a single reviewer dissents—the rate rises from 18% to 25%. That shows the figure is about the review methodology and reviewer disagreement, not a direct measurement of how many questions are objectively incorrect.
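
To make the methodology point concrete, here is a minimal sketch of how the flagging rule alone changes the reported rate on one fixed set of reviews. The data is hypothetical toy data, not taken from the HLE paper. The quoted passage does not describe the rule behind the 18% figure, so the two-dissenter threshold used for comparison is an assumption.

```python
from typing import List

# Hypothetical toy data, NOT from the HLE paper: one inner list per question,
# one entry per reviewer; True means that reviewer disagrees with the answer key.
reviews: List[List[bool]] = [
    [False, False, False],
    [True, False, False],
    [True, True, False],
    [False, False, False],
    [True, True, True],
    [False, True, False],
    [False, False, False],
    [False, False, False],
]


def flagged_rate(reviews: List[List[bool]], min_dissenters: int) -> float:
    """Fraction of questions with at least `min_dissenters` dissenting reviewers."""
    flagged = sum(1 for votes in reviews if sum(votes) >= min_dissenters)
    return flagged / len(reviews)


# Single-reviewer rule: one dissenting expert is enough to flag a question.
single_reviewer_rate = flagged_rate(reviews, min_dissenters=1)

# Assumed stricter rule for comparison: at least two reviewers must dissent.
two_reviewer_rate = flagged_rate(reviews, min_dissenters=2)
```

The same reviews give a higher rate under the single-reviewer rule than under the stricter one, so the figure depends on the rule as well as on the questions.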

So the post's paraphrase overstates what Scale AI reported: the source supports an 18% disagreement rate, not an 18% error rate.

2 sources

A benchmark of expert-level academic questions to assess AI capabilities - PMC
A targeted peer review on a biology, chemistry and health subset, proposed in ref. 63, found an expert disagreement rate of approximately 18%. ... To illustrate, if we were to adopt a single-reviewer methodology in which a question is flagged based on just one dissenting expert, the disagreement rate on the aforementioned health-focused subset jumps from 18% to 25%...
A benchmark of expert-level academic questions to assess AI capabilities
Expert disagreement rate. ... subset, proposed in ref. 63, found an expert disagreement rate of approximately 18%. ... on the aforementioned health-focused subset jumps from 18% to ...

Claim

no properties have been measured

Correction

This is too broad. Oganesson’s chemical and bulk physical properties have not been directly measured, but some nuclear properties, such as the half-life of oganesson-294, have been measured or estimated experimentally.

Full reasoning

The statement overreaches by saying no properties of oganesson have been measured. While its chemical and bulk physical properties are largely unmeasured because so few atoms have been synthesized, researchers have still determined experimental nuclear properties from its observed decay.

For example, the reference sources cited below report a measured or estimated half-life for oganesson-294 of under a millisecond (about 0.89 milliseconds, per PubChem). A half-life is a property of the nuclide. So the accurate claim would be narrower: oganesson’s ordinary physical and chemical properties have not been directly characterized, not that no properties at all have been measured.

3 sources

Oganesson | Og (Element) - PubChem
This produced oganesson-294, an isotope with a half-life of about 0.89 milliseconds (0.00089 seconds)... ### 7.1 Atomic Mass, Half Life, and Decay
Oganesson | Atomic Number, Atomic Mass, Electron Configuration, Synthesis, Uses, Half-Life, Relativistic Effects, & Facts
The extremely short half-life of the few synthesized atoms has severely limited experimental data... The longest-lived known isotope, oganesson-294, has an estimated half-life... of less than a millisecond.
Oganesson - American Chemical Society
Oganesson-294’s extremely short half-life precludes measurements of its physical and chemical properties.

Model: OPENAI_GPT_5 Prompt: v1.16.0