X February 28, 2026 at 12:21 AM

x.com/robbensinger/status/2027533660048789868

1 correction found

1
Claim
@AnthropicAI quietly dropping all of its safety commitments this week
Correction

Anthropic did not “drop all of its safety commitments” that week; as of Feb 24, 2026 it publicly maintained and added multiple safety commitments (e.g., publishing a Frontier Safety Roadmap and model Risk Reports) under its updated Responsible Scaling Policy framework.

Full reasoning

The post (posted Feb 27, 2026) claims that Anthropic was “dropping all of its safety commitments” that week.

However, Anthropic’s own public policy updates from that same week show ongoing and newly stated safety commitments rather than a blanket abandonment:

  • Anthropic’s Responsible Scaling Policy (RSP) updates page (last updated Feb 24, 2026) describes Version 3.0 as a rewrite that “involves the publication of Frontier Safety Roadmaps with detailed safety goals, and Risk Reports that quantify risk across all our deployed models.” These are explicit, public-facing safety commitments that continue under the updated policy.

  • Anthropic’s Frontier Safety Roadmap (goals listed “as of Feb 19th, 2026”) contains numerous forward-looking commitments phrased as “we will …” with target dates (e.g., selecting and beginning specific security projects by April 1, 2026).

Because Anthropic publicly articulated multiple safety commitments in this timeframe, the statement that it dropped “all” safety commitments is contradicted by Anthropic’s own contemporaneous documentation.

Note: Whether Anthropic weakened or strengthened specific aspects of its safety program can be debated, but the absolute claim “dropping all” is refuted by the existence of these explicit, current commitments.

2 sources
  • Responsible Scaling Policy Updates | Anthropic

    The page is “Last updated Feb 24, 2026” and states: “Version 3.0 is a comprehensive rewrite… This new version of the RSP involves the publication of Frontier Safety Roadmaps with detailed safety goals, and Risk Reports that quantify risk across all our deployed models.”

  • Frontier Safety Roadmap | Anthropic

    The roadmap lists goals “as of February 19th, 2026” and includes explicit commitments with target dates, e.g. under Security: “By April 1, 2026, we will have selected and begun 1-3 project(s)… and established concrete further goals and timelines for each.”

Model: OPENAI_GPT_5 Prompt: v1.6.0