All corrections
1
Claim
The observations in this specific system can be thought of as noisy measurements of the state—e.g. if the hidden state is hA, then 90% the observation will be A, and 5% each for the other two possibilities.
Correction

Shai’s Mess3 process (the system being discussed) is an edge-emitting HMM, and its token probabilities are not 90/5/5 conditioned on state. The published diagram shows substantially different A/B/C probabilities (e.g., 42%/14%/14% on one transition), and emissions are tied to transitions rather than produced by a simple “noisy measurement” channel.

Full reasoning

What the post claims

The post says that, in the specific HMM used in Adam Shai’s work, the observation is basically a noisy readout of the current hidden state, with an example implying 90% probability of observing the “matching” symbol and 5% for each other symbol.

What Shai’s work actually specifies

In Shai’s writeup/manuscript, the training data are generated by an edge-emitting HMM, where each transition carries a token-emission probability. In this setup, the relevant probabilities are joint probabilities of (next state, emitted token) given current state, not a separate observation model that depends only on the current hidden state.

The manuscript states this explicitly:

  • It uses an edge-emitting HMM and says tokens are emitted as we transition between hidden states, with token-labeled transition matrices giving the joint probability of emitting a token and transitioning to the next state.
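
The joint-probability reading can be written out as a short sketch. The symbol T^{(x)} for the token-labeled transition matrix and the time indexing on H and O are notational assumptions made here for illustration; they are not quoted from the manuscript.

```latex
% T^{(x)} is an assumed name for the token-labeled transition matrix of token x.
\begin{align*}
T^{(x)}_{ij} &= P(H_{t+1} = j,\; O_t = x \mid H_t = i), \\
\sum_{x} \sum_{j} T^{(x)}_{ij} &= 1 \quad \text{for every state } i, \\
P(O_t = x \mid H_t = i) &= \sum_{j} T^{(x)}_{ij}.
\end{align*}
```

Under this notation, a per-state emission distribution is obtained only by summing the joint over next states; it is not a separately specified component of the model.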

The Mess3 process diagram used in Shai’s work (Figure 5A in the manuscript) also shows token probabilities that are not 90/5/5. For example, the top state has a self-loop labeled A: 42%, B: 14%, C: 14%, and its other outgoing transitions are each labeled A: 9%, B: 3%, C: 3%. These numbers are far from a 90% “correct label” observation model, and they vary by transition.
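
The arithmetic behind these figures can be checked with a minimal sketch. It uses only the percentages quoted above for the top state. The state names (`top`, `other_1`, `other_2`) and the assumption that there are exactly two non-self transitions are illustrative and not taken from the manuscript.

```python
# Sketch only: one row of an edge-emitting HMM, built from the percentages
# quoted above for the top state. State names and the count of two non-self
# transitions are assumptions made for illustration.
TOKENS = ("A", "B", "C")

# joint_from_top[next_state][token] = P(next_state, token | current = top)
joint_from_top = {
    "top":     {"A": 0.42, "B": 0.14, "C": 0.14},  # self-loop
    "other_1": {"A": 0.09, "B": 0.03, "C": 0.03},  # assumed first non-self edge
    "other_2": {"A": 0.09, "B": 0.03, "C": 0.03},  # assumed second non-self edge
}


def total_mass(joint):
    """Sum of all (next_state, token) probabilities leaving one state."""
    return sum(p for row in joint.values() for p in row.values())


def token_marginal(joint):
    """P(token | current state), obtained by summing over next states."""
    return {t: sum(row[t] for row in joint.values()) for t in TOKENS}


marginal = token_marginal(joint_from_top)
print(round(total_mass(joint_from_top), 10))
print({t: round(p, 2) for t, p in marginal.items()})
```

Under these assumptions the row sums to 1, and the implied per-token probabilities from the top state come out at 60% for A and 20% each for B and C, which is still not the 90/5/5 split in the claim.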

Why that contradicts the claim

A “noisy measurement of state” model of the form described in the post would generate each emission from a distribution P(O_t | H_t) that depends only on the current hidden state, assigning roughly 90% to the matching label and 5% to each of the other two.

But Shai’s Mess3 process, as specified in the manuscript/figure:

  • is edge-emitting (emission tied to transitions), and
  • uses different A/B/C probabilities than 90/5/5 (e.g., 42/14/14 appears on a transition).

So the quoted characterization does not match the system described in Shai’s work.
Model: OPENAI_GPT_5 Prompt: v1.6.0