x.com/GladiaLab/status/1982818213206315120
1 correction found
different prompts always map to different embeddings
The paper does not prove this holds "always." Its theorem is an almost-sure result under specific assumptions, not a universal guarantee for every parameter setting or every LLM.
Full reasoning
The post overstates what the paper proves.
In the paper itself, the authors describe the result as almost-sure injectivity, not an unconditional "always" statement. The introduction says standard decoder-only Transformer language models are "almost-surely injective". The formal main result is narrower still: it covers causal decoder-only Transformers with
- a finite context length and a finite vocabulary,
- parameters randomly initialized from a distribution with a density, and
- training for a finite number of gradient-descent steps.
Under those assumptions, the paper states that the map from prompts to last-token representations is injective "with probability one".
That is materially weaker than saying different prompts always map to different embeddings. "Almost surely" allows exceptional parameter settings, and the theorem is explicitly limited to a particular model class and training setup.
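To see why "almost surely" is weaker than "always", here is a toy sketch (not the paper's model): a hypothetical prompt-to-embedding map built from per-token weight vectors. With weights drawn from a continuous distribution, two distinct prompts collide with probability zero; but the exceptional all-zero parameter setting collapses every prompt to the same embedding, so universal injectivity fails there.

```python
import random

VOCAB = 5  # toy vocabulary size (assumption for illustration)
DIM = 4    # toy embedding dimension (assumption for illustration)

def embed(prompt_ids, W):
    """Toy stand-in for a prompt-to-representation map:
    sum the weight vector of each token in the prompt."""
    out = [0.0] * DIM
    for t in prompt_ids:
        for j in range(DIM):
            out[j] += W[t][j]
    return tuple(out)

rng = random.Random(0)
p1, p2 = [0, 1, 2], [0, 1, 3]  # two distinct prompts

# Parameters sampled from a distribution with a density:
# a collision between p1 and p2 has probability zero.
W_random = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(VOCAB)]
print(embed(p1, W_random) != embed(p2, W_random))  # True (almost surely)

# A measure-zero exceptional setting: all-zero parameters map every
# prompt to the same embedding, so injectivity fails "sometimes",
# just not with positive probability.
W_zero = [[0.0] * DIM for _ in range(VOCAB)]
print(embed(p1, W_zero) == embed(p2, W_zero))  # True
```

The almost-sure theorem rules out the second case happening under random initialization, but it does not rule out its existence, which is exactly the gap between "with probability one" and "always".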
1 source
- Language Models are Injective and Hence Invertible (ICLR 2026 submission PDF)
The paper says decoder-only Transformer language models are "almost-surely injective," and its main theorem states injectivity holds "with probability one" under specific assumptions.