www.lesswrong.com/posts/LXQBcztrWKhtcgQfJ/current-activation-oracles-are-hard-to...
1 correction found
Activations were taken at the 50% layer (layer 32/64 for 32B, layer 16/32 for 8B) based on a layer sweep.
Qwen3-8B is not a 32-layer model. Its official config lists 36 hidden layers, so describing the 8B model’s halfway point as “layer 16/32” is incorrect.
Full reasoning
The post says the 8B model's 50% layer is "layer 16/32," but the official Hugging Face config for Qwen/Qwen3-8B lists num_hidden_layers: 36, not 32.
So while the 32B model's "32/64" description matches the official config, the 8B model's denominator is wrong. If the experiments really used the 50%-depth layer, it would be the midpoint of a 36-layer model, i.e. layer 18 of 36 rather than layer 16 of 32 (a quick way to check this is sketched below).
This matters because the post presents these exact layer counts as implementation details tied to the layer sweep and the evaluation setup.
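For anyone who wants to verify, here is a minimal sketch that pulls both configs from the Hugging Face Hub and computes the halfway layer. It assumes "the 50% layer" means `num_hidden_layers // 2`, matching the post's "layer 32/64" convention for the 32B model; the post does not spell out its exact rounding rule.

```python
# Minimal sketch: read the official configs and compute the 50%-depth layer index.
# Assumes "50% layer" = num_hidden_layers // 2, as implied by "layer 32/64 for 32B".
from transformers import AutoConfig

for model_id in ["Qwen/Qwen3-8B", "Qwen/Qwen3-32B"]:
    cfg = AutoConfig.from_pretrained(model_id)
    n_layers = cfg.num_hidden_layers      # 36 for Qwen3-8B, 64 for Qwen3-32B
    mid_layer = n_layers // 2             # 18 for the 8B model, 32 for the 32B model
    print(f"{model_id}: {n_layers} layers -> 50%-depth layer {mid_layer}")
```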
2 sources
- Qwen/Qwen3-8B config.json
"model_type": "qwen3", "num_hidden_layers": 36
- Activation Oracles: Training and Evaluating LLMs as General-Purpose Activation Explainers
For evaluation, we use activations from 50% depth (see Appendix C.5 for ablations).