www.lesswrong.com/posts/8YnHuN55XJTDwGPMr/a-gentle-introduction-to-sparse-autoen...
3 corrections found
multi-layer perceptions (MLPs)
MLP stands for “multilayer perceptron,” not “multi-layer perceptions.”
Full reasoning
The post expands the acronym MLP as “multi-layer perceptions,” but in machine learning MLP is the standard abbreviation for multilayer perceptron, a type of feedforward neural network.
This isn’t a matter of terminology preference: the “P” in MLP refers to perceptron, the historical name for a (simple) neural network unit/network, not “perception(s).”
1 source
- Multilayer perceptron - Wikipedia
“In deep learning, a multilayer perceptron (MLP) is a kind of modern feedforward neural network…”
The only way for n vectors in n-dimensional space to be linearly independent is if they’re all orthogonal.
In R^n, n vectors can be linearly independent without being orthogonal; orthogonality is sufficient but not necessary for linear independence.
Full reasoning
This claim is false in basic linear algebra.
- Orthogonality implies linear independence (for nonzero vectors), but the reverse direction does not hold.
- A simple counterexample in R^2: (1, 0) and (1, 1) are linearly independent (neither is a scalar multiple of the other) but not orthogonal (their dot product is 1).
Jim Hefferon’s open linear algebra textbook explicitly states that “not every basis … has mutually orthogonal vectors” and gives an explicit basis for R^2 whose members “are not orthogonal.” Since a basis is (by definition) linearly independent, that directly contradicts the post’s statement that linear independence (for n vectors in n-dimensional space) requires orthogonality.
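The counterexample can be checked numerically; a minimal sketch (vector values taken from the example above, everything else illustrative):

```python
import numpy as np

# Two vectors in R^2 that are linearly independent but not orthogonal,
# matching the counterexample in the reasoning above.
u = np.array([1.0, 0.0])
v = np.array([1.0, 1.0])

# Linear independence: the 2x2 matrix with u and v as columns has full rank.
rank = np.linalg.matrix_rank(np.column_stack([u, v]))
print("rank:", rank)        # rank 2, so u and v are linearly independent

# Orthogonality would require a zero dot product; here it is 1.
print("dot:", np.dot(u, v))
```

The rank test and the dot product separate the two properties: full rank certifies independence, while the nonzero dot product shows the vectors are not orthogonal.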
2 sources
- Linear Algebra (4th Edition) — Jim Hefferon (PDF)
“Of course, the converse of Corollary 2.3 does not hold— not every basis of every subspace of R^n has mutually orthogonal vectors. … Example The members … of this basis for R^2 are not orthogonal.”
- Linear Algebra (4th Edition) — Jim Hefferon (PDF)
“1.1 Definition A basis for a vector space is a sequence of vectors that is linearly independent and that spans the space.”
Created in the 1990s, autoencoders were initially designed for dimensionality reduction and compression.
Autoencoder-style “auto-association” networks used for compression/dimensionality reduction were published in 1988, so they were not “created in the 1990s.”
Full reasoning
The post dates the creation of autoencoders to the 1990s, but published work describing autoencoder-style networks for compression/dimensionality reduction exists earlier.
A 1988 paper by Bourlard & Kamp discusses multilayer perceptrons in auto-association mode specifically as candidates for “data compression or dimensionality reduction.” This is the core autoencoder setup (learn an internal representation that reconstructs the input). That places autoencoder-style methods in the literature before the 1990s.
A later historical review (“Autoencoders reloaded,” 2022) explicitly discusses the 1988 work and notes the terminology shift: “Auto-associative multilayer perceptrons are now called autoencoders.” Together, these sources directly contradict the claim that autoencoders were created in the 1990s.
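The auto-association setup described above (learn a low-dimensional internal representation that reconstructs the input) can be sketched in its linear form, where Bourlard & Kamp showed the optimal solution is given by SVD; this is an illustrative sketch, with all variable names and the data assumed:

```python
import numpy as np

# Linear auto-association for dimensionality reduction: compress centered
# data to a k-dimensional code, then reconstruct. For the linear case the
# optimal encoder/decoder comes from the SVD of the data matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))      # 100 samples, 10 features (synthetic)
Xc = X - X.mean(axis=0)             # center the data

k = 3                               # bottleneck (code) dimension
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
W = Vt[:k].T                        # weights: top-k right singular vectors

codes = Xc @ W                      # encode: 10 dims -> k dims
X_hat = codes @ W.T                 # decode: reconstruct the input

err = np.linalg.norm(Xc - X_hat) / np.linalg.norm(Xc)
print("relative reconstruction error:", err)
```

The reconstruction is imperfect (the code is smaller than the input) but minimizes squared error among all rank-k linear maps, which is the sense in which auto-association performs compression.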
2 sources
- Auto-association by multilayer perceptrons and singular value decomposition — Bourlard & Kamp (1988) (PubMed)
“The multilayer perceptron, when working in auto-association mode, is sometimes considered as an interesting candidate to perform data compression or dimensionality reduction…”
- Autoencoders reloaded — Bourlard & Kabil (2022) (PMC, open access)
“This work is based upon (Bourlard and Kamp 1988)… Auto-associative multilayer perceptrons are now called autoencoders (AE)…”