www.lesswrong.com/posts/ybLaw7EuMBeM2dLn9/broad-basins-and-data-compression
1 correction found
This requires computing the eigendecomposition of the Hessian, which takes O ( n 2 ) with our implementation
Computing an eigendecomposition of an n×n (dense) Hessian generally scales cubically (≈O(n^3)), not O(n^2).
Full reasoning
Why this is incorrect
For a general dense Hessian (an (n\times n) symmetric matrix), computing an eigendecomposition (all eigenvalues/eigenvectors) is not (O(n^2)) in the usual setting. Standard dense-matrix eigenvalue algorithms (e.g., QR-based methods / reductions used in LAPACK-style implementations) require (\Theta(n^3)) arithmetic operations.
A concrete reference point from LAPACK practitioners: even modern dense symmetric eigensolvers (e.g., SYEVR) have an (\approx 4/3,N^3) cost dominated by the tridiagonalization reduction step for a dense symmetric matrix, i.e. cubic scaling in (N). That directly contradicts the claim that the eigendecomposition “takes (O(n^2))” (unless one assumes special structure not stated in the post).
Wikipedia’s discussion of the QR algorithm/Schur decomposition (a standard route to eigenvalues) also states (O(n^3)) complexity, consistent with the standard numerical linear algebra picture.
What could be true instead
There are special cases where (O(n^2)) or better is achievable (e.g., already-tridiagonal matrices, strong structure/sparsity, or “matrix-free” methods that compute only a few extremal eigenpairs using Hessian–vector products). But the post’s specific statement attributes (O(n^2)) to “computing the eigendecomposition of the Hessian” without such qualifications, which is not correct for the typical dense Hessian eigendecomposition problem.
2 sources
- LAPACK/ScaLAPACK Development forum — “Time complexity of dsyevr”
Reply by Julien Langou: “For SYEVR, Cost is 4/3N^3 due to tridiagonalization step …” (dense symmetric eigenproblem cubic-cost reduction step).
- Wikipedia — Schur decomposition
States the QR algorithm is used to compute eigenvalues and: “Its complexity is O(n^3).”