www.lesswrong.com/posts/dKpC6wHFqDrGZwnah/ais-can-now-often-do-massive-easy-to-v...
1 correction found
This is because for right-skewed distributions, the median of a sum is greater than the sum of the medians.
This mathematical rule is false in general. Medians are not additive, and even for right-skewed distributions the median of a sum can be either greater or less than the sum of the individual medians.
Full reasoning
This sentence states a general mathematical rule, but that rule does not hold.
Two key points:
- Medians are not linear/additive. Unlike expectations, there is no general identity such as median(X+Y) = median(X) + median(Y).
- Right-skewness does not fix this. Even if each summand is right-skewed, the median of the sum need not exceed the sum of the medians.
A concrete counterexample shows the claim is false:
- Let X take values 0 with probability 0.49, 1 with probability 0.50, and 100 with probability 0.01.
- This distribution is right-skewed: its mean is 1.5, which is greater than its median of 1.
- Now take independent X and Y with this same distribution.
- median(X) = 1 and median(Y) = 1, so the sum of the medians is 2.
- But P(X+Y <= 1) = P(0,0) + P(0,1) + P(1,0) = 0.49^2 + 2(0.49)(0.50) = 0.7301, while P(X+Y <= 0) = 0.49^2 = 0.2401.
- So the median of X+Y is 1, not greater than 2.
That directly contradicts the article's statement.
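The arithmetic in the counterexample can be checked by exact enumeration. The following is a minimal sketch, not part of the original correction: the helper names `median` and `sum_pmf` are illustrative, and the median is taken to be the smallest value m with P(X <= m) >= 1/2 (a common convention; the median is unique here, so the choice does not affect the result).

```python
from fractions import Fraction
from itertools import product

# Distribution from the counterexample, as value -> probability.
PMF = {0: Fraction(49, 100), 1: Fraction(50, 100), 100: Fraction(1, 100)}


def median(pmf):
    """Return the smallest value m with P(X <= m) >= 1/2 (assumed convention)."""
    cumulative = Fraction(0)
    for value in sorted(pmf):
        cumulative += pmf[value]
        if cumulative >= Fraction(1, 2):
            return value
    raise ValueError("probabilities sum to less than 1/2")


def sum_pmf(pmf_x, pmf_y):
    """Return the exact distribution of X + Y for independent X and Y."""
    out = {}
    for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items()):
        out[x + y] = out.get(x + y, Fraction(0)) + px * py
    return out


mean_x = sum(value * prob for value, prob in PMF.items())
median_x = median(PMF)
pmf_sum = sum_pmf(PMF, PMF)
p_le_1 = sum(prob for value, prob in pmf_sum.items() if value <= 1)
median_sum = median(pmf_sum)

print("mean(X) =", float(mean_x))
print("median(X) =", median_x)
print("P(X+Y <= 1) =", float(p_le_1))
print("median(X+Y) =", median_sum, "| sum of medians =", 2 * median_x)
```

Using `Fraction` keeps every probability exact, so the comparison against 1/2 is not subject to floating-point rounding.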
What is true is weaker: for skewed distributions, the median of a sum/average can differ from the sum/average of medians, and for non-symmetric distributions the median of (X+Y)/2 is generally a different quantity (the pseudomedian), not the ordinary median. So the inequality used here is not a valid general justification for median(B) - median(A) > median(B-A).
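To illustrate the pseudomedian point, here is a small sketch that computes the median of (X+Y)/2 for two independent draws from a skewed distribution. Reusing the three-point distribution from the counterexample is an assumption made for illustration; neither source works this example. The helper `median` is hypothetical and uses the smallest-m-with-CDF-at-least-1/2 convention.

```python
from fractions import Fraction

# Assumed example: the same three-point distribution as in the counterexample.
pmf = {0: Fraction(49, 100), 1: Fraction(1, 2), 100: Fraction(1, 100)}


def median(dist):
    """Return the smallest value m with P(X <= m) >= 1/2 (assumed convention)."""
    cumulative = Fraction(0)
    for value in sorted(dist):
        cumulative += dist[value]
        if cumulative >= Fraction(1, 2):
            return value
    raise ValueError("probabilities sum to less than 1/2")


# Exact distribution of the Walsh average (X + Y) / 2 for independent X, Y.
walsh = {}
for x, px in pmf.items():
    for y, py in pmf.items():
        avg = Fraction(x + y, 2)
        walsh[avg] = walsh.get(avg, Fraction(0)) + px * py

ordinary_median = median(pmf)
pseudomedian = median(walsh)

print("median(X) =", ordinary_median)
print("median((X+Y)/2) =", pseudomedian)
```

For this distribution the two quantities differ, which is the behavior the University of Minnesota notes describe for non-symmetric distributions.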
2 sources
- MIT OpenCourseWare — Introduction to Statistical Methods in Economics, Lecture 12
"For medians this is no longer true as the following example shows ... The median of X1 and X2 is zero, however median(Y)=1 ... 0 + 0 = median(X1) + median(X2)."
- University of Minnesota Stat 5102 Notes: Nonparametric Tests and Estimates
"When the distribution of the Xi is not symmetric, the median of the Walsh averages does not estimate the median but the median of the distribution of a typical Walsh average (Xi + Xj)/2 ... someone coined the term pseudomedian."