Nov 23, 2020
Why Dropout Shifts Variance: Dropout sets some activations to 0, which naturally alters the variance of the layer's output. At test time no units are dropped, so the variance is left unchanged, and the train-time and test-time variances no longer match.
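As a quick sanity check, here is a minimal NumPy sketch of this effect (the keep probability and the input distribution are illustrative choices, not from the note). It compares the activation variance with no dropout, with plain masking as described above, and with the common "inverted" variant that rescales survivors by 1/keep_p:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)  # pre-activations; distribution is illustrative
keep_p = 0.5                  # keep probability; illustrative value

mask = rng.random(x.shape) < keep_p
x_dropped = x * mask            # dropout as described above: zero some units
x_inverted = x * mask / keep_p  # common "inverted" variant: rescale survivors

print(f"no dropout (test) variance: {x.var():.3f}")          # ~1.0
print(f"plain dropout variance    : {x_dropped.var():.3f}")  # ~0.5
print(f"inverted dropout variance : {x_inverted.var():.3f}") # ~2.0
```

Note that the 1/keep_p rescaling matches the train- and test-time means but still leaves the variances mismatched, so the shift persists either way.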
How this Shift Affects the Mean Activation: ReLU(0) is 0, so dropped values contribute nothing to the mean activation. Without dropout, every value > 0 passes through the ReLU and therefore raises the mean activation.
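A sketch of the same kind (same illustrative setup as above) showing the effect on the post-ReLU mean:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100_000)
keep_p = 0.5
mask = rng.random(x.shape) < keep_p

def relu(a):
    return np.maximum(a, 0.0)

# Dropped units land at ReLU(0) = 0 and pull the mean down;
# without dropout every positive value contributes.
print(f"mean activation, no dropout  : {relu(x).mean():.3f}")        # ~0.40
print(f"mean activation, with dropout: {relu(x * mask).mean():.3f}") # ~0.20
```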
What I wonder is whether one can estimate correction weights to alleviate these mismatches.
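One hypothetical way to make that concrete (not something worked out in this note): run a calibration pass, estimate the ratio of train-mode to test-mode activation variance per layer, and use its square root as a test-time correction weight. A sketch, assuming roughly zero-mean activations and a single scalar correction per layer:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=100_000)  # stand-in for one layer's test-time activations
keep_p = 0.5

# Hypothetical calibration: measure variance with dropout on (train mode)
# and off (test mode), then scale test activations toward train statistics.
mask = rng.random(x.shape) < keep_p
var_train = (x * mask).var()
var_test = x.var()

correction = np.sqrt(var_train / var_test)  # hypothetical per-layer weight
x_corrected = x * correction

print(f"train-mode variance    : {var_train:.3f}")
print(f"corrected test variance: {x_corrected.var():.3f}")  # now ~matches
```

A scalar rescaling like this can only match second moments, not the full activation distribution, so it is at best a partial correction.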