23 September 2016
Expected value: the mean of the distribution, i.e. the average over an infinite number of draws \[E(Y) = \sum_y y \cdot p(y)\] Variance: the mean squared deviation from the expected value \[V(Y) = \sum_y (y-E(Y))^2\cdot p(y)\]
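A minimal Python sketch of these two sums, using a fair six-sided die as a hypothetical \(p(y)\) (the die example is an assumption, not part of the notes):

```python
# Hypothetical discrete distribution: a fair six-sided die
values = [1, 2, 3, 4, 5, 6]
p = [1 / 6] * 6                                              # p(y) for each y

E_Y = sum(y * py for y, py in zip(values, p))                # E(Y) = sum_y y * p(y)
V_Y = sum((y - E_Y) ** 2 * py for y, py in zip(values, p))   # V(Y) = sum_y (y - E(Y))^2 * p(y)

print(E_Y)  # 3.5
print(V_Y)  # 2.9166... (= 35/12)
```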
We run an experiment
What can we learn from the sample?
Fortunately, each \(y_i\) has expected value \(E(Y)\)
But any single observation will probably not be exactly equal to it
Chebyshev’s inequality: \[\Pr\left(|y_i-E(Y)|\geq \sqrt{V(Y)} \cdot k\right)\leq \frac{1}{k^2}\] The probability that \(E(Y)\) is outside \(y_i\pm \sqrt{V(Y)} \cdot k\) is at most \(1/k^2\)
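A quick Monte Carlo check of the bound, as a hedged sketch (the die distribution and the value \(k = 1.2\) are assumptions for illustration, not from the notes):

```python
import random

# Hypothetical example: Y is a fair die roll, so E(Y) = 3.5 and V(Y) = 35/12
random.seed(1)
values = [1, 2, 3, 4, 5, 6]
E_Y, V_Y = 3.5, 35 / 12
sd = V_Y ** 0.5
k = 1.2

draws = [random.choice(values) for _ in range(100_000)]
# Empirical frequency of a draw landing at least k standard deviations from E(Y)
freq = sum(abs(y - E_Y) >= k * sd for y in draws) / len(draws)

print(freq)       # about 1/3 (only draws equal to 1 or 6 qualify)
print(1 / k**2)   # Chebyshev's upper bound: ~0.694
```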
For the sample mean: \[E(\bar{\mathbf{y}})=E(Y)\] \[V(\bar{\mathbf{y}})=\frac{V(Y)}{n}\]
Applying Chebyshev’s inequality to \(\bar{\mathbf{y}}\): \[\Pr\left(|\bar{\mathbf{y}}-E(Y)|\geq \sqrt{V(Y)} \cdot \frac{k}{\sqrt{n}}\right)\leq \frac{1}{k^2}\] The probability that \(E(Y)\) is outside \[\bar{\mathbf{y}}\pm \sqrt{V(Y)} \cdot \frac{k}{\sqrt{n}}\] is at most \(1/k^2\)
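A short simulation sketch of the \(V(\bar{\mathbf{y}})=V(Y)/n\) relation, again using the hypothetical die distribution (the sample size \(n = 25\) is also an assumption):

```python
import random

# Estimate V(ybar) for n = 25 by simulating many samples and averaging
random.seed(2)
values, E_Y, V_Y, n = [1, 2, 3, 4, 5, 6], 3.5, 35 / 12, 25

means = []
for _ in range(20_000):
    sample = [random.choice(values) for _ in range(n)]
    means.append(sum(sample) / n)

# Mean squared deviation of the simulated sample means from the known E(Y)
var_of_means = sum((m - E_Y) ** 2 for m in means) / len(means)
print(var_of_means)   # close to V(Y)/n
print(V_Y / n)        # 0.11666...
```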
k | \(1/k^2\)
---|---
1 | 1
1.414 | 0.5
3 | 0.1111
4 | 0.0625
10 | 0.01
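The bounds in this table are just \(1/k^2\); a one-line sketch to reproduce them:

```python
# Chebyshev bounds 1/k^2 for the tabulated k values
for k in [1, 1.414, 3, 4, 10]:
    print(k, 1 / k**2)
# 1 -> 1.0, 1.414 -> ~0.5, 3 -> 0.1111, 4 -> 0.0625, 10 -> 0.01
```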
Compared with a single observation, these intervals are narrower by a factor of \(\sqrt{n}\)
But we need to know \(V(Y)\)
If \(\bar{\mathbf{y}}\) is approximately normally distributed (for instance by the central limit theorem), the tail probabilities of the standardized mean \(Z=(\bar{\mathbf{y}}-E(Y))/\sqrt{V(Y)/n}\) are far smaller than the Chebyshev bounds:

k | \(\Pr(Z\geq k)\)
---|---
1 | 0.1587
1.414 | 0.07865
3 | 0.00135
4 | 3.167e-05
10 | 7.62e-24
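These are standard-normal upper-tail probabilities; a sketch reproducing them via the complementary error function (the notes do not include this code):

```python
from math import erfc, sqrt

# Pr(Z >= k) for a standard normal Z: 0.5 * erfc(k / sqrt(2))
for k in [1, 1.414, 3, 4, 10]:
    print(k, 0.5 * erfc(k / sqrt(2)))
# 1 -> 0.1587, 1.414 -> 0.0786, 3 -> 0.00135, 4 -> 3.17e-05, 10 -> 7.6e-24
```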
For the variance of the sample, \(\mathrm{S}_n(\bar{\mathbf{y}}, \mathbf{y})=\frac{1}{n}\sum_i (y_i-\bar{\mathbf{y}})^2\), we find that \[E(\mathrm{S}_n(\bar{\mathbf{y}}, \mathbf{y})) = \frac{n-1}{n}V(Y)\] so this estimator of \(V(Y)\) is biased
Instead we use the bias-corrected sample variance \[\mathrm{S}_{n-1}(\mathbf{y})=\frac{1}{n-1}\sum_i (y_i-\bar{\mathbf{y}})^2\] for which \(E(\mathrm{S}_{n-1}(\mathbf{y})) = V(Y)\)
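A simulation sketch of the bias (the die distribution and \(n = 5\) are again assumptions for illustration): averaged over many samples, the divide-by-\(n\) estimator lands near \(\frac{n-1}{n}V(Y)\), while the divide-by-\((n-1)\) estimator lands near \(V(Y)\):

```python
import random

# Average both variance estimators over many simulated samples of size n
random.seed(3)
values, V_Y, n = [1, 2, 3, 4, 5, 6], 35 / 12, 5

s_n, s_n1 = 0.0, 0.0
reps = 50_000
for _ in range(reps):
    y = [random.choice(values) for _ in range(n)]
    ybar = sum(y) / n
    ss = sum((yi - ybar) ** 2 for yi in y)
    s_n += ss / n          # biased: divides by n
    s_n1 += ss / (n - 1)   # bias-corrected: divides by n - 1

print(s_n / reps)          # close to (n-1)/n * V(Y) ~ 2.33
print(s_n1 / reps)         # close to V(Y) ~ 2.92
print((n - 1) / n * V_Y)   # 2.333...
```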