Highest kidney cancer rates in US (1980–1989) were in rural areas
May 3rd, 2019
Why? Maybe…
Why? Maybe…
Something is wrong, of course. The rural lifestyle cannot explain both very high and very low incidence of kidney cancer.
What is the relationship between sample average and population average?
Can we learn the population average from the sample average?
Each square is a sample. Volume is fixed. The cell count is an average of cell counts of some squares.
We want population cell density
We have a sample of cell densities
N <- 4800
pop_LD <- sample(rep(0:9, N/10))
pop_MD <- sample(rep(0:29, N/30))
pop_HD <- sample(rep(0:79, N/80))
pop_HD
[1] 64 77 47 76 43 17 0 47 6 14 28 46 69 18 43 17 [17] 8 6 13 44 58 21 8 3 31 36 75 74 57 12 14 0 [33] 72 21 23 56 37 6 79 14 0 27 22 34 22 53 76 65 [49] 7 72 25 32 59 61 6 77 61 58 66 19 59 67 49 40 [65] 36 35 15 69 47 72 63 38 70 36 77 53 56 66 21 46 [81] 64 24 76 42 66 41 55 49 69 71 1 1 38 68 33 71 [97] 1 16 77 57 50 62 15 62 4 19 28 57 73 7 62 3 [113] 52 9 20 43 47 54 77 58 4 33 27 12 44 60 23 35 [129] 21 68 16 49 65 63 65 59 46 29 42 6 34 18 70 68 [145] 54 32 73 57 23 18 61 20 39 54 77 54 76 71 72 20 [161] 47 53 73 31 2 12 34 7 18 76 53 50 24 4 36 16 [177] 37 40 18 27 47 31 7 18 36 38 63 9 19 1 51 67 [193] 29 54 27 56 75 64 37 21 73 56 31 12 36 76 75 63 [209] 73 63 66 50 29 33 65 23 63 53 61 20 23 55 30 0 [225] 10 8 20 1 61 70 53 45 3 41 49 48 59 68 55 77 [241] 74 72 2 35 65 51 71 7 54 62 67 31 72 28 20 64 [257] 55 4 56 16 67 14 13 57 60 32 36 39 74 75 76 24 [273] 16 76 69 21 66 14 60 54 19 48 36 66 47 32 3 64 [289] 25 14 59 6 57 40 39 79 72 1 47 64 18 42 31 38 [305] 13 42 16 75 68 32 7 61 23 18 74 39 73 7 0 53 [321] 69 40 50 75 70 77 28 71 55 42 15 72 78 14 26 47 [337] 49 55 77 65 77 14 73 21 60 76 53 16 38 35 75 55 [353] 46 16 51 73 65 27 6 39 66 39 7 54 4 76 23 79 [369] 7 48 3 16 21 24 61 54 75 41 29 1 78 13 0 48 [385] 22 70 33 31 18 2 71 6 18 42 60 54 38 19 34 43 [401] 6 67 67 5 72 45 9 4 2 15 51 69 27 15 66 54 [417] 75 62 2 54 75 42 66 71 55 4 77 20 76 50 13 47 [433] 46 53 9 34 26 59 34 20 22 46 23 65 0 39 42 36 [449] 11 36 10 34 37 26 66 30 55 48 38 8 64 34 28 43 [465] 17 61 37 7 11 47 26 48 43 30 33 3 1 68 72 15 [481] 72 28 66 10 34 58 56 27 47 17 45 17 38 51 40 49 [497] 19 67 73 35 18 73 68 28 11 29 53 22 65 45 26 33 [513] 61 2 7 14 60 8 15 48 56 59 35 44 52 59 44 24 [529] 77 33 59 41 52 5 23 66 61 15 64 65 52 62 13 4 [545] 53 73 46 77 43 9 43 32 35 76 10 29 44 29 27 54 [561] 30 62 48 46 63 72 0 10 58 42 12 73 52 47 38 6 [577] 77 44 23 27 60 65 6 41 62 58 52 6 66 11 78 25 [593] 17 48 64 43 23 42 37 48 40 57 22 33 6 38 71 25 [609] 1 46 20 7 42 70 21 15 
38 54 79 72 54 3 78 60 [625] 52 8 50 48 70 4 43 59 38 38 78 7 12 29 45 12 [641] 29 19 18 15 38 52 50 13 38 24 47 49 34 72 73 67 [657] 8 6 12 59 76 76 10 21 70 58 14 46 54 14 16 79 [673] 60 64 45 2 32 46 42 74 68 72 74 26 79 43 8 14 [689] 4 54 52 27 29 33 48 68 77 64 46 55 33 68 57 25 [705] 72 12 58 60 44 68 7 79 5 62 55 49 46 29 57 18 [721] 5 42 33 68 25 0 35 29 47 51 11 14 12 20 50 77 [737] 19 30 64 52 58 59 30 65 4 44 60 70 6 12 74 31 [753] 27 52 1 55 27 14 32 23 71 10 73 33 1 67 59 15 [769] 62 40 7 18 48 69 22 55 75 52 32 62 44 78 70 0 [785] 62 13 77 9 66 78 24 8 74 68 65 27 77 48 12 2 [801] 68 72 56 72 12 23 42 22 70 78 26 54 57 71 15 1 [817] 3 1 9 59 51 14 12 61 35 49 16 51 26 78 10 52 [833] 30 6 40 21 31 30 23 19 53 3 18 28 36 8 78 16 [849] 63 75 51 32 76 3 15 29 61 79 33 20 63 28 56 22 [865] 35 24 64 18 20 63 13 39 15 31 33 26 43 2 55 63 [881] 24 41 20 25 6 0 53 67 71 14 3 30 22 19 68 24 [897] 37 14 12 18 74 39 51 24 42 49 64 53 20 32 62 58 [913] 53 50 48 63 22 64 69 77 52 62 7 45 13 12 68 36 [929] 54 20 26 45 21 25 78 56 19 50 69 9 37 42 8 16 [945] 62 75 62 20 9 71 20 45 9 42 31 58 35 69 21 34 [961] 62 52 6 37 65 10 16 2 68 72 24 47 26 72 63 75 [977] 49 15 40 13 29 41 79 50 50 20 79 24 25 2 49 60 [993] 73 18 31 7 51 45 20 69 44 40 51 64 64 28 44 5 [1009] 19 19 78 54 45 52 29 36 61 51 48 32 77 6 68 77 [1025] 74 42 21 67 65 29 62 70 56 61 20 16 66 58 71 69 [1041] 48 60 17 35 13 14 18 19 8 25 49 47 12 18 1 62 [1057] 26 17 76 56 45 48 76 48 66 55 39 53 31 26 43 61 [1073] 2 52 59 16 5 59 31 17 2 10 72 72 61 3 62 54 [1089] 38 53 35 3 56 77 9 71 28 18 79 12 12 31 11 74 [1105] 33 71 28 4 37 48 41 28 79 32 9 68 14 36 49 71 [1121] 30 55 63 9 51 25 44 2 9 27 60 5 71 3 18 8 [1137] 75 69 51 44 16 12 69 8 23 21 8 53 20 53 25 1 [1153] 14 10 73 42 11 32 54 75 12 41 32 44 47 15 27 30 [1169] 10 44 31 39 30 35 61 57 26 34 14 24 52 9 53 7 [1185] 13 66 49 14 43 34 40 71 60 19 49 77 37 60 58 67 [1201] 22 15 2 25 34 77 55 30 49 63 61 13 21 29 53 14 [1217] 1 64 64 59 33 57 59 44 
35 19 57 4 33 14 34 43 [1233] 58 36 20 29 78 26 13 39 0 7 70 79 43 77 59 23 [1249] 17 21 52 58 73 79 15 66 37 56 6 45 56 40 7 79 [1265] 77 8 9 6 38 68 71 45 0 2 8 46 57 63 32 46 [1281] 49 33 39 33 40 30 3 18 23 0 60 19 31 64 62 54 [1297] 29 44 15 64 52 72 3 0 13 43 73 50 25 6 57 70 [1313] 42 16 77 9 56 52 67 75 55 73 77 65 47 9 51 11 [1329] 68 52 69 77 67 78 0 74 53 49 31 10 23 47 69 71 [1345] 46 34 74 67 68 58 41 46 21 30 37 46 60 38 37 45 [1361] 18 23 73 79 44 12 54 7 77 47 19 39 72 22 11 44 [1377] 13 46 77 9 71 79 14 74 23 46 67 46 35 67 9 40 [1393] 56 11 6 30 1 53 68 70 57 44 61 36 59 43 11 23 [1409] 60 21 1 58 59 3 78 11 14 2 62 35 70 37 64 65 [1425] 22 34 65 16 8 2 39 15 57 68 79 36 38 72 52 63 [1441] 13 16 20 7 70 68 32 12 8 12 47 54 61 40 10 41 [1457] 21 67 38 28 70 24 55 44 56 69 45 59 28 73 35 31 [1473] 6 57 38 22 19 44 16 79 67 48 45 18 7 44 78 72 [1489] 32 44 35 13 54 66 72 72 4 70 15 67 69 40 37 45 [1505] 43 76 0 44 28 6 19 30 19 45 30 51 68 4 22 35 [1521] 70 47 6 55 43 50 49 75 1 23 24 19 32 42 49 59 [1537] 59 75 59 46 49 2 29 21 42 28 50 10 36 50 21 70 [1553] 67 20 51 78 11 65 7 10 66 7 19 36 30 18 77 24 [1569] 44 59 25 47 12 44 40 54 63 47 68 71 79 38 35 63 [1585] 4 17 49 45 65 78 15 67 41 13 57 3 21 29 0 41 [1601] 29 48 4 71 5 31 77 36 30 41 35 31 74 31 42 73 [1617] 12 44 15 17 44 5 63 4 60 1 54 65 33 77 64 12 [1633] 3 16 4 52 69 25 57 17 9 51 48 70 55 37 14 15 [1649] 14 17 59 25 49 57 8 33 41 61 52 43 22 0 55 76 [1665] 71 26 49 40 53 41 25 61 22 28 26 27 65 67 41 15 [1681] 78 59 37 29 76 41 48 71 47 44 47 24 77 34 71 7 [1697] 28 55 43 12 26 10 66 50 38 53 58 65 40 34 0 8 [1713] 59 11 23 61 57 17 15 20 50 0 63 6 38 8 60 37 [1729] 28 35 59 35 8 10 24 5 66 58 3 13 61 2 21 30 [1745] 64 74 78 67 0 35 21 8 16 20 31 5 1 58 53 8 [1761] 29 51 4 2 16 58 47 61 34 73 11 52 40 28 49 21 [1777] 65 75 54 4 13 50 25 26 49 18 55 56 14 39 12 16 [1793] 34 9 57 20 68 9 40 28 56 65 38 4 76 0 0 56 [1809] 22 51 5 68 1 11 36 75 51 79 64 7 4 37 69 43 [1825] 4 39 44 9 54 75 
35 64 24 34 59 60 7 3 79 48 [1841] 29 12 0 32 65 59 62 55 73 65 7 70 10 25 32 13 [1857] 36 3 47 44 62 23 1 13 51 42 25 79 19 71 24 11 [1873] 65 37 67 13 46 69 32 79 9 19 22 40 62 48 32 74 [1889] 22 23 53 40 0 51 2 73 69 18 50 57 45 55 26 52 [1905] 20 2 62 79 24 22 14 51 79 28 8 74 8 5 64 1 [1921] 60 56 73 70 7 58 44 62 22 40 52 18 68 24 2 64 [1937] 43 20 72 20 71 70 25 37 46 42 65 54 1 7 43 56 [1953] 14 43 38 64 29 34 74 55 37 39 42 38 61 23 18 60 [1969] 29 4 56 27 47 35 45 38 0 4 21 60 74 10 59 42 [1985] 17 51 49 39 58 74 39 77 0 65 79 16 36 79 25 3 [2001] 35 5 52 47 70 13 31 65 69 20 71 25 49 27 1 25 [2017] 57 29 41 56 16 63 56 9 24 41 17 43 6 26 48 23 [2033] 19 72 53 66 44 59 32 57 3 11 72 39 29 3 52 36 [2049] 19 39 63 11 41 41 66 26 73 8 10 24 40 8 60 76 [2065] 35 12 19 75 42 44 61 52 1 58 29 45 52 37 62 24 [2081] 41 39 49 74 33 31 32 26 55 12 6 69 51 54 10 0 [2097] 15 32 70 27 39 55 73 40 46 11 1 27 36 69 33 11 [2113] 74 69 45 34 70 51 62 52 37 22 51 3 56 79 38 45 [2129] 1 11 75 26 31 1 33 69 69 45 16 23 61 67 60 45 [2145] 49 52 42 76 16 68 56 31 5 23 59 41 77 40 4 36 [2161] 22 11 77 75 23 66 30 66 67 56 39 3 34 51 55 2 [2177] 36 21 2 0 73 55 52 70 47 40 68 30 68 48 28 26 [2193] 59 59 25 67 60 38 15 35 63 2 38 12 57 35 61 67 [2209] 62 30 55 66 8 18 41 63 5 26 52 25 67 46 2 17 [2225] 34 69 43 78 39 78 3 21 72 29 65 62 2 27 61 15 [2241] 23 22 25 29 30 78 60 49 63 6 31 18 39 73 12 17 [2257] 77 21 21 72 11 50 11 45 15 78 22 57 53 2 2 37 [2273] 32 41 74 70 44 34 12 78 34 13 46 3 5 55 43 62 [2289] 52 13 32 40 65 57 76 78 1 35 17 1 27 43 52 15 [2305] 54 67 46 42 20 30 14 12 58 42 14 76 64 35 25 51 [2321] 52 27 36 60 79 36 14 10 45 11 43 44 69 17 8 40 [2337] 73 4 37 1 17 75 0 41 58 74 42 4 53 39 6 18 [2353] 78 46 11 46 1 24 11 71 23 2 29 57 78 74 3 70 [2369] 22 58 18 36 54 61 49 76 42 67 53 23 28 16 25 78 [2385] 71 39 44 43 57 13 35 9 52 51 54 21 23 33 65 39 [2401] 21 31 15 58 26 79 58 57 48 57 9 11 19 38 9 58 [2417] 46 18 53 2 61 76 34 27 56 5 48 63 7 20 38 0 [2433] 
28 5 3 59 43 46 68 60 64 35 1 77 33 76 37 42 [2449] 23 63 30 44 7 10 4 59 74 15 40 75 69 78 43 31 [2465] 30 72 37 44 69 9 37 77 61 54 43 1 29 5 43 30 [2481] 0 65 79 1 9 16 66 56 33 38 73 64 46 53 48 20 [2497] 22 79 23 5 44 12 7 49 46 78 28 10 78 3 13 79 [2513] 55 62 12 14 29 64 48 43 61 50 42 36 72 58 50 74 [2529] 67 29 78 12 2 16 9 37 74 69 56 15 54 12 51 14 [2545] 39 63 7 26 46 59 49 40 25 75 14 70 36 48 77 49 [2561] 33 53 79 15 14 35 24 7 71 17 60 40 19 13 19 65 [2577] 22 5 38 73 79 3 7 10 57 39 36 41 3 58 25 22 [2593] 51 39 19 17 73 75 2 11 54 57 76 61 56 5 71 72 [2609] 34 36 57 76 28 57 49 64 68 78 23 17 27 38 75 64 [2625] 60 2 13 28 29 73 19 18 13 6 42 25 43 40 6 2 [2641] 23 41 26 21 45 57 72 17 67 62 40 41 42 46 2 43 [2657] 66 63 46 54 73 58 43 20 6 47 26 70 28 19 29 10 [2673] 24 26 6 51 63 11 34 24 79 31 36 5 60 63 20 74 [2689] 18 20 35 42 41 63 32 61 3 25 7 40 14 4 16 11 [2705] 10 12 51 57 57 68 19 25 71 8 60 34 24 63 79 17 [2721] 33 18 16 53 3 37 5 70 10 15 31 46 45 9 38 78 [2737] 38 79 38 28 10 26 69 55 69 25 39 41 76 69 28 54 [2753] 31 71 59 79 9 43 51 63 57 64 43 8 50 23 4 69 [2769] 14 50 41 13 5 43 66 41 18 78 69 69 0 43 50 64 [2785] 17 56 44 12 40 60 68 67 49 73 53 39 64 3 36 3 [2801] 14 1 67 22 27 50 69 67 21 34 12 27 66 47 68 20 [2817] 67 11 25 43 33 28 30 33 40 56 6 79 22 12 4 27 [2833] 62 41 2 28 75 19 9 5 5 5 0 78 16 35 62 12 [2849] 41 77 76 34 14 71 24 5 66 49 41 16 75 55 20 50 [2865] 33 73 28 43 53 75 15 50 27 39 45 63 22 50 70 5 [2881] 26 71 56 67 21 16 31 18 9 18 61 3 61 53 36 55 [2897] 18 33 20 7 65 15 0 55 76 58 26 12 49 76 48 41 [2913] 47 69 44 67 56 74 64 56 61 57 39 39 58 60 57 33 [2929] 79 51 5 2 8 30 22 38 40 76 49 37 10 25 78 73 [2945] 9 36 38 7 72 27 58 44 29 73 27 58 53 60 73 61 [2961] 30 50 62 34 10 17 9 55 64 64 76 40 48 41 76 25 [2977] 39 6 56 56 6 33 39 43 65 74 70 73 4 17 14 76 [2993] 50 7 4 3 65 10 6 12 0 14 45 63 55 39 70 59 [3009] 12 22 69 44 30 54 32 75 31 33 33 25 27 22 4 21 [3025] 1 19 0 40 4 17 38 75 28 13 29 59 10 5 2 
16 [3041] 23 22 8 57 14 78 9 39 68 40 51 14 32 16 48 72 [3057] 50 77 11 42 15 42 5 51 16 26 56 48 48 62 42 62 [3073] 3 62 42 79 16 56 64 2 26 29 26 39 61 71 17 36 [3089] 9 32 1 51 75 78 21 48 14 76 58 55 75 67 45 41 [3105] 1 14 76 39 60 45 30 73 50 33 78 73 64 46 33 7 [3121] 47 37 30 75 41 49 42 28 18 3 24 31 8 3 17 55 [3137] 20 1 77 3 6 26 75 16 22 32 28 51 27 17 18 67 [3153] 28 77 67 66 78 41 19 71 6 21 72 52 37 9 47 70 [3169] 2 28 60 7 29 40 78 55 41 2 24 11 11 54 43 62 [3185] 22 48 8 52 0 0 66 61 59 59 34 41 32 4 4 75 [3201] 78 30 54 61 27 2 25 62 49 60 5 72 4 77 67 58 [3217] 28 2 62 28 11 74 4 3 47 34 3 48 64 39 71 31 [3233] 79 24 75 4 50 5 60 21 35 27 10 46 24 59 19 57 [3249] 48 75 77 3 73 41 79 30 37 55 73 73 45 27 10 65 [3265] 31 53 63 16 36 15 35 71 0 5 7 63 22 8 46 45 [3281] 33 26 63 25 26 78 49 25 19 75 39 66 40 33 20 28 [3297] 21 38 76 24 75 65 11 34 63 24 58 25 70 47 57 33 [3313] 25 75 25 23 26 62 20 68 51 62 76 28 2 8 72 38 [3329] 62 32 29 10 57 9 19 34 38 39 26 21 13 28 46 45 [3345] 78 79 36 13 77 17 13 47 45 0 37 12 34 42 27 46 [3361] 74 64 44 16 26 60 23 32 8 68 5 50 29 27 52 53 [3377] 5 33 72 76 48 49 0 43 79 23 65 64 72 40 13 18 [3393] 50 40 30 59 41 68 72 38 4 17 33 43 6 72 5 1 [3409] 58 32 59 30 59 47 9 27 50 58 35 22 31 19 37 9 [3425] 45 29 39 51 33 23 12 36 1 60 39 10 17 59 0 25 [3441] 54 10 19 75 2 29 34 52 50 29 51 58 56 48 44 15 [3457] 73 20 45 43 41 5 53 34 19 47 37 47 76 15 34 71 [3473] 64 19 4 50 30 21 55 55 41 74 15 54 33 68 4 74 [3489] 51 0 62 14 3 63 50 42 49 66 22 57 12 8 5 20 [3505] 25 78 32 25 28 58 18 7 73 6 62 58 79 23 68 54 [3521] 35 48 27 23 50 1 0 52 32 55 17 54 43 41 35 64 [3537] 47 40 65 30 18 0 29 54 65 20 27 7 68 65 0 58 [3553] 24 44 75 23 71 7 21 71 10 21 56 66 66 55 66 51 [3569] 52 47 8 24 50 56 68 12 2 1 13 32 30 61 64 5 [3585] 41 3 35 32 23 20 70 51 4 9 62 15 74 8 55 65 [3601] 39 12 54 42 1 16 42 22 66 22 48 3 38 4 23 22 [3617] 76 15 33 27 30 75 12 71 67 67 27 27 9 28 35 70 [3633] 37 76 27 18 54 63 74 51 1 66 77 50 66 
61 75 52 [3649] 70 46 63 20 46 72 76 29 75 72 68 21 56 19 28 21 [3665] 36 10 13 9 50 47 21 47 78 50 24 39 7 11 41 60 [3681] 7 49 21 65 16 11 17 13 64 72 74 12 45 74 47 27 [3697] 24 66 3 40 20 26 1 5 45 15 34 20 17 21 40 52 [3713] 71 20 74 13 4 7 14 78 74 42 24 37 8 48 33 24 [3729] 41 51 61 56 76 0 49 10 36 21 65 26 16 76 69 59 [3745] 21 1 45 19 6 69 32 70 40 78 61 37 46 60 15 42 [3761] 21 10 42 32 56 17 74 2 28 1 21 1 38 64 27 78 [3777] 52 31 59 15 67 74 31 62 65 34 24 3 24 42 37 34 [3793] 57 41 2 56 70 34 16 71 68 53 48 32 0 24 33 13 [3809] 35 11 56 24 37 40 31 66 43 47 45 36 19 37 31 67 [3825] 46 5 46 56 46 58 33 11 65 18 69 4 76 7 71 36 [3841] 54 71 14 17 7 77 36 2 23 55 79 22 68 65 13 61 [3857] 20 61 68 61 33 42 8 26 15 53 11 36 5 29 17 49 [3873] 42 27 17 45 39 44 43 36 58 72 53 2 18 10 11 54 [3889] 50 38 34 19 58 53 78 27 68 73 35 1 65 12 24 44 [3905] 24 13 17 30 5 15 50 64 37 50 64 33 10 11 30 45 [3921] 35 38 4 10 35 78 63 25 21 76 26 27 56 3 53 65 [3937] 68 73 54 55 35 36 75 11 46 66 19 4 6 61 64 25 [3953] 32 30 43 70 6 37 5 59 37 5 73 77 23 73 66 25 [3969] 26 19 18 58 0 9 53 37 10 57 61 44 31 11 47 36 [3985] 41 6 13 11 57 28 6 49 10 72 59 8 64 74 19 14 [4001] 65 19 52 45 61 53 43 47 74 63 42 44 55 8 31 38 [4017] 71 45 29 43 34 56 31 59 9 10 57 53 71 62 79 63 [4033] 10 58 61 26 53 45 17 17 22 44 14 67 74 55 18 18 [4049] 3 42 60 51 23 27 37 77 15 37 69 8 34 14 48 63 [4065] 11 48 22 11 46 64 66 52 52 11 0 63 17 44 22 68 [4081] 55 25 68 15 17 55 16 60 20 23 3 24 26 10 20 16 [4097] 60 72 20 22 31 63 53 37 37 62 16 8 16 69 30 63 [4113] 1 46 71 53 10 71 18 28 28 31 29 11 15 62 70 76 [4129] 5 34 5 36 55 73 48 30 75 51 5 49 3 10 5 21 [4145] 51 6 74 4 75 45 53 6 40 48 59 33 65 50 22 15 [4161] 11 44 35 70 70 15 66 24 46 13 70 21 37 31 59 67 [4177] 33 0 75 52 61 23 51 43 9 55 66 17 63 74 19 22 [4193] 27 11 9 52 18 13 53 26 26 51 56 39 9 41 50 30 [4209] 67 48 31 16 49 11 11 2 28 62 53 30 49 79 32 31 [4225] 54 19 51 41 14 9 4 29 71 11 12 34 69 4 12 31 [4241] 18 52 73 13 
23 47 2 76 34 32 23 68 22 41 36 52 [4257] 62 37 9 49 34 33 8 70 18 67 25 43 26 3 22 39 [4273] 61 47 61 38 69 9 17 61 36 74 39 21 70 48 69 58 [4289] 54 30 48 67 2 5 34 43 30 29 9 58 32 10 11 9 [4305] 52 47 17 45 27 32 53 16 40 35 54 72 1 25 51 6 [4321] 32 66 13 71 36 36 47 31 66 20 40 28 27 68 18 75 [4337] 15 28 63 46 37 69 54 29 13 75 22 79 29 69 36 50 [4353] 77 32 25 32 57 57 37 62 6 76 47 72 5 66 45 70 [4369] 48 24 78 56 32 73 70 49 20 24 13 35 19 38 39 56 [4385] 70 33 67 38 73 63 69 4 62 7 4 9 32 76 30 11 [4401] 7 75 23 79 7 56 16 64 47 59 40 60 65 0 17 63 [4417] 13 69 61 44 42 50 15 73 22 48 50 40 13 70 74 31 [4433] 48 44 39 46 79 78 27 26 28 8 28 5 39 10 79 49 [4449] 66 34 74 6 24 30 41 24 70 61 27 34 65 45 58 57 [4465] 35 47 57 35 29 55 1 11 24 71 33 79 45 29 62 55 [4481] 78 38 27 20 63 19 32 33 22 52 17 74 6 46 44 55 [4497] 72 58 23 40 38 60 0 65 14 18 68 21 0 4 19 69 [4513] 57 67 46 6 31 2 10 29 6 70 24 67 35 0 56 66 [4529] 8 41 24 63 74 74 51 7 51 31 66 58 28 75 23 60 [4545] 75 49 7 8 68 8 70 37 72 30 53 15 48 30 79 53 [4561] 20 8 1 26 60 44 30 0 50 40 16 35 39 57 77 41 [4577] 58 37 66 34 50 27 76 8 65 47 7 9 14 6 48 13 [4593] 8 6 29 6 72 26 60 1 38 16 74 8 50 7 33 17 [4609] 72 15 32 66 45 72 50 63 12 17 47 46 14 35 77 4 [4625] 76 4 14 8 68 23 65 25 45 3 18 13 26 0 69 22 [4641] 72 31 42 35 40 24 60 38 77 3 59 60 4 54 71 64 [4657] 76 62 21 38 78 26 6 59 77 10 25 74 31 67 77 42 [4673] 50 10 17 54 79 20 30 63 46 38 5 28 53 32 30 1 [4689] 54 76 45 37 36 32 13 13 58 74 46 60 57 9 40 60 [4705] 50 26 0 58 14 31 8 69 5 34 74 78 35 44 71 19 [4721] 74 27 17 71 49 34 73 68 49 16 69 31 51 57 45 69 [4737] 2 52 76 71 5 52 13 77 69 22 42 79 42 27 36 16 [4753] 12 70 25 33 66 17 75 67 67 66 32 3 79 49 70 19 [4769] 62 5 2 60 10 56 78 7 25 56 30 66 2 1 45 32 [4785] 36 26 63 58 39 28 15 73 10 64 78 64 17 60 16 67
\[\begin{aligned} \text{mean}(\mathbf x)&=\frac{1}{N}\sum_i x_i\\ &=\bar{\mathbf x}\end{aligned}\]
mean(x) is the average of x
\(\bar{\mathbf x}\) is the value that results in the smallest squared error
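The claim about the smallest squared error can be checked numerically. The sketch below uses a small made-up vector (the values and the helper name `squared_error` are assumptions for illustration, not part of the lecture data) and searches a grid of candidate values.

```r
# Hypothetical small data set, chosen only for illustration
x <- c(2, 3, 5, 7, 11)

# Sum of squared errors when every element is replaced by a single value `a`
squared_error <- function(a, x) sum((x - a)^2)

# Try many candidate values and keep the one with the smallest squared error
candidates <- seq(from = 0, to = 12, by = 0.1)
errors <- sapply(candidates, squared_error, x = x)
best <- candidates[which.min(errors)]

best     # candidate with the smallest squared error
mean(x)  # the average of x
```

On this grid the best candidate coincides with the average, which is what the statement above predicts.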
mean(pop_LD)
[1] 4.5
mean(pop_MD)
[1] 14.5
mean(pop_HD)
[1] 39.5
\[\text{var}(\mathbf x)=\frac{1}{N}\sum_i (x_i-\bar{\mathbf x})^2\]
Variance is the mean squared error of the average
We explained this last semester
var(pop_LD)
[1] 8.25
var(pop_MD)
[1] 74.9
var(pop_HD)
[1] 533
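A detail worth keeping in mind: the formula above divides by \(N\), while R's `var()` divides by \(N-1\). For a population of 4800 elements the difference is invisible at three digits. A minimal sketch, rebuilding `pop_LD` as in the text (the name `pop_var` is introduced here only for illustration):

```r
# Same construction as pop_LD in the text (shuffling does not change the variance)
N <- 4800
pop_LD <- sample(rep(0:9, N/10))

# Population variance, dividing by N as in the formula
pop_var <- mean((pop_LD - mean(pop_LD))^2)

pop_var                    # divides by N
var(pop_LD)                # divides by N-1, so slightly larger
var(pop_LD) * (N - 1) / N  # rescaled to match the formula
```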
\[\begin{aligned} \text{sd}(\mathbf x)&=\sqrt{\text{var}(\mathbf x)}\\ &=\sqrt{\frac{1}{N}\sum_i (x_i-\bar{\mathbf x})^2}\end{aligned}\]
sd(pop_LD)
[1] 2.87
sd(pop_MD)
[1] 8.66
sd(pop_HD)
[1] 23.1
We care about sd(x) because it tells us how close the mean is to most of the population
Chebyshev's inequality (named after the Russian mathematician Chebyshev): it can be proved that, always, \[\Pr(\vert x_i-\bar{\mathbf x}\vert\geq k\cdot\text{sd}(\mathbf x))\leq 1/k^2\]
In other words, the probability that “the distance between the mean \(\bar{\mathbf x}\) and any element \(x_i\) is at least \(k\cdot\text{sd}(\mathbf x)\)” is at most \(1/k^2\)
It is always valid, for any probability distribution
(Later we will see better rules valid only sometimes)
It can also be written as \[\Pr(\vert x_i-\bar{\mathbf x}\vert\leq k\cdot\text{sd}(\mathbf x))\geq 1-1/k^2\]
The probability that “the distance between the mean \(\bar{\mathbf x}\) and any element \(x_i\) is at most \(k\cdot\text{sd}(\mathbf x)\)” is at least \(1-1/k^2\)
Another way to understand the meaning of this theorem is \[\Pr(\bar{\mathbf x} -k\cdot\text{sd}(\mathbf x)\leq x_i \leq \bar{\mathbf x} +k\cdot\text{sd}(\mathbf x))\geq 1-1/k^2\] Replacing \(k\) with some values, we get \[\begin{aligned} \Pr(\bar{\mathbf x} -1\cdot\text{sd}(\mathbf x)\leq x_i \leq \bar{\mathbf x} +1\cdot\text{sd}(\mathbf x))&\geq 1-1/1^2=0\\ \Pr(\bar{\mathbf x} -2\cdot\text{sd}(\mathbf x)\leq x_i \leq \bar{\mathbf x} +2\cdot\text{sd}(\mathbf x))&\geq 1-1/2^2=0.75\\ \Pr(\bar{\mathbf x} -3\cdot\text{sd}(\mathbf x)\leq x_i \leq \bar{\mathbf x} +3\cdot\text{sd}(\mathbf x))&\geq 1-1/3^2=0.889 \end{aligned}\]
stats.libretexts.org
For any numerical data set
pop_HD
These values should be less than 1, 0.25 and 0.11
mean(abs(pop_HD-mean(pop_HD))> 1*sd(pop_HD))
[1] 0.425
mean(abs(pop_HD-mean(pop_HD))> 2*sd(pop_HD))
[1] 0
mean(abs(pop_HD-mean(pop_HD))> 3*sd(pop_HD))
[1] 0
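The same check can be repeated for the three populations at once. This is a sketch only: the helper name `fraction_outside` is an assumption introduced here, and the populations are rebuilt as in the earlier code.

```r
N <- 4800
pop_LD <- sample(rep(0:9, N/10))
pop_MD <- sample(rep(0:29, N/30))
pop_HD <- sample(rep(0:79, N/80))

# Fraction of elements farther than k standard deviations from the mean
# (`fraction_outside` is a hypothetical helper name)
fraction_outside <- function(x, k) {
  mean(abs(x - mean(x)) > k * sd(x))
}

# Compare the observed fractions with the Chebyshev bound 1/k^2
for (k in 1:3) {
  observed <- sapply(list(LD = pop_LD, MD = pop_MD, HD = pop_HD),
                     fraction_outside, k = k)
  print(c(k = k, bound = 1/k^2, observed))
}
```

Every observed fraction should stay below its bound, usually by a wide margin, because Chebyshev's inequality has to hold for any distribution.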
one_sample <- function(m, population) {
  return(sample(population, size=m))
}
one_sample(30, pop_HD)
[1] 32 40 62 43 71 47 44 17 41 27 65 11 69 23 46 19 7 [18] 30 29 34 17 58 70 75 53 15 68 57 29 35
Moreover, the sample average is often different from the population average
This explains why rural areas have the highest and lowest cancer rates
Rural populations are smaller, so their rates are averages taken over smaller groups, and such averages vary more
When the sample size is big, the sample average is closer to the population average
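One way to measure this effect is to draw many samples of each size and look at how much their averages vary. The code below is a sketch under assumptions: the list of sizes, the number of repetitions (1000) and the seed are choices made here, not values taken from the lecture. The names `size` and `sd_sample_mean` match the ones used in the regression code.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
N <- 4800
pop_HD <- sample(rep(0:79, N/80))

# Sample sizes to try (an assumed choice; the text does not list them)
size <- c(10, 20, 50, 100, 200, 500)

# For each size, draw many samples and measure how much their means vary
sd_sample_mean <- sapply(size, function(m) {
  sample_means <- replicate(1000, mean(sample(pop_HD, size = m)))
  sd(sample_means)
})

data.frame(size, sd_sample_mean)
```

The standard deviation of the sample means should shrink as the sample size grows.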
plot(log(sd_sample_mean)~log(size))
model_HD <- lm(log(sd_sample_mean)~log(size))
lines(predict(model_HD)~log(size))
coef(model_HD)
(Intercept)   log(size)
      3.242      -0.528
\[\log(\text{sd_sample_mean}) = 3.242 - 0.528\cdot\log(\text{size})\] \[\text{sd_sample_mean} = \exp(3.242) \cdot\text{size}^{-0.528} =25.587\cdot\text{size}^{-0.528}\]
\[\text{sd_sample_mean} = A\cdot \text{size}^B\]
| | A | B | std dev population |
|---|---|---|---|
| pop_LD | 3.282 | -0.5306 | 2.873 |
| pop_MD | 9.092 | -0.5126 | 8.656 |
| pop_HD | 25.59 | -0.528 | 23.09 |
Coefficient \(A\) is close to the standard deviation of the population
Coefficient \(B\) is close to -0.5
If we know the population standard deviation, we can predict the standard deviation of the sample mean
\[\text{sd(sample mean)}=\frac{\text{sd(population)}}{\sqrt{\text{sample size}}}\]
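The exponent \(-0.5\) is not an accident. A short derivation, assuming the \(n\) elements of the sample are drawn independently (the symbols \(X_j\), \(n\) and \(\sigma=\text{sd(population)}\) are introduced here for illustration):

\[\begin{aligned} \text{var}\left(\frac{1}{n}\sum_{j=1}^{n} X_j\right) &= \frac{1}{n^2}\sum_{j=1}^{n}\text{var}(X_j) = \frac{n\,\sigma^2}{n^2} = \frac{\sigma^2}{n}\\ \text{sd(sample mean)} &= \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}} \end{aligned}\]

Strictly speaking, `sample()` draws without replacement, so the elements are not fully independent. The difference is small when the sample is much smaller than the population.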
Using Chebyshev's inequality, we know that, with probability at least \(1-1/k^2\), \[\vert \text{mean(sample)} -\text{mean(population)}\vert < k\cdot\frac{\text{sd(population)}}{\sqrt{\text{sample size}}}\]
Therefore the population average is inside the interval \[\text{mean(sample)} \pm k\cdot\frac{\text{sd(population)}}{\sqrt{\text{sample size}}}\] (probably)
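The word "probably" can be made concrete with a simulation: build the interval around many sample means and count how often it contains the population mean. This is a sketch with assumed settings (`m = 30`, `k = 2`, 2000 repetitions, and an arbitrary seed).

```r
set.seed(2)  # arbitrary seed, only for reproducibility
N <- 4800
pop_HD <- sample(rep(0:79, N/80))

m <- 30   # sample size (assumed for this sketch)
k <- 2    # Chebyshev guarantees at least 1 - 1/k^2 = 0.75

half_width <- k * sd(pop_HD) / sqrt(m)

# Does the interval around one sample mean contain the population mean?
covers <- replicate(2000, {
  s <- sample(pop_HD, size = m)
  abs(mean(s) - mean(pop_HD)) < half_width
})

mean(covers)  # fraction of intervals containing the population mean
```

The observed fraction should be at least 0.75, and in practice it tends to be noticeably higher, since the Chebyshev bound is conservative.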
Remember that we know neither the population mean nor the population variance
So we do not know the population standard deviation 😕
In most cases we can use the sample standard deviation
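In practice the interval is then computed from the sample alone. A minimal sketch, assuming a single sample of size 30 and `k = 2` (both are choices made here for illustration):

```r
set.seed(3)  # arbitrary seed, only for reproducibility
pop_HD <- sample(rep(0:79, 4800/80))

# In a real experiment this sample is all we would have
s <- sample(pop_HD, size = 30)
k <- 2

# Use the sample standard deviation in place of the unknown population one
half_width <- k * sd(s) / sqrt(length(s))
interval <- c(lower = mean(s) - half_width, upper = mean(s) + half_width)

interval
mean(pop_HD)  # known here only because the population is simulated
```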
Francis Galton (1822–1911)
English explorer, inventor, anthropologist
Cousin of Charles Darwin
He studied medicine and mathematics at Cambridge University.
He invented the phrase “nature versus nurture”
In his will, he donated funds for a professorship in genetics to University College London.
We will simulate each ball one by one
one_ball <- function(M) {
  return(sum(sample(c(-1,1), size=M, replace=TRUE)))
}
Here M is the number of “left-right” choices made by the ball
Galton <- replicate(1000, one_ball(5))
barplot(table(Galton))
Galton <- replicate(10000, one_ball(50))
barplot(table(Galton))
It is easy to see that the population mean is 0 for any M
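With a little thought, we can show that the variance is M: each choice has variance 1, and the M choices are independent, so their variances add up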
Standard deviation will be sqrt(M)
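These three claims can be checked by simulation. A sketch, reusing `one_ball` from above with `M = 50` (the seed and the number of balls are arbitrary choices):

```r
set.seed(4)  # arbitrary seed, only for reproducibility

one_ball <- function(M) {
  return(sum(sample(c(-1, 1), size = M, replace = TRUE)))
}

M <- 50
Galton <- replicate(10000, one_ball(M))

mean(Galton)  # should be close to 0
var(Galton)   # should be close to M
sd(Galton)    # should be close to sqrt(M)
```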
Galton <- replicate(10000, one_ball(5))/sqrt(5)
barplot(table(Galton))
Galton <- replicate(10000, one_ball(50))/sqrt(50)
barplot(table(Galton))
Galton <- replicate(100000, one_ball(500))/sqrt(500)
barplot(table(Galton))
Galton <- replicate(100000, one_ball(5000))/sqrt(5000)
barplot(table(Galton))
This “bell shaped” curve is found in many experiments, especially when they involve the sum of many small contributions
It is called the Gaussian distribution, or the Normal distribution
Instead of simulating the Galton machine several times, we can simulate the Normal distribution using the R function
rnorm(n, mean = 0, sd = 1)
The parameter n is mandatory. It is the sample size
You can also change the mean and the standard deviation of the simulation
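To see that `rnorm()` can stand in for the Galton machine, we can compare how much of each simulation falls within 1, 2 and 3 standard deviations of the mean. This is a sketch only; the seed, the number of balls and `M = 500` are assumptions chosen for illustration.

```r
set.seed(5)  # arbitrary seed, only for reproducibility

one_ball <- function(M) {
  return(sum(sample(c(-1, 1), size = M, replace = TRUE)))
}

# Galton machine, rescaled to have standard deviation 1
Galton <- replicate(10000, one_ball(500)) / sqrt(500)

# Direct simulation of the Normal distribution
normal <- rnorm(n = 10000, mean = 0, sd = 1)

# Fraction of values within k standard deviations of 0, for k = 1, 2, 3
sapply(1:3, function(k) {
  c(k = k,
    Galton = mean(abs(Galton) <= k),
    normal = mean(abs(normal) <= k))
})
```

The two rows of fractions should be close to each other, although not identical, because the Galton values are discrete and both simulations are random.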
In Class 8 (and Homework 8) we predicted the final outcome of the water formation for each value of r1_rate, keeping the other values fixed
rates_of_r1 <- 10^seq(from=-3, to=-1, length=50)
r2_rate=0.01, H_ini=2, O_ini=1, W_ini=0
Now we are not so sure about how much hydrogen we had at the beginning. Instead of keeping H_ini=2 fixed, we will simulate it, taking values from
H_values <- rnorm(n=6, mean=2, sd=0.05)