April 19, 2019
It is about epidemiology
The same models are used for real diseases
N <- 28*24 movie <- horror_movie(N, human_ini=2, dead_ini=1, zombie_ini=0, attack_rate = 5e-3, kill_rate=1e-3, voodoo_rate=1e-2)
Please make a vector called rate_of_kill
with one hundred values between 1e-3
and 1e-1
.
Then create two empty vectors called final_human
and final_zombie
with the same length of rate_of_kill
.
For each value in rate_of_kill
simulate the horror_movie
system with the same initial values, attack rate and voodoo rate, and changing kill_rate
for the corresponding value in rate_of_kill
. Store the value of human
at the last hour of each simulation in the vector final_human
. Do the same with the last value of zombie
, storing it into final_zombie
.
rate_of_kill <- seq(1e-3, 1e-1, length.out = 100) final_human <- rep(NA, length(rate_of_kill)) final_zombie <- rep(NA, length(rate_of_kill)) for(i in 1:length(rate_of_kill)) { movie <- horror_movie(N, human_ini=2, dead_ini=1, zombie_ini=0, attack_rate = 5e-3, kill_rate=rate_of_kill[i], voodoo_rate=1e-2) final_human[i] <- movie$human[N] final_zombie[i] <- movie$zombie[N] }
rate_of_kill <- seq(1e-3, 1e-1, length.out = 100) human_ini=2, dead_ini=1, zombie_ini=0, attack_rate = 5e-3, voodoo_rate=1e-2
mix <- water_formation(N=200, r1_rate=0.01, r2_rate=0.01, H_ini=2, O_ini=1, W_ini=0) plot(Water~Step, data=mix, ylim=c(0, 1), type="l")
r1_rate
rates_of_r1 <- 10^seq(from=-3, to=-1, length=5) r2_rate=0.01, H_ini=2, O_ini=1, W_ini=0
rates_of_r1 <- 10^seq(from=-3, to=-1, length=5) r2_rate=0.01, H_ini=2, O_ini=1, W_ini=0
r1_rate
valuesrates_of_r1 <- seq(from=0.001, to=0.1, length=50) r2_rate=0.01, H_ini=2, O_ini=1, W_ini=0
final_Water <- rep(NA, length(rates_of_r1)) for(i in 1:length(rates_of_r1)) { mix <- water_formation(N=100, r1_rate=rates_of_r1[i], r2_rate=0.01, H_ini=2, O_ini=1, W_ini=0) final_Water[i] <- mix$Water[100] } plot(rates_of_r1, final_Water)
We can measure how much water we got. If at the end we have 0.5 mol of water, then … … with the curve we can find what was the reaction rate
We observe nature
We want to understand Nature
We see that Nature has rules
Everything happens with a cause
The systems we have seen are used to explain what happens
They are simplifications of reality
We discard details that we think are irrelevant
If the system makes bad prediction, the system is wrong
water[i]
hydrogen[i]
In any fixed time, the state has fixed values
water[i]
, hydrogen[i]
The “arrows” of the system do not change with time
The new state depends only on the past states, nothing else
If we know the initial state and rate constant, we can calculate everything
“An intelligence knowing all the forces acting in nature at a given instant, as well as the momentary positions of all things in the universe, would be able to comprehend in one single formula the motions of the largest bodies as well as the lightest atoms in the world; to it nothing would be uncertain, the future as well as the past would be present to its eyes.”
Pierre-Simon Laplace, French mathematician and astronomer (1749 – 1827)
“all the forces acting in nature at a given instant, as well as the momentary positions of all things in the universe”
If we know the current state …
“would be able to comprehend in one single formula the motions of the largest bodies as well as the lightest atoms in the world
And we know the system…
“nothing would be uncertain, the future as well as the past would be present”
Then we can know exactly what is in the future
All the future is determined
by the initial state
and the system rules
Sometimes models are too simple and they miss important parts
If the system is realistic, we can predict
We need to measure initial values and rates
But all measurements have a margin of error
In real life we cannot measure exact values
We can only measure within a margin of error
For example, when we measure 2.0 mol of Hydrogen, it may be 1.95 or 2.05 or any value in between
H_ini
r1_rate
evaluationEven if we make errors, their effect is not very important
To study this system, we will assume that death rate and birth rate depend on a single value A
d_x[i] <- (A-1) * x[i-1] - 2 * A/2 * x[i-1]*x[i-1] x[i] <- x[i-1] + d_x[i]
in other words
x[i] <- x[i-1] + (A-1) * x[i-1] - 2 * A/2 * x[i-1]*x[i-1]
so finally
x[i] <- A * x[i-1] * (1 - x[i-1])
quad_map <- function(N, A, x_ini) { x <- rep(NA, N) x[1] <- x_ini for(i in 2:N) { x[i] <- A * x[i-1] * (1 - x[i-1]) } return(x) }
x_ini <- c(0.4, 0.45, 0.5, 0.55, 0.6) ans <- data.frame(V1=quad_map(N=100, x_ini=x_ini[1], A=2)) plot(ans$V1, ylim=c(0,1), ylab="x", type="b") for(i in 2:length(x_ini)) { ans[[i]] <- quad_map(N=100, x_ini=x_ini[i], A=2) points(ans[[i]], pch=i, type="b") }
This is called attractor
A
. Here A=3
Now there are two final states.
This is a periodic attractor
A
. Here A=3.5
Now we have four final states.
Also a periodic attractor
A
. Here A=3.8
We do not see a pattern here
A
. Here A=3.95
Similar initial states, very different results
x_ini
, big changes in resultInitial values:
x_ini
[1] 0.40 0.45 0.50 0.55 0.60
Final values:
ans[100,]
V1 V2 V3 V4 V5 100 0.07712169 0.7034133 0.8541633 0.7034133 0.4900459
Here we see the final 500 states for different values of A Homework: do this plot. More details later
The fly of a butterfly in Istanbul can produce an hurricane in Mexico
Small changes have big consequences
We cannot make exact predictions
But we can still say what is normal
What is the most probable behavior
We can identify patterns using the tools of …. (sound of drums)
People think that probabilities are about games
Instead they are really tools for thinking
Thinking about decisions when we have incomplete information
Thinking about the future
About the meaning of our experiment results
Do you travel?
Travel insurance, Health insurance
maybe other toys that are easy to understand
These are just to have easy examples
Each device has a set of possible outcomes:
For example a die has the following outcomes
⚀ ⚁ ⚂ ⚃ ⚄ ⚅
🂡 🂢 🂣 🂤 🂥 🂦 🂧 🂨 🂩 🂪 🂫 🂭 🂮 🂱 🂲 🂳 🂴 🂵 🂶 🂷 🂸 🂹 🂺 🂻 🂽 🂾 🃁 🃂 🃃 🃄 🃅 🃆 🃇 🃈 🃉 🃊 🃋 🃍 🃎 🃑 🃒 🃓 🃔 🃕 🃖 🃗 🃘 🃙 🃚 🃛 🃝 🃞 🃟
♠︎♣︎♡♢
Four symbols can be used to represent DNA
A, C, G, T
Head, Tail also written as H, T
c("H","T")
1:6
c("a","c","g","t")
LETTERS
letters
sample()
Try this
sample(LETTERS)
[1] "O" "S" "Z" "R" "C" "K" "X" "I" "J" "W" "E" "P" "H" "Y" "B" "A" "Q" [18] "N" "D" "M" "V" "F" "G" "U" "L" "T"
sample(LETTERS)
[1] "A" "X" "I" "M" "F" "G" "B" "J" "V" "D" "C" "R" "Y" "O" "U" "N" "H" [18] "W" "Q" "L" "K" "Z" "P" "T" "E" "S"
sample()
is shufflingThe output has the same elements of the input but in a different order
Each element appears only once
The order changes every time
Try this
sample(LETTERS, size=10)
[1] "R" "B" "U" "F" "W" "N" "L" "G" "Z" "O"
We get 10 letters. Some, but not all input elements
Each element appears only once
Try this
sample(c("H","T"), size=10)
Error in sample.int(length(x), size, replace, prob): cannot take a sample larger than the population when 'replace = FALSE'
We run out of elements
Try this
sample(c("H","T"), size=10, replace=TRUE)
[1] "H" "T" "T" "T" "T" "T" "H" "T" "T" "T"
Each element can appear several times
Shuffle, take one, replace it on the set
Most of times we will use sample()
with replace=TRUE
table(sample(c("a","c","g","t"), size=40, replace=TRUE))
a c g t 8 16 6 10
table(sample(c("a","c","g","t"), size=40, replace=TRUE))
a c g t 5 8 10 17
table(sample(c("a","c","g","t"), size=40, replace=TRUE))
a c g t 13 13 7 7
Each result is different
table(sample(c("a","c","g","t"), size=400, replace=TRUE))
a c g t 90 124 88 98
table(sample(c("a","c","g","t"), size=400, replace=TRUE))
a c g t 115 97 97 91
table(sample(c("a","c","g","t"), size=400, replace=TRUE))
a c g t 120 105 79 96
When size
increases, the frequency of each letter also increases
table(sample(c("a","c","g","t"), size=4000, replace=TRUE))
a c g t 1002 975 1015 1008
table(sample(c("a","c","g","t"), size=4000, replace=TRUE))
a c g t 992 995 1025 988
table(sample(c("a","c","g","t"), size=4000, replace=TRUE))
a c g t 988 998 1001 1013
When size
increases, the frequencies change less
table(sample(c("a","c","g","t"), size=40000, replace=TRUE))
a c g t 10098 9970 9950 9982
table(sample(c("a","c","g","t"), size=40000, replace=TRUE))
a c g t 9948 9992 10088 9972
table(sample(c("a","c","g","t"), size=40000, replace=TRUE))
a c g t 9980 9914 10157 9949
Each frequency is very close to 1/4 of size
table(sample(c("a","c","g","t"), size=400000, replace=TRUE))
a c g t 100165 99999 99621 100215
table(sample(c("a","c","g","t"), size=400000, replace=TRUE))
a c g t 100172 99885 99776 100167
table(sample(c("a","c","g","t"), size=400000, replace=TRUE))
a c g t 100002 100216 99604 100178
If size
increases a lot, the relative frequencies are 1/4 each
Absolute frequency is how many times we see each value
The sum of all absolute frequencies is the Total number of cases
Relative frequency is the proportion of each value in the total
The sum of all relative frequencies is always 1.
table(sample(c("a", "c", "g", "t"), size=1000000, replace=TRUE))/1000000
a c g t 0.250345 0.249476 0.249657 0.250522
What will be each relative frequency when size
is
BIG
In this case we can find it by thinking
This ideal relative frequency is called Probability
Each device or random system may have some preferred outcomes
All outcomes are possible, but some can be probable
In general we do not know each probability
But we can estimate it using the relative frequency
That is what we will do in this course
Write the code to make the plots of this class