December 10, 2018
Nobody uploaded their video. No raw data
The summary of all your data is at
https://anaraven.bitbucket.io/static/2018/cmb1/fall-raw.txt
I also analyzed my video. My data is at
https://anaraven.bitbucket.io/static/2018/cmb1/free-fall.txt
fall <- read.table("free-fall.txt", header=TRUE) fall
msec position replica 1 2355 10 A 2 2457 20 A 3 2533 30 A 4 2558 40 A 5 2639 50 A 6 2690 60 A 7 2742 70 A 8 2742 80 A 9 2797 90 A 10 1837 10 B 11 1970 20 B 12 2022 30 B 13 2104 40 B 14 2129 50 B 15 2210 60 B 16 2237 70 B 17 2262 80 B 18 2313 90 B
plot(position~msec, data=fall)
msec
. It means millisecondsLet’s add a new column, called time
, with the correct value of seconds
fall$time <- fall$msec/1000
Now, for each replica, we change time
to start at 0
old <- fall$time[fall$replica=="A"] fall$time[fall$replica=="A"] <- old - min(old) old <- fall$time[fall$replica=="B"] fall$time[fall$replica=="B"] <- old - min(old)
We should also transform position to meters. The first data should be at 0
Each black/white strip was a little more than 19 centimeters
fall$height <- fall$position/10 * 0.192 old <- fall$height[fall$replica=="A"] fall$height[fall$replica=="A"] <- min(old) - old old <- fall$height[fall$replica=="B"] fall$height[fall$replica=="B"] <- min(old) - old
fall[fall$replica=="A",]
msec position replica time height 1 2355 10 A 0.000 0.000 2 2457 20 A 0.102 -0.192 3 2533 30 A 0.178 -0.384 4 2558 40 A 0.203 -0.576 5 2639 50 A 0.284 -0.768 6 2690 60 A 0.335 -0.960 7 2742 70 A 0.387 -1.152 8 2742 80 A 0.387 -1.344 9 2797 90 A 0.442 -1.536
fall[fall$replica=="B",]
msec position replica time height 10 1837 10 B 0.000 0.000 11 1970 20 B 0.133 -0.192 12 2022 30 B 0.185 -0.384 13 2104 40 B 0.267 -0.576 14 2129 50 B 0.292 -0.768 15 2210 60 B 0.373 -0.960 16 2237 70 B 0.400 -1.152 17 2262 80 B 0.425 -1.344 18 2313 90 B 0.476 -1.536
plot(height~time, data=fall, pch=as.character(replica))
This is a parabola, which can be modeled with a second degree polynomial \[y=a + b\cdot x + c\cdot x^2\] When we use a polynomial model, we need to add new columns to our data frame
fall$time_sq <- fall$time^2
This is used only for the model
model_A <- lm(height ~ time + time_sq, data=fall, subset=replica=="A") model_A
Call: lm(formula = height ~ time + time_sq, data = fall, subset = replica == "A") Coefficients: (Intercept) time time_sq -0.001309 -1.557507 -4.266640
model_B <- lm(height ~ time + time_sq, data=fall, subset=replica=="B") model_B
Call: lm(formula = height ~ time + time_sq, data = fall, subset = replica == "B") Coefficients: (Intercept) time time_sq 0.003772 -1.083734 -4.570787
Since we choose that at time=0
we have height=0
, we know that (Intercept)
is 0
In other words, our model here is \[y= b\cdot x + c\cdot x^2\] The old \(a\) has a fixed value of 0
How do we do that in R?
new_model_A <- lm(height ~ time + time_sq + 0, data=fall, subset=replica=="A") new_model_A
Call: lm(formula = height ~ time + time_sq + 0, data = fall, subset = replica == "A") Coefficients: time time_sq -1.568 -4.249
new_model_B <- lm(height ~ time + time_sq + 0, data=fall, subset=replica=="B") new_model_B
Call: lm(formula = height ~ time + time_sq + 0, data = fall, subset = replica == "B") Coefficients: time time_sq -1.057 -4.612
If you remember your physics lessons, you can recognize that in the formula \[y= b\cdot x + c\cdot x^2\] the value \(b\) is the initial speed, and \(c=-g/2\)
Thus, we can find the gravitational acceleration in the second coefficient
The gravitational acceleration is constant and independent of the replica
But the initial speed depends on the replica
We want to combine both replicas to get
combi <- lm(height ~ time:replica + time_sq + 0, data=fall) combi
Call: lm(formula = height ~ time:replica + time_sq + 0, data = fall) Coefficients: time_sq time:replicaA time:replicaB -4.456 -1.493 -1.117
The symbol :
means interaction
fall-raw.txt
in picturestime time_sq A -0.460 -5.19 B -0.318 -4.74 C -1.417 -4.65 D -0.761 -4.25 E -1.640 -3.07 F -1.545 -4.03 G -0.354 -5.67 H -0.568 -4.27 I -1.519 -4.62
time time_sq J -0.880 -5.07 K -1.433 -3.31 L -0.822 -4.04 M -0.460 -5.19 N -0.281 -4.81 O -1.292 -4.97 P -1.110 -5.25 Q -2.248 -3.60 R -2.290 -3.40
combi <- lm(position ~ time:replica + time_sq + 0, data=fall) coef(combi)
time_sq time:replicaA time:replicaB -4.421 -0.778 -0.457 time:replicaC time:replicaD time:replicaE -1.501 -0.688 -1.058 time:replicaF time:replicaG time:replicaH -1.376 -0.941 -0.488 time:replicaI time:replicaJ time:replicaK -1.590 -1.129 -0.979 time:replicaL time:replicaM time:replicaN -0.657 -0.778 -0.452 time:replicaO time:replicaP time:replicaQ -1.490 -1.411 -1.974 time:replicaR -1.947
+0
in the formula forces the intercept to 0:
allows us to combine two variables and measure their interactionsurvey
datasurvey <- read.table("survey1-tidy.txt") colnames(survey)
[1] "Gender" "birth_day" "birth_month" [4] "birth_year" "height_cm" "weight_kg" [7] "handness" "hand_span_cm"
plot(weight_kg~Gender, data=survey)
lm(weight_kg~Gender, data=survey)
Call: lm(formula = weight_kg ~ Gender, data = survey) Coefficients: (Intercept) GenderMale 57.8 18.7
model <- lm(weight_kg ~ Gender, data = survey)
Call: lm(formula = weight_kg ~ Gender, data = survey) Coefficients: (Intercept) GenderMale 57.8 18.7
Here (Intercept)
is “normal weight”, and GenderMale
is the extra weight when Gender=='Male'
(on average)
\[\text{weight_kg} = coef(model)[1] + coef(model)[2]\cdot\text{[Gender is 'Male']}\]
lm(weight_kg ~ Gender + 0, data = survey)
Call: lm(formula = weight_kg ~ Gender + 0, data = survey) Coefficients: GenderFemale GenderMale 57.8 76.6
Now the formula includes +0
. This means “do not use intercept”. Now
GenderMale
is the average weight when Gender=='Male'
GenderFemale
is the average weight when Gender=='Female'
Let’s start with the simplest model
plot(weight_kg ~ height_cm, data=survey, col=Gender) model1 <- lm(weight_kg ~ height_cm, data=survey) points(predict(model1) ~ height_cm, data=survey, col="green3", pch=16)
model1
Call: lm(formula = weight_kg ~ height_cm, data = survey) Coefficients: (Intercept) height_cm -81.504 0.862
Obviously the weight also depends on the gender
plot(weight_kg ~ height_cm, data=survey, col=Gender) model2 <- lm(weight_kg ~ height_cm + Gender+0, data = survey) points(predict(model2) ~ height_cm, data=survey, col="green3", pch=16)
model2
Call: lm(formula = weight_kg ~ height_cm + Gender + 0, data = survey) Coefficients: height_cm GenderFemale GenderMale 0.279 11.688 26.896
There are two lines with the same slope: is the coefficient for height_cm
The “intercept” of each line are the coefficients GenderFemale
and GenderMale
+0
in the formula:
to indicate interaction*
in formulas