November 29, 2018
Using experimental data from the big coil, we fitted the linear model \[\text{n_marbles}=A+B\cdot \text{length}\] and compared it with the formula from Hooke’s Law \[\text{force}=K\cdot(L-\text{length})\] where \(L\) is the natural length of the spring
Each marble has mass \(m\). The force points down, so \[-m g\cdot\text{n_marbles}=K\cdot(L-\text{length})\] which can be re-written as \[\text{n_marbles}=\underbrace{-\frac{K}{m g}\cdot L}_{A} + \underbrace{\frac{K}{m g}}_{B}\cdot\text{length} \]
In this case it is easier to use centimeters, grams, and seconds (CGS units)
Thus, force is measured in dynes and length in cm, and the gravitational acceleration is \(g=980\ \text{cm/s}^2\)
Looking at Hooke’s law \[\text{force}=K\cdot(L-\text{length})\] we can see that \(K\) is measured in dyne/cm
Our model is \[\text{n_marbles}=\underbrace{-\frac{K}{m g}\cdot L}_{\texttt{coef(model)[1]}} + \underbrace{\frac{K}{m g}}_{\texttt{coef(model)[2]}}\cdot\text{length}\] therefore \[K=\texttt{coef(model)[2]}\cdot m\cdot g\]
Taking the mass of each marble as 20 g and \(g=980\ \text{cm/s}^2\), we calculate the elasticity constant as
coef(model)[2] * 20 * 980
length 
  3750 
The units are dyne/cm.
The label length in the output comes from the name of coef(model)[2]
Up to the change of units, this is the same result as before, and this time the units are correct
This is the same as last class. Since
\[\texttt{coef(model)[1]}=-\frac{K}{m g}\cdot L = -\texttt{coef(model)[2]}\cdot L \] we can calculate \(L\) as \[L=-\frac{\texttt{coef(model)[1]}}{\texttt{coef(model)[2]}} = 75.922\]
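The whole recipe can be checked end-to-end on made-up numbers. In this sketch \(K\), \(L\), and the marble counts are invented for illustration (they are not the class measurements); the point is only that the coefficients of the fitted line recover the physical constants:

```r
# Hypothetical spring: simulate marbles, fit the linear model,
# and recover K and L from the coefficients.
m <- 20                                  # g per marble (assumed)
g <- 980                                 # cm/s^2 (CGS)
K <- 19600                               # dyne/cm (assumed)
L <- 75                                  # natural length in cm (assumed)
n_marbles <- 0:10
len <- L + (m * g / K) * n_marbles       # length under each load, from Hooke's law

model <- lm(n_marbles ~ len)
coef(model)[2] * m * g                   # recovers K = 19600 dyne/cm
-coef(model)[1] / coef(model)[2]         # recovers L = 75 cm
```

(The variable is called len here to avoid masking R's built-in length() function.)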
We need very little math for our course: arithmetic, algebra, and logarithms
If \(x=p^m\) then \[\log_p(x) = m\]
If we use another base, for example \(q\), then \[\log_q(x) = m\cdot\log_q(p)\]
So if we use different bases, there is only a scale factor
The easiest one is natural logarithm
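The scale factor between bases is easy to see in R (a tiny illustration, independent of the class data):

```r
# Logs in different bases differ only by a constant factor
x <- 8
log2(x)              # logarithm base 2: gives 3
log(x) / log(2)      # same value, computed from natural logs
log(x, base = 2)     # same again, using R's base argument
```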
Basic linear model \[y=A+B\cdot x\] Exponential \[y=I\cdot R^x\qquad\log(y)=\log(I)+\log(R)\cdot x\] Power of \(x\) \[y=C\cdot x^E\qquad\log(y)=\log(C)+E\cdot\log(x)\]
The easiest way to decide between these models is to draw several plots, placing log() in different places, and seeing which one looks most like a straight line
For example, let’s analyze data from Kleiber’s Law (Physiological Reviews 1947 27:4, 511-541)
The following table shows a subset; the complete table has 26 animals
animal | kg | kcal |
---|---|---|
Mouse | 0.021 | 3.6 |
Rat | 0.282 | 28.1 |
Guinea pig | 0.410 | 35.1 |
Rabbit | 2.980 | 167.0 |
Cat | 3.000 | 152.0 |
Macaque | 4.200 | 207.0 |
Dog | 6.600 | 288.0 |
animal | kg | kcal |
---|---|---|
Goat | 36.0 | 800 |
Chimpanzee | 38.0 | 1090 |
Sheep ♂ | 46.4 | 1254 |
Sheep ♀ | 46.8 | 1330 |
Woman | 57.2 | 1368 |
Cow | 300.0 | 4221 |
Young cow | 482.0 | 7754 |
plot(kcal ~ kg, data=kleiber)
plot(log(kcal) ~ kg, data=kleiber)
plot(log(kcal) ~ log(kg), data=kleiber)
The plot that looks most like a straight line is the log-log plot.
Therefore we need a log-log model.
Depending on the context, we may want to use different versions of semi-log and log-log plots
For exploring the data ourselves, we plot the logarithms directly
plot(log(kcal) ~ kg, data=kleiber)
For publishing in a paper, it looks better to keep the original values and use logarithmic axes
plot(kcal ~ kg, data=kleiber, log="y")
Semi-log:
plot(log(kcal) ~ kg, data=kleiber)
plot(kcal ~ kg, data=kleiber, log="y")
Log-log:
plot(log(kcal) ~ log(kg), data=kleiber)
plot(kcal ~ kg, data=kleiber, log="xy")
model <- lm(log(kcal) ~ log(kg), data=kleiber)
coef(model)
(Intercept)     log(kg) 
      4.206       0.756 
If \[\log(kcal)=4.206 + 0.756\cdot \log(kg)\] then \[kcal=\exp(4.206) \cdot kg^{0.756}\]
Therefore:
“An animal’s metabolic rate scales to the ¾ power of the animal’s mass”.
Google it
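The back-transformed formula can be sanity-checked against the data, for example for the mouse (0.021 kg, measured 3.6 kcal):

```r
# Plug the mouse's mass into kcal = exp(4.206) * kg^0.756
exp(4.206) * 0.021^0.756     # about 3.6 kcal, close to the measured value
```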
Models are the essence of scientific research
They provide us with two important things: understanding and prediction
predict(model, newdata)
where newdata is a data frame with column names corresponding to the independent variables. If we omit newdata, the prediction uses the original data as newdata, so
predict(model) == predict(model, newdata=data)
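A minimal toy example (made-up data, not Kleiber's) showing how newdata works:

```r
# Fit a line to toy data, then predict at a new x value
d <- data.frame(x = 1:5, y = c(2, 4, 6, 8, 10))
m <- lm(y ~ x, data = d)
predict(m, newdata = data.frame(x = 6))   # extrapolates the line y = 2x
```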
animal | kg | kcal | predicted |
---|---|---|---|
Mouse | 0.021 | 3.6 | 1.28 |
Rat | 0.282 | 28.1 | 3.25 |
Guinea pig | 0.410 | 35.1 | 3.53 |
Rabbit | 2.980 | 167.0 | 5.03 |
Cat | 3.000 | 152.0 | 5.04 |
Macaque | 4.200 | 207.0 | 5.29 |
Dog | 6.600 | 288.0 | 5.63 |
animal | kg | kcal | predicted |
---|---|---|---|
Goat | 36.0 | 800 | 6.92 |
Chimpanzee | 38.0 | 1090 | 6.96 |
Sheep ♂ | 46.4 | 1254 | 7.11 |
Sheep ♀ | 46.8 | 1330 | 7.11 |
Woman | 57.2 | 1368 | 7.26 |
Cow | 300.0 | 4221 | 8.52 |
Young cow | 482.0 | 7754 | 8.88 |
We want to predict the metabolic rate, depending on the weight
The independent variable is \(kg\), the dependent variable is \(kcal\)
But our model uses only \(\log(kg)\) and \(\log(kcal)\)
So we have to undo the logarithm, using \(\exp()\)
predicted_kcal <- exp(predict(model))
animal | kg | kcal | predicted |
---|---|---|---|
Mouse | 0.021 | 3.6 | 3.62 |
Rat | 0.282 | 28.1 | 25.76 |
Guinea pig | 0.410 | 35.1 | 34.19 |
Rabbit | 2.980 | 167.0 | 153.11 |
Cat | 3.000 | 152.0 | 153.89 |
Macaque | 4.200 | 207.0 | 198.46 |
Dog | 6.600 | 288.0 | 279.29 |
animal | kg | kcal | predicted |
---|---|---|---|
Goat | 36.0 | 800 | 1007 |
Chimpanzee | 38.0 | 1090 | 1049 |
Sheep ♂ | 46.4 | 1254 | 1220 |
Sheep ♀ | 46.8 | 1330 | 1228 |
Woman | 57.2 | 1368 | 1429 |
Cow | 300.0 | 4221 | 5001 |
Young cow | 482.0 | 7754 | 7157 |
plot(log(kcal) ~ log(kg), data=kleiber)
lines(predict(model) ~ log(kg), data=kleiber)
## Visually
plot(kcal ~ kg, data=kleiber, log="xy")
lines(exp(predict(model)) ~ kg, data=kleiber)
The idea originated around 1970 with Gordon Moore, co-founder of Intel
The simple version of this law states that processor speeds double every two years
More specifically, it says that the number of transistors on an affordable CPU doubles every two years
(see paper)
plot(count~Date, data=trans)
plot(log(count) ~ Date, data=trans)
This time we have a straight line on the semi-log plot, that is, log(y) versus x
\[\log(y)=\log(I)+\log(R)\cdot x\] In this case the original relation is \[y=I\cdot R^x\]
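A quick sketch with invented numbers (not the transistor data) shows how exp() of the fitted coefficients recovers the initial value and the ratio:

```r
# Exponential data becomes a straight line after log(y);
# exp() of the coefficients recovers I and R.
I <- 3; R <- 1.5          # assumed initial value and growth ratio
x <- 0:10
y <- I * R^x
fit <- lm(log(y) ~ x)
exp(coef(fit))            # recovers I = 3 and R = 1.5
```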
model <- lm(log(count) ~ Date, data=trans)
exp(coef(model))
(Intercept)        Date 
  7.83e-295    1.41e+00 
plot(count ~ Date, data=trans, log="y")
lines(exp(predict(model)) ~ Date, data=trans)
Every year the number of transistors grows by a factor of
exp(coef(model)[2])
Date 
1.41 
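Since \(1.41\approx\sqrt{2}\), this yearly factor is exactly the familiar statement of the law:

```r
# Doubling time implied by a yearly growth factor of 1.41
log(2) / log(1.41)    # about 2 years, as Moore's law states
```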
In his book, John Lanchester says
"I was playing on Red only yesterday – I wasn’t really, but I did have a go on a machine that can process 1.8 teraflops.
"This Red equivalent is called the PS3: it was launched by Sony in 2005 and went on sale in 2006.
"Red was [the size of] a tennis court, used as much electricity as 800 houses, and cost US$55 million. The PS3 fits under the TV, runs off a normal power socket, and you can buy one for £200.
"[In 10 years], a computer able to process 1.8 teraflops went from being something that could only be made by the world’s richest government […], to something a teenager could expect [as a gift].