Please download the file quiz3.R and write your results there.
The Italian town of Volterra has a big problem with mice. (Note: mice is the plural of mouse.) There are many mice all around the town, running day and night, and eating all the cheese. The mayor of the city, Don Vito, got some cats to deal with the problem, but after a few years the mice came back again and again. Every time people think that the cats have eaten all the mice, the rodents return and everybody complains to the mayor.
[Figure: Don Vito, Mayor of Volterra.]
To understand why this happens, Don Vito has hired you as an expert on data analysis. You will work with Dr. Alfred J. Lotka, who will provide you with the data.
[Figure: Dr. Lotka, expert on cats and mice.]
Dr. Lotka has calculated that, if there are no cats, every day the mouse population increases by a birth_rate equal to 1.3% on average.
The mice run through the city streets every day and night, because the cats are hunting them. These cats can only survive and reproduce by catching mice. If a cat catches a mouse, the mouse dies and a new cat is born. Naturally, the number of mice captured (and cats born) depends on the mouse population size, the cat population size, and on how good the cats are at catching mice, which we call catch_rate. On average, if a cat meets a mouse, the mouse is captured 1.6% of the time. So the mice are really scared and are running at all times. If you don’t run, you die, they say in mouse language.
The cats are also running all day and night, because if they do not catch a mouse, they starve and die. Dr. Lotka has measured that the death_rate is equal to 0.7% every day. If you don’t run, you die, they say in cat language.
On the first day of the study there is 1 mouse and 3 cats. With the data available, the Mayor asks you: Will the cats eat all the mice? (Note: we chose these numbers to keep the exercise easy. If you are concerned about fractional cats, you can think of these numbers as thousands of cats and mice.)
To figure it out, we will build a computational model. The system can be represented by this diagram:
Your mission is to write a function called cat_and_mouse to simulate the cat and mouse populations for many days. The inputs of this function must be:
N, the number of days that we will simulate,
birth_rate, the birth rate for mice,
catch_rate, the death rate of mice and birth rate of cats,
death_rate, the death rate of cats.
All inputs are mandatory. The output of the function must be a data frame with four columns: mice, d_mice, cats, and d_cats.
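As a starting point, here is a minimal sketch of what such a function could look like. The update rule is an assumption inferred from the description above (mice born in proportion to the mouse population, mice caught in proportion to cats times mice, one new cat per catch, cats dying in proportion to the cat population); the diagram is not reproduced here, so check the sketch against it. Recording a change of 0 on the first day is also an assumption.

```r
# Sketch only: the update rule below is inferred from the text, not given by it.
cat_and_mouse <- function(N, birth_rate, catch_rate, death_rate) {
  # Pre-allocate one slot per simulated day
  mice   <- rep(NA_real_, N)
  d_mice <- rep(NA_real_, N)
  cats   <- rep(NA_real_, N)
  d_cats <- rep(NA_real_, N)

  # Initial conditions from the text: 1 mouse and 3 cats on the first day
  mice[1] <- 1
  cats[1] <- 3
  d_mice[1] <- 0   # assumption: no change is recorded for the first day
  d_cats[1] <- 0

  for (i in seq_len(N)[-1]) {   # days 2..N; the loop is skipped when N is 1
    # Mice caught during the previous day; each catch also produces one new cat
    caught <- catch_rate * cats[i - 1] * mice[i - 1]
    d_mice[i] <- birth_rate * mice[i - 1] - caught
    d_cats[i] <- caught - death_rate * cats[i - 1]
    mice[i] <- mice[i - 1] + d_mice[i]
    cats[i] <- cats[i - 1] + d_cats[i]
  }
  data.frame(mice, d_mice, cats, d_cats)
}
```

The daily changes are stored in d_mice and d_cats so that the four required columns come out of the same loop.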
If everything is right, you can test your function with the following code, which shows the history of cats and mice for the next 2000 days:
ans <- cat_and_mouse(N=2000, birth_rate=1.3/100,
                     catch_rate=1.6/100, death_rate=0.7/100)
plot(ans$cats, xlab="Days", ylab="Population", type="l")
lines(ans$mice, lty=2)
legend("top", legend=c("cats","mice"), lty = 1:2, inset=c(0,-0.2),
       horiz = TRUE, xpd = TRUE)
We can see that at the beginning the cats catch many mice, eating almost all of them. Then everybody is happy and thinks that the cats have won and the problem is solved. But then there are so few mice that the cats begin to starve and die. After nearly two years there are so few cats that the mouse population starts growing and the town is full of rodents again. So the Mayor can see that cats will not solve the problem.
We can also see the relationship between cats and mice in the state space diagram. (Note: the state of the system is the value of all parts of the system.) In this plot we can see what happens in the system in general, independent of the day.
Dr. Lotka thinks that these mice have some kind of genetic resistance to the cats, so we should analyze the mouse genome. He wants to see if there is any atypical gene in the mouse chromosome 19 that can be useful in his research. Your mission is to identify which genes are not like the others.
Please download the file mouse-chr19-genes.fna and store it on your disk. Then write the code to load the FASTA file in R and find how many genes are in the chromosome. The result should be:
## [1] 2329
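One way to approach this, assuming the seqinr package is installed, is to read the file with read.fasta() and count the elements of the resulting list. The sketch below uses a small temporary FASTA file with two invented records so that it runs on its own; for the exercise you would pass the path to mouse-chr19-genes.fna instead.

```r
library(seqinr)  # assumed to be installed

# Toy FASTA file standing in for mouse-chr19-genes.fna;
# the two records are invented for illustration only.
toy_file <- tempfile(fileext = ".fna")
writeLines(c(">gene_A", "ATGGCGTAA",
             ">gene_B", "ATGAAATTTTAG"), toy_file)

# read.fasta() returns a list with one element per sequence in the file
genes <- read.fasta(toy_file)

# The number of sequences is the length of that list
length(genes)
```

Each element of the list is named after its FASTA header, which is useful later when you need to report which gene has a given property.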
Now you have to calculate the GC content of each gene. (Note: this is not the GC-skew we saw in class. The GC content is \[\frac{G+C}{A+C+G+T}.\]) Fortunately the seqinr library provides a function called GC() that takes a single DNA sequence and returns the GC content. Please write the code to calculate the GC content of each gene and store it in the vector gc_genes. If everything is right you should be able to plot the following figure:
Most of the genes have similar GC content. We want to know which gene has the highest GC content and which one has the lowest. Please write the code to find each one of these genes.
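The pattern "apply a function to every gene, then look for the extremes" can be sketched in base R on toy data. The three sequences and the helper gc_fraction() below are invented for illustration; gc_fraction() is a hand-written stand-in that spells out the GC content formula, and in the exercise seqinr's GC() applied to the genes you loaded should play the same role.

```r
# Toy sequences, invented for illustration; in the exercise these would be
# the genes loaded from the FASTA file.
toy_genes <- list(
  gene_A = c("a", "t", "g", "g", "c", "g", "t", "a", "a"),
  gene_B = c("a", "t", "g", "a", "a", "a", "t", "t", "t", "t", "a", "g"),
  gene_C = c("g", "g", "c", "c", "a", "t", "g", "c")
)

# Hypothetical stand-in for seqinr's GC(): (G + C) / (A + C + G + T)
gc_fraction <- function(dna) {
  dna <- tolower(dna)
  sum(dna %in% c("g", "c")) / sum(dna %in% c("a", "c", "g", "t"))
}

# sapply() applies the function to every gene and keeps the gene names
gc_toy <- sapply(toy_genes, gc_fraction)

# which.max() and which.min() return the position of the extreme value;
# indexing the names with that position gives the gene identifier
highest <- names(gc_toy)[which.max(gc_toy)]
lowest  <- names(gc_toy)[which.min(gc_toy)]
```

Keeping the result as a named vector means the answer is a gene name rather than just a number.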
All this cat and mouse game has been bad for tourism. To reverse this situation Don Vito, the Mayor of the city, wants to hold an art exhibition to show how computers can handle cats and mice. He asks you to write a program to draw a mouse like the one in the figure. (Note: drawing the cat is homework.) You can use Turtle graphics or the normal plot() function.
If you use Turtle graphics, remember that you can move the turtle without drawing using the command turtle_setpos(x, y), and you can choose any angle using turtle_setangle(angle). For your convenience, you can also use the functions turtle_polygon(N, D) and turtle_circle(D):
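turtle_polygon <- function(N, D) {
  # Side length of a regular N-sided polygon inscribed in a circle of diameter D
  side <- D * sinpi(1/N)
  # turtle_do() runs the block with the turtle hidden, which is faster
  turtle_do({
    # Turn from the direction of the center to the direction of the first side
    turtle_left(90-180/N)
    for(x in 1:N) {
      turtle_forward(side)
      turtle_right(360/N)
    }
  })
}
turtle_circle <- function(D) {
  # A polygon with 180 sides looks like a circle
  turtle_polygon(180, D)
}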
In both functions the parameter D indicates the diameter of the circle, or of the circle that passes through the corners of the polygon. When you use turtle_polygon() the parameter N indicates the number of sides of the polygon. For example, turtle_polygon(3, D) draws a triangle. The center of the polygon is always in the direction where the turtle was looking.
The mouse body has diameter 36, the head diameter is 20 and the ears have diameter 10. The eyes have diameter 4 and the nose size is 2.5.
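If you prefer the normal plot() function, a circle of a given diameter can be drawn as a polygon with many corners. The sketch below is only a starting point: the helper names circle_xy() and draw_body_and_head() are made up, and the position of the head is a guess, since the figure is not reproduced here. Only the diameters 36 and 20 come from the text.

```r
# Hypothetical helper: points on a circle of diameter D centred at (cx, cy)
circle_xy <- function(cx, cy, D, n = 180) {
  theta <- seq(0, 2 * pi, length.out = n + 1)
  list(x = cx + D / 2 * cos(theta), y = cy + D / 2 * sin(theta))
}

# Hypothetical example: draws only the body and the head
draw_body_and_head <- function() {
  # Empty canvas; asp = 1 keeps the same scale on both axes so circles look round
  plot(0, 0, type = "n", xlim = c(-30, 40), ylim = c(-30, 30), asp = 1,
       xlab = "", ylab = "", axes = FALSE)
  body <- circle_xy(0, 0, D = 36)    # body diameter from the text
  head <- circle_xy(28, 0, D = 20)   # head diameter from the text; centre is a guess
  polygon(body$x, body$y)
  polygon(head$x, head$y)
}

# Draw only when running in an interactive session
if (interactive()) draw_body_and_head()
```

With the head centred at 28 the two circles just touch, because 28 is the sum of the two radii (18 and 10). Ears, eyes and nose can be added in the same way.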
When you get home and have more time, also write the code to draw the cat.
This story is fictional. If you want to know the real story, ask Google about “Lotka Volterra”.
In the output of cat_and_mouse() we can find the minimum size of the mouse population, given birth_rate, catch_rate and death_rate. It is natural to ask how small the mouse population will be depending on the different rates. For example, you can plot min(ans$mice) versus catch_rate, another plot depending on birth_rate, etc.
You can also see if increasing the number of cats will reduce the number of mice in the long term. For that you have to modify cat_and_mouse() and add a cats_ini input.
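A possible way to set up both experiments is sketched below. It is an illustration under assumptions, not the expected answer: the update rule is inferred from the story, the default cats_ini = 3 is a convenience choice, and the range of catch rates is arbitrary.

```r
# Variant of the simulation with the initial number of cats as an input.
# The update rule is an assumption inferred from the story.
cat_and_mouse <- function(N, birth_rate, catch_rate, death_rate, cats_ini = 3) {
  mice <- d_mice <- cats <- d_cats <- numeric(N)
  mice[1] <- 1
  cats[1] <- cats_ini
  for (i in seq_len(N)[-1]) {   # days 2..N
    caught <- catch_rate * cats[i - 1] * mice[i - 1]
    d_mice[i] <- birth_rate * mice[i - 1] - caught
    d_cats[i] <- caught - death_rate * cats[i - 1]
    mice[i] <- mice[i - 1] + d_mice[i]
    cats[i] <- cats[i - 1] + d_cats[i]
  }
  data.frame(mice, d_mice, cats, d_cats)
}

# Smallest mouse population for several catch rates (values chosen arbitrarily)
catch_rates <- seq(0.5, 3, by = 0.5) / 100
min_mice <- sapply(catch_rates, function(r) {
  ans <- cat_and_mouse(N = 2000, birth_rate = 1.3/100,
                       catch_rate = r, death_rate = 0.7/100)
  min(ans$mice)
})

# Show the result when running in an interactive session
if (interactive()) {
  plot(catch_rates, min_mice, type = "b",
       xlab = "catch_rate", ylab = "Minimum mouse population")
}
```

The same sapply() pattern works for birth_rate, death_rate or cats_ini: change the vector of values and the argument that receives them.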
In the analysis of chromosome 19 we can see that most of the genes have above-average GC content in the first part of the chromosome, then below average, then above again. That is, there are three regions in the chromosome. Can you find the limits of these regions? In other words, can you find which genes are on the border of each region?
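One possible approach, offered as a hint and not necessarily the intended solution, is to accumulate the difference between each gene's GC content and the average. The running total rises while genes are above average and falls while they are below, so its peak and its valley mark the borders. The vector toy_gc below is invented for illustration.

```r
# Toy GC values, invented for illustration: 5 genes above average,
# then 6 below, then 4 above again
toy_gc <- c(rep(0.55, 5), rep(0.40, 6), rep(0.55, 4))

# Running total of the deviations from the average
walk <- cumsum(toy_gc - mean(toy_gc))

# The total peaks at the last gene of the first region
# and bottoms out at the last gene of the second region
border_1 <- which.max(walk)
border_2 <- which.min(walk)
```

On real data the GC values are noisy, so it helps to plot the running total and check that the peak and the valley are clearly visible before trusting the two positions.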
In the drawing contest, can you draw the cat?
Remember: The more exercises you do, the more chances you have to pass the course.