Staying at home for long periods can affect your muscles. We need to exercise to keep our muscles in shape, and the same is true of our brain. Here are some exercises to keep your brain in shape.
You can, and in fact should, discuss the homework in the forum. Solving problems is collective work. But answers should be individual, using the official template for answers. This way you also practice for the midterm exam, and for real life.
Recycled questions
In Homework 5 we had two “long term” optional questions. We had extra time to think about them. Now it is time to answer them.
I will also recycle the advice I gave earlier.
- Be sure to understand the question. If you do not understand, ask in the forum. Explain what you understand and what you do not understand.
- This is LEGO. Identify all the pieces and understand how they connect.
- Write the name of each variable. That is, the name of each input, output, and auxiliary variable.
- Write down the structure of each variable. Is it a vector, a list, a data frame, etc.?
- Write down the type of each variable. Is it numeric, logical, character, etc.?
- You can only use the inputs and auxiliary variables that you create.
- If you did not create it, do not change it.
- The output should change if the input changes. Check that each input is used somewhere.
- Understand what you have and what you want to have. It is like a biochemistry process. How do you get Leucine from ATP and water?
- Sometimes it is useful to work backwards. Start with what you want to have, and decompose it into simpler terms. For example, to get the GC-skew, you need to know `nG` and `nC`. Then you just need to find `nG` and `nC`.
- Look at previous examples and recycle them, by using the old functions inside the new function.
- If you cannot recycle, then you can adapt the old code for the new case.
- It is always wise to return to our very first class: How to Solve It.
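The "work backwards" advice can be sketched with the GC-skew example. This is only an illustration, assuming the common definition of GC-skew as (nG - nC) / (nG + nC); the helper name `count_letter()` is hypothetical and not part of the course material.

```r
# Hypothetical helper: counts how many times `letter` appears in `dna`.
count_letter <- function(dna, letter) {
  sum(dna == letter)
}

# Working backwards: GC-skew needs nG and nC, so we compute those first.
# Assumes the definition (nG - nC) / (nG + nC).
gc_skew <- function(dna) {
  nG <- count_letter(dna, "G")
  nC <- count_letter(dna, "C")
  (nG - nC) / (nG + nC)
}

dna <- c("G", "G", "C", "A", "T", "G")
gc_skew(dna)  # nG is 3 and nC is 1, so the result is 0.5
```

Note how the old function is recycled inside the new one, and how the input `dna` is used but never changed.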
1. Algorithm design
In many important cases we have a vector `x` with growing values. That is, each value is bigger than or equal to the previous one, so `x[i+1] >= x[i]` for all values of the index `i`. It is easy to see that the position of the minimum value has to be 1. We also know that the position of the maximum value is the last position. What about the position of the half value?
The half value is the average of the minimum and the maximum. For example, if `x` is the vector `c(1, 4, 4, 6, 10, 15)` then the half value is `(1+15)/2`, that is 8.
Some people get confused between the values of a vector and the positions in a vector. The index indicates the position. Do not confuse them. This is important.
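As a small illustration of the difference between values and positions, here is a sketch using the example vector and only base R functions. It is not an answer to the question, just a reminder of which expressions give values and which give indices.

```r
x <- c(1, 4, 4, 6, 10, 15)

max(x)        # a value: the largest element, 15
which.max(x)  # a position: the index where that value is stored, 6
x[5]          # the value stored at position 5, which is 10
length(x)     # the last position of the vector, 6
```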
The position of the half value of the vector `x` is the index of the first value that is equal to or bigger than the half value of `x`. In the example, the position of the half value is 5, since `x[5]` is the smallest value that is bigger than or equal to 8.
Please write a function called `position_of_half()`, with one input called `x`. The function must return a single number, which is the index of the smallest value in `x` that is bigger than or equal to the average of the minimum and maximum of `x`.
You can test your functions with the following code.
x <- 1:9
position_of_half(x)
position_of_half(x + 20)
position_of_half(x * x)
position_of_half(sqrt(x))
The answers should be 5, 5, 7, 4, respectively.
2. Merge two sorted vectors
Please write a function called `vector_merge(x, y)` that receives two sorted vectors `x` and `y` and returns a new vector with the elements of `x` and `y` together, sorted. The output vector has size `length(x)+length(y)`.
You must assume that each of the input vectors is already sorted. In your code you have to use three indices, `i`, `j`, and `k`, to point into `x`, `y`, and the output vector `answer`, respectively.
On each step you have to compare `x[i]` and `y[j]`. If `x[i] < y[j]` then you make `answer[k] <- x[i]`, otherwise make `answer[k] <- y[j]`.
You have to increment `i` or `j`, and `k`, carefully. To test your function, you can use this code:
x <- c("a", "d", "e", "h", "i", "k", "m", "s", "t", "u", "v", "w", "z")
y <- c("b", "c", "f", "g", "j", "l", "n", "o", "p", "q", "r", "x", "y")
vector_merge(x, y)
The output must be a sorted alphabet.
"a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m"
"n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"
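If you prefer an automatic check over comparing the letters by eye, one possible sketch is shown here. It relies on R's built-in constant `letters`, which holds the sorted lowercase alphabet. The helper name `check_merge()` is hypothetical, and the candidate used in the example is deliberately wrong, only to show how the check is called.

```r
# Hypothetical helper: takes a candidate merge function and checks
# whether it produces the sorted alphabet for the test vectors.
check_merge <- function(candidate) {
  x <- c("a", "d", "e", "h", "i", "k", "m", "s", "t", "u", "v", "w", "z")
  y <- c("b", "c", "f", "g", "j", "l", "n", "o", "p", "q", "r", "x", "y")
  identical(candidate(x, y), letters)
}

# A wrong candidate: it only glues the vectors together, without merging.
just_concatenate <- function(x, y) c(x, y)
check_merge(just_concatenate)  # FALSE, the result is not sorted
```

Once your own function is ready, `check_merge(vector_merge)` should return `TRUE`.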
New questions
These questions are easier than the previous ones. You may want to start with these.
3. Transcription
Write a function called `transcribe()` that takes a DNA sequence (a character vector) and returns the corresponding RNA sequence.
In other words, if you have
dna <- c("T","C","A","G","A","T","T","A","C")
then we transcribe it with `transcribe(dna)`. The result should be
"U" "C" "A" "G" "A" "U" "U" "A" "C"
4. Codons
Write a function called `codons()` that takes an RNA sequence (a character vector) and returns a list of vectors. Each vector in the list must have exactly 3 letters, representing one codon.
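To make the expected structure concrete, here is a sketch of what the output could look like for a short RNA sequence. It assumes that codons are read from the first letter, without overlap, which the question does not state explicitly. The list is written by hand here; it is not produced by `codons()`.

```r
rna <- c("U", "C", "A", "G", "A", "U", "U", "A", "C")

# Hand-written example of the expected output: a list of 3-letter vectors.
expected <- list(
  c("U", "C", "A"),
  c("G", "A", "U"),
  c("U", "A", "C")
)

length(expected)   # number of codons in the list, 3
lengths(expected)  # number of letters in each codon: 3 3 3
```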
5. Translation
Write a function called `translate_codon()` that takes a codon (a vector with 3 letters) and returns a single letter representing one amino acid. For your convenience, here you have the correspondence between codons and amino acids.
What are the “*”?
Codon | Letter | Codon | Letter | Codon | Letter | Codon | Letter |
---|---|---|---|---|---|---|---|
aaa | K | caa | Q | gaa | E | taa | * |
aac | N | cac | H | gac | D | tac | Y |
aag | K | cag | Q | gag | E | tag | * |
aat | N | cat | H | gat | D | tat | Y |
aca | T | cca | P | gca | A | tca | S |
acc | T | ccc | P | gcc | A | tcc | S |
acg | T | ccg | P | gcg | A | tcg | S |
act | T | cct | P | gct | A | tct | S |
aga | R | cga | R | gga | G | tga | * |
agc | S | cgc | R | ggc | G | tgc | C |
agg | R | cgg | R | ggg | G | tgg | W |
agt | S | cgt | R | ggt | G | tgt | C |
ata | I | cta | L | gta | V | tta | L |
atc | I | ctc | L | gtc | V | ttc | F |
atg | M | ctg | L | gtg | V | ttg | L |
att | I | ctt | L | gtt | V | ttt | F |
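One possible way to store such a table in R is a named character vector, where the names are the codons and the values are the letters. The sketch below copies only four entries from the table above, so it is not a complete answer, and the variable names `code_subset` and `key` are hypothetical. It also assumes lowercase DNA letters, as written in the table.

```r
# Partial lookup table: only four entries copied from the table above.
code_subset <- c(aaa = "K", atg = "M", tgg = "W", taa = "*")

# A codon stored as a vector with 3 letters.
codon <- c("a", "t", "g")

# Join the three letters into one string to use it as a name.
key <- paste(codon, collapse = "")
key                 # "atg"
code_subset[[key]]  # "M"
```

If your codons come from `transcribe()`, they use uppercase letters and “U” instead of “T”, so you will need to decide how to match them to the table.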
6. Reverse complement
This function can be decomposed into two parts:
- `reverse()`, taking a vector and returning it backwards
- `complement()`, replacing each letter in the vector by its complement
Write a function called `reverse_complement()` that takes a DNA sequence (a character vector) and returns the DNA sequence of the opposite strand. Both sequences are represented from 5’ to 3’.
In other words, if you have
dna <- c("T","C","A","G","A","T","T","A","C")
then applying `reverse_complement(dna)` we should get
"G" "T" "A" "A" "T" "C" "T" "G" "A"
The complement of “A” is “T”, the complement of “C” is “G”, and vice versa.
Stay safe, work at home, do the homework.