In class 4 we learned how to calculate the GC-content of each gene. Now we will calculate GC skew, defined as
\[\frac{G-C}{G+C}\]
Draw a scatter plot of the GC content and GC skew of each E. coli gene. Put GC content on the horizontal axis and GC skew on the vertical axis.
- Write the function calculate_GC_skew() and use sapply() to apply it over each gene.
- Do the same for GC content.
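As a starting point, here is a minimal sketch of how these two steps could look. It assumes each gene is stored as a vector of single letters inside a list. The list `genes` below is made-up toy data, not real E. coli genes, and `calculate_GC_content()` is only a suggested name for the function from class 4.

```r
# Toy stand-in for the E. coli genes: a named list of character vectors,
# one letter per element (hypothetical data, replace with your own genes).
genes <- list(
  gene1 = strsplit("ATGGCGTGCAAGGGT", "")[[1]],
  gene2 = strsplit("ATGCCCTTACCGCAT", "")[[1]],
  gene3 = strsplit("ATGAAATTTGGCTAA", "")[[1]]
)

# GC skew: (G - C) / (G + C)
calculate_GC_skew <- function(sequence) {
  sequence <- toupper(sequence)      # accept lower-case letters too
  n_g <- sum(sequence == "G")
  n_c <- sum(sequence == "C")
  (n_g - n_c) / (n_g + n_c)          # NaN if the gene has no G and no C
}

# GC content: fraction of letters that are G or C
calculate_GC_content <- function(sequence) {
  sequence <- toupper(sequence)
  sum(sequence %in% c("G", "C")) / length(sequence)
}

# Apply each function over every gene; the result is a named numeric vector
gc_content <- sapply(genes, calculate_GC_content)
gc_skew <- sapply(genes, calculate_GC_skew)

# GC content on the horizontal axis, GC skew on the vertical axis
plot(gc_content, gc_skew, xlab = "GC content", ylab = "GC skew")
```

With the real genes, only the definition of `genes` should need to change.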
Calculate the AT skew, defined in the same way as the GC skew but using the counts of A and T. Draw a scatter plot of GC skew and AT skew.
- You should create a new function calculate_AT_skew() and use sapply() again.
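The AT skew can be sketched in the same way, taking it to be (A − T)/(A + T) by analogy with the GC skew formula. As before, `genes` is made-up toy data and each gene is assumed to be a vector of single letters.

```r
# Toy stand-in for the E. coli genes (hypothetical data).
genes <- list(
  gene1 = strsplit("ATGGCGTGCAAGGGT", "")[[1]],
  gene2 = strsplit("ATGCCCTTACCGCAT", "")[[1]],
  gene3 = strsplit("ATGAAATTTGGCTAA", "")[[1]]
)

# GC skew: (G - C) / (G + C)
calculate_GC_skew <- function(sequence) {
  sequence <- toupper(sequence)
  n_g <- sum(sequence == "G")
  n_c <- sum(sequence == "C")
  (n_g - n_c) / (n_g + n_c)
}

# AT skew: (A - T) / (A + T), assumed by analogy with the GC skew
calculate_AT_skew <- function(sequence) {
  sequence <- toupper(sequence)
  n_a <- sum(sequence == "A")
  n_t <- sum(sequence == "T")
  (n_a - n_t) / (n_a + n_t)          # NaN if the gene has no A and no T
}

gc_skew <- sapply(genes, calculate_GC_skew)
at_skew <- sapply(genes, calculate_AT_skew)

plot(gc_skew, at_skew, xlab = "GC skew", ylab = "AT skew")
```

The two skew functions differ only in which letters they count, so you may prefer to write one general function that takes the two letters as arguments.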
The DNA sequences in FASTA files represent only one of the two strands. Often we need to know the other strand.
To do so we need to calculate the reverse-complement sequence. That is, a sequence where the first letter is the complement of the last letter in the original sequence, the second letter is the complement of the second-to-last letter, and so on.
The complement of “A” is “T”, and vice versa. The complement of “C” is “G”, and the other way around.
Please write, in English, a detailed plan to get the reverse complement of a DNA sequence.
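Once your plan is written, you can compare it with the following sketch, which is only one possible way to do it. It assumes the sequence is a vector of single letters. The name `reverse_complement()` is a suggestion, not something defined in class.

```r
reverse_complement <- function(sequence) {
  # Step 1: reverse the order of the letters
  reversed <- rev(sequence)
  # Step 2: replace each letter by its complement (A <-> T, C <-> G),
  # keeping upper-case and lower-case letters as they are
  chartr("ACGTacgt", "TGCAtgca", reversed)
}

# Small example: split the text into letters, then join the result back
original <- strsplit("AACG", "")[[1]]
result <- paste(reverse_complement(original), collapse = "")
print(result)   # "CGTT"
```

The two steps can be done in either order, since complementing each letter does not depend on its position.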