This course is an introduction to genomics, with emphasis on metagenomics. To help you prepare for the final exam, here are the keywords and main ideas we have discussed in class.
DNA Sequencing
- What is a FASTQ file? How does it differ from a FASTA file?
- What is a DNA read?
- What is the quality score of a nucleotide?
- What is the length of a DNA read?
- What is the batch size of a sequencing machine?
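As a study aid for the quality-score question, here is a minimal sketch of how the quality line of a FASTQ record can be decoded. It assumes the common Phred+33 encoding (older files may use a different offset), and the record shown is made up for illustration.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    Assumes the Phred+33 encoding unless another offset is given.
    """
    return [ord(ch) - offset for ch in quality_line]


def error_probability(q):
    """Probability that a base call with Phred score q is wrong: 10^(-q/10)."""
    return 10 ** (-q / 10)


# Hypothetical four-line FASTQ record: header, bases, separator, qualities.
record = ["@read1", "GATTACA", "+", "IIII5+!"]

scores = phred_scores(record[3])
print(scores)  # [40, 40, 40, 40, 20, 10, 0]
print([round(error_probability(q), 4) for q in scores])
```

A score of 20 corresponds to a 1% chance that the base is wrong, and a score of 40 to a 0.01% chance.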
Sequencing technologies
- For each of the common sequencing technologies, please describe:
  - typical lengths of DNA reads
  - typical batch size
  - typical cost per batch
- What is shotgun sequencing? How do you do it?
- What is targeted sequencing? How do you do it?
Pairwise Alignment
- What is the difference between a global, a semiglobal and a local alignment?
- What is the score of an alignment (not necessarily the optimal one)?
- How do you calculate it?
- What is an optimal alignment?
- Describe the basic ideas of the method used to find the optimal alignment
- What is the name of the method?
- What is the E-value of an alignment?
- How are the E-value and the score related?
- What is a PAM matrix?
- How were PAM matrices evaluated?
- What is a BLOSUM matrix?
- How are BLOSUM and PAM matrices related?
- Why do gap costs have two values: an opening cost and an extension cost?
- Why do we usually not use the Smith-Waterman method to find the optimal local alignment?
- What is a heuristic?
- What is BLAST?
- What is the difference between BLAST and Smith-Waterman?
- Are Smith-Waterman and BLAST results always equal? Why?
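To make the questions about alignment scores and gap costs concrete, here is a minimal sketch that scores an alignment that is already given (it does not search for the optimal one). The match, mismatch, and gap values are arbitrary assumptions, and so is the convention that the first position of a gap pays the opening cost while each further position pays the extension cost.

```python
def alignment_score(row_a, row_b, match=1, mismatch=-1,
                    gap_open=-5, gap_extend=-1):
    """Score a given pairwise alignment; '-' marks a gap position.

    Assumed convention: the first column of a gap costs gap_open,
    each following column of the same gap costs gap_extend.
    """
    if len(row_a) != len(row_b):
        raise ValueError("both rows of an alignment must have equal length")
    score = 0
    prev_gap = None  # row that carried a gap in the previous column
    for a, b in zip(row_a, row_b):
        if a == "-" and b == "-":
            raise ValueError("a column cannot contain two gaps")
        if a == "-" or b == "-":
            gap_in = "a" if a == "-" else "b"
            score += gap_extend if gap_in == prev_gap else gap_open
            prev_gap = gap_in
        else:
            score += match if a == b else mismatch
            prev_gap = None
    return score


print(alignment_score("GATTACA", "GATTACA"))  # 7
print(alignment_score("GATTACA", "GACTACA"))  # 5
print(alignment_score("GATTACA", "GA--ACA"))  # -1
```

With these values one gap of length two costs -6, while two separate gaps of length one would cost -10, which is the reason for having two gap costs.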
Genome Assembly
- methods
  - Ab initio
    - overlap - layout - consensus
    - De Bruijn graph
  - with template
- definitions
  - read
  - contig
  - assembly
  - scaffold
  - coverage
  - depth
- Lander-Waterman
- quality indices: N50
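Two of the quantities above are easy to compute by hand, and the following sketch shows one way to do it. It uses the usual definition of N50 (the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the assembly size) and the average depth c = N * L / G. The contig lengths and read counts are made up for illustration.

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L hold at least half the assembly."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("contig_lengths must not be empty")


def mean_depth(n_reads, read_length, genome_size):
    """Average depth c = N * L / G: expected number of reads over each base."""
    return n_reads * read_length / genome_size


# Hypothetical assembly with six contigs (lengths in bp).
print(n50([80, 70, 50, 40, 30, 20]))       # 70
# Hypothetical run: 100,000 reads of 150 bp on a 5 Mbp genome.
print(mean_depth(100_000, 150, 5_000_000))  # 3.0
```

In the first example the total is 290 bp, and the two longest contigs already add up to 150 bp, which is more than half, so N50 is 70.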
Primer design
- why melting temperature is relevant
- how do we calculate melting temperature
  - composition based
  - nearest neighbors
- where should we look for primers
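For the composition-based calculation, one widely used rule of thumb for short primers is the Wallace rule, Tm = 2 * (A + T) + 4 * (G + C) in degrees Celsius. The sketch below assumes that rule, which may not be the exact formula used in class, and a made-up primer sequence. The nearest-neighbor method needs a table of thermodynamic parameters and is not shown.

```python
def wallace_tm(primer):
    """Approximate melting temperature (degrees C) from base composition.

    Wallace rule: 2 degrees per A or T, 4 degrees per G or C.
    Only a rough estimate, intended for short oligonucleotides.
    """
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


# Hypothetical 16-nt primer with 8 A/T and 8 G/C bases.
print(wallace_tm("ATGCATGCATGCATGC"))  # 48
```

G and C count double because a G-C pair forms three hydrogen bonds and an A-T pair only two, so GC-rich primers melt at higher temperatures.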
Multiple alignment
- cost of multiple alignments
- heuristics for multiple alignments
- dendrogram
- representing alignments with matrices
Alignment free methods
- k-mer counting
- Naive Bayes
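As a reminder of what k-mer counting means, here is a minimal sketch that counts every overlapping substring of length k in a sequence. The function name and the example sequence are made up for illustration.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count all overlapping substrings of length k in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


counts = count_kmers("ATATAT", 2)
print(counts["AT"], counts["TA"])  # 3 2
```

A sequence of length n contains n - k + 1 overlapping k-mers, so the counts here add up to 5. The resulting count vectors can be compared between sequences without aligning them.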
Finding Binding Sites
- what is a motif
- Position-specific scoring matrices (PSSM)
- logo representation
- how to find a binding site when you have a PSSM
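The last point can be illustrated with a short sketch: slide a window of the motif width along the sequence, add up the matrix entry for the base seen at each position, and keep the best-scoring window. The matrix below is a hypothetical log-odds PSSM of width 3 with made-up values, not one derived from real binding sites.

```python
# Hypothetical PSSM: one dictionary of base scores per motif position.
PSSM = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
]


def scan(sequence, pssm):
    """Return (start, score) for every window of motif width in the sequence."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(column[base] for column, base in zip(pssm, window))
        hits.append((start, score))
    return hits


def best_site(sequence, pssm):
    """Window with the highest score: the most likely binding site."""
    return max(scan(sequence, pssm), key=lambda hit: hit[1])


print(best_site("CCATGCC", PSSM))  # (2, 3.0)
```

In practice a score threshold is usually applied, so a sequence can have several candidate sites or none. Scanning the reverse complement strand is left out of this sketch.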
Finding Motifs
- Motif discovery vs. scanning
- Gibbs sampling methods
- Monte Carlo methods