This course is an introduction to genomics, with an emphasis on metagenomics. To help you prepare for the midterm exam, here are the keywords and main ideas we have discussed in class.
DNA Sequencing
- What is a FASTQ file? How does it differ from a FASTA file?
- What is a DNA read?
- What is the quality score of a nucleotide?
- What is the length of a DNA read?
- What is the batch size of a sequencing machine?
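As a study aid for the questions above, here is a minimal sketch of how a FASTQ record and its quality scores can be read. It assumes the common Phred+33 encoding (quality character code minus 33) and the usual four-line record layout. The function names and the example record are made up for illustration and do not come from the course material.

```python
def parse_fastq_record(lines):
    """Parse one four-line FASTQ record into (read_id, sequence, scores)."""
    header, sequence, separator, quality = [line.strip() for line in lines]
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality strings differ in length")
    # Assumption: Phred+33 encoding, so '!' means Q=0 and 'I' means Q=40.
    scores = [ord(ch) - 33 for ch in quality]
    return header[1:], sequence, scores


def error_probability(q):
    """Convert a Phred quality score Q into a base-calling error probability."""
    return 10 ** (-q / 10)


# Made-up example record: one read of length 7.
record = ["@read1", "GATTACA", "+", "IIIII5!"]
read_id, seq, quals = parse_fastq_record(record)
read_length = len(seq)
```

A FASTA record, by contrast, carries only a header line starting with `>` and the sequence, with no quality string.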
Sequencing technologies
- For each of the common sequencing technologies, please describe:
- typical lengths of DNA reads
- typical batch size
- typical cost per batch
- What is shotgun sequencing? How is it done?
- What is targeted sequencing? How is it done?
Pairwise Alignment
- What is the difference between a global, a semiglobal and a local alignment?
- What is the score of an alignment (not necessarily the optimal one)?
- How do you calculate it?
- What is an optimal alignment?
- Describe the basic ideas of the method used to find the optimal alignment
- What is the name of the method?
- What is the E-value of an alignment?
- How are the E-value and the score related?
- What is a PAM matrix?
- How were PAM matrices evaluated?
- What is a BLOSUM matrix?
- How are BLOSUM and PAM matrices related?
- Why do gap costs have two values: an opening cost and an extension cost?
- Why do we usually not use the Smith-Waterman method to find the optimal local alignment?
- What is a heuristic?
- What is BLAST?
- What is the difference between BLAST and Smith-Waterman?
- Are Smith-Waterman and BLAST results always the same? Why or why not?
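To make the scoring questions above concrete, here is a minimal sketch under stated assumptions. `score_alignment` scores an alignment that is already given, using one common convention for affine gaps (the first gap position pays `gap_open`, each following position pays `gap_extend`). `smith_waterman_score` finds the optimal local alignment score by dynamic programming, with a simple linear gap cost to keep it short. The function names and all scoring values are illustrative choices, not values from the course, and real tools would use a PAM or BLOSUM matrix for proteins.

```python
def score_alignment(row_a, row_b, match=2, mismatch=-1,
                    gap_open=-3, gap_extend=-1):
    """Score a given gapped alignment; '-' marks a gap position."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    total = 0
    prev_gap = None  # which row had a gap in the previous column
    for x, y in zip(row_a, row_b):
        if x == "-" and y == "-":
            raise ValueError("a column cannot be a gap in both rows")
        if x == "-" or y == "-":
            current = "a" if x == "-" else "b"
            # Extending an open gap is cheaper than opening a new one.
            total += gap_extend if prev_gap == current else gap_open
            prev_gap = current
        else:
            total += match if x == y else mismatch
            prev_gap = None
    return total


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Optimal local alignment score (linear gap cost, illustrative values)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # first row and column stay 0
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # The 0 lets a local alignment restart anywhere.
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,    # gap in b
                          H[i][j - 1] + gap)    # gap in a
            best = max(best, H[i][j])
    return best
```

The table `H` has one cell per pair of positions, so the work grows with the product of the two sequence lengths. That cost is the usual reason database searches rely on a heuristic such as BLAST, which is faster but is not guaranteed to find the optimal alignment.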