Class 23.1: Summary of the course

Bioinformatics

Andrés Aravena

December 30, 2021

What did we learn?

Lander-Waterman formula

To estimate the number of contigs in the assembly

For a genome of length \(G\), with \(N\) reads of average length \(L\) and an overlap threshold of \(T\), the expected number of contigs in an assembly is \[N\exp\left(-\frac{(L-T)N}{G}\right)\] (More precisely, this is the expected number of gaps between contigs.)
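As a quick numerical illustration, the formula can be evaluated directly. The sketch below is only an example: the function name `expected_contigs` and the genome size, read length, and overlap threshold are hypothetical values chosen for illustration, not data from the course.

```python
import math


def expected_contigs(genome_length, num_reads, read_length, min_overlap):
    """Lander-Waterman estimate: N * exp(-(L - T) * N / G).

    genome_length -- G, genome length in base pairs
    num_reads     -- N, number of reads
    read_length   -- L, average read length
    min_overlap   -- T, overlap threshold needed to join two reads
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    exponent = -(read_length - min_overlap) * num_reads / genome_length
    return num_reads * math.exp(exponent)


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the trend.
    G, L, T = 1_000_000, 100, 20
    for N in (10_000, 50_000, 100_000):
        coverage = N * L / G  # average depth of coverage
        contigs = expected_contigs(G, N, L, T)
        print(f"N={N:>7}  coverage={coverage:5.1f}x  expected contigs={contigs:10.1f}")
```

With these assumed values, the estimate drops quickly as the number of reads (and therefore the coverage) grows, which is the behavior the formula is meant to capture.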