We have reads that are closely related to a reference genome
For example, messenger RNA
Or DNA from a new individual from the same species
Or DNA from a species in the same genus
Or reads that we want to map to an assembled genome
The answer depends on the case
There are several tools for the same goal
Two tools that are popular today are:
They have a similar philosophy
Can you find others?
bwa
First, it makes an index of the genome
Then it aligns the reads to the genome
We use the extension .sam. These are large text files
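The two bwa steps above (index the genome, then align the reads) could be sketched in Python roughly as follows. This is only an illustration: the file names are hypothetical, the helper functions are made up for this example, and it assumes the "bwa mem" algorithm, which the text does not specify. It also assumes bwa is installed and on the PATH.

```python
import subprocess


def bwa_commands(genome_fasta, reads_fastq):
    """Build the two bwa command lines: index the genome, then align the reads."""
    # Step 1: build the index files next to the genome FASTA file.
    index_cmd = ["bwa", "index", genome_fasta]
    # Step 2: align the reads; "mem" is an assumed choice of algorithm.
    align_cmd = ["bwa", "mem", genome_fasta, reads_fastq]
    return index_cmd, align_cmd


def run_bwa(genome_fasta, reads_fastq, sam_path):
    """Run both steps and save the alignment in a .sam file."""
    index_cmd, align_cmd = bwa_commands(genome_fasta, reads_fastq)
    subprocess.run(index_cmd, check=True)
    # bwa mem writes the alignment to standard output, so we redirect it.
    with open(sam_path, "w") as sam_file:
        subprocess.run(align_cmd, stdout=sam_file, check=True)
```

A call such as run_bwa("genome.fa", "reads.fq", "aligned.sam") would then leave the alignment in aligned.sam. The index only needs to be built once per genome.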
Sometimes .sam files are encoded as smaller binary .bam files
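Because a .sam file is plain text, each alignment line can be read with ordinary string handling. The sketch below splits one line into the 11 mandatory tab-separated columns of the SAM format. The example read is invented, and the helper names are hypothetical. The last function builds a samtools command for the .sam to .bam conversion; samtools is an assumption here, since the text does not name a conversion tool.

```python
# Names of the 11 mandatory columns of a SAM alignment line.
SAM_FIELDS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
              "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]
INT_FIELDS = {"FLAG", "POS", "MAPQ", "PNEXT", "TLEN"}


def parse_sam_line(line):
    """Parse one alignment line (not a header line starting with '@')."""
    columns = line.rstrip("\n").split("\t")
    record = dict(zip(SAM_FIELDS, columns[:11]))
    for name in INT_FIELDS:
        record[name] = int(record[name])
    # Bit 4 of FLAG is set when the read could not be mapped.
    record["is_mapped"] = (record["FLAG"] & 4) == 0
    return record


def sam_to_bam_command(sam_path, bam_path):
    """Build a samtools command that writes the binary .bam version."""
    return ["samtools", "view", "-b", "-o", bam_path, sam_path]


# Invented example: a 5-base read mapped at position 100 of "chr1".
example = "read1\t0\tchr1\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII"
record = parse_sam_line(example)
print(record["RNAME"], record["POS"], record["is_mapped"])
```

The command list returned by sam_to_bam_command can be passed to subprocess.run when samtools is available.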