what it is and what it isn’t
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, dev.args = list(bg = "transparent"),
                      fig.align = "center", dev = "png", cache = FALSE)
```
for this course
Measuring gene expression
Mostly about statistics
Sayres, et al. “Bioinformatics Core Competencies for Undergraduate Life Sciences Education.”
PLoS ONE 13, no. 6 (2018): 1–20. https://doi.org/10.1371/journal.pone.0196878.
Understand the role of computation and data mining in hypothesis-driven processes within the life sciences
Understand computational concepts used in bioinformatics
Know statistical concepts used in bioinformatics
Know how to access genomic data
Be able to use bioinformatics tools to analyze genomic data
Know how to access gene expression data
Be able to use bioinformatics tools to analyze gene expression data
Know how to access proteomic data
Be able to use bioinformatics tools to examine protein structure and function
Know how to access metabolomic and systems biology data
Be able to use bioinformatics tools to examine the flow of molecules within pathways/networks
Be able to use bioinformatics tools to examine metagenomics data
Know how to write short computer programs as part of the scientific discovery process
Be able to use software packages to manipulate and analyze bioinformatics data
Operate in a variety of computational environments to manipulate and analyze bioinformatics data
We focus on how to understand results
about bioinformatics
```{r fig.width=4.5, fig.height=5.5}
library(readr)
library(ggplot2)

# Read the tab-separated cost table; Date holds month-year strings ("%b-%y")
sequencingcostdata <- read_delim("../../../static/sequencingcostdata.txt", "\t",
    escape_double = FALSE,
    col_types = cols(Date = col_date(format = "%b-%y")),
    trim_ws = TRUE)

# Cost per genome over time on a log scale, with a ribbon shaded down to USD 1000
qplot(x = Date, y = `Cost per Genome`, data = sequencingcostdata,
      log = "y", colour = "red") +
  geom_ribbon(fill = "red", alpha = 0.2,
              aes(ymin = 1e3, ymax = `Cost per Genome`)) +
  theme(legend.position = "none",
        plot.background = element_rect(fill = "transparent", colour = NA))
```
In 2001, the cost of sequencing the first human genome was about USD $10^8$ (100 million)
Today you can have your own genome sequenced for about USD 1000
The problem is no longer how to do the experiment
Instead, it is how to make sense of the results
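To put the drop in cost in perspective, here is a minimal sketch of the arithmetic. It assumes a cost of roughly 10^8 USD in 2001 and 10^3 USD today; the variables `cost_2001` and `cost_today` are rounded, illustrative values and are not read from the data file behind the plot.

```r
# Rounded, illustrative costs in USD (assumptions, not taken from the data file)
cost_2001  <- 1e8   # approximate cost per genome in 2001
cost_today <- 1e3   # approximate cost per genome today

# How many times cheaper sequencing has become
fold_reduction <- cost_2001 / cost_today

# How many times the cost had to halve to produce that reduction
halvings <- log2(fold_reduction)

cat("Fold reduction:", format(fold_reduction, scientific = FALSE, big.mark = ","), "\n")
cat("Equivalent number of halvings:", round(halvings, 1), "\n")
```

Under these assumptions the cost fell about 100,000-fold, which corresponds to between 16 and 17 successive halvings.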
According to Microsoft
There are three large data repositories
These three databases exchange all sequence data with each other
but they may structure it differently
All data is available for free
Data from research paid for with public money must be uploaded here
Good journals also require authors to upload their data