This course is an introduction to Data Science. The goal is to learn the basic tools that allow any scientist to handle structured data and extract valuable scientific information from it.
This page will be updated during the semester. Please check it regularly.
The course forum is at https://groups.google.com/d/forum/iu-cmb. You can also participate by sending an email to iu-cmb@googlegroups.com.
Slides used in classes
These slides cover the subjects that were evaluated in the Midterm Exam. New slides will be made available at the last minute, so please take good notes in class to improve your learning.
- How this course works. Learning strategies. [Slides]
- Why “Computing in Molecular Biology”. What is a computer. [Slides]
- Information in computers. How different kinds of data are represented in the computer. What a structured document is. Why structured documents are important and useful. [Slides]
- Rmarkdown. [Slides]
- Using R and RStudio. Basic usage of RStudio. Introduction to R. Data Types. Vectors. [Slides]
- Basic objects in R. Vectors, matrices, and how to look inside them. [Slides]
- Lists and Data frames. How to look inside objects and change things. Indexing with positive numbers, negative numbers, logical vectors, or names. [Slides]
- Telling stories. Descriptive statistics. Functions and asking for help. An aside on digital signatures. [Slides]
- Essential descriptive statistics. Median, mean, quartiles, quantiles. [Slides]
- Data Visualization. Telling stories with pictures. “A picture is worth a thousand words”. Plots, barplots, histograms. Making “nice” drawings. Adding points and lines. [Slides]
- Practice in Rmarkdown. [Rmd file], [Final Document]
- Interacting with the real world. Reading text files. Formulas. Scatter plots. A-B-lines. [Slides]
- Exercises. Making nicer plots. [Slides], [Rmd file], [Final Document]
- Some summary statistics. Row and Column averages. [Slides]
- Graphics depend on the type of data. Numeric vs. factor and factor vs. numeric. Boxplots. [Slides]
- Homework about History of Science. Due on Monday 19th December. [Slides]
- The “App Store”. Get more functions with packages. Get more packages. Exercises with the knitr package. [Slides], [Rmd file], [Final Document]
- Analyzing earthquake data. Using the knitr package to make nicer documents. Maps. Energy. [Slides]
- Exercises and homework analyzing earthquake magnitude and energy. Homework due Monday 19th December.