Many disciplines, including Molecular Biology and Genetics, have become increasingly data-driven.
Starting today, we will use R, a free program for data analysis
R is widely used by molecular biologists, and also by economists, psychologists and marketing specialists
Install R and RStudio on your computer
Then start RStudio
Then you will see a screen like this…
RStudio, like almost all serious programs, is controlled by the keyboard
The mouse can be used for some shortcuts, but the real deal is the keyboard
A goal of this course is to become comfortable with the keyboard
These tools are for people who read books and don’t watch TV
We use the keys `, ", {, }, [, ], and Tab a lot. Find them!
The keys in red are “dead keys”.
Press AltGr + , first, and then SPACE to get the symbol `
# : Hash. Used for comments
$ : Dollar. Used for column names
{ and } : Braces, curly brackets
[ and ] : Brackets, used for indices
` : Back tick. Used for code
' and " : single quote and double quote. Used for text
/ and \ : slash and backslash
R version 4.0.2 (2020-06-22) -- "Taking Off Again"
Copyright (C) 2020 The R Foundation for Statistical Computing
[…]
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
>
This > symbol is called the prompt
You do not write the > part. This is a message from the computer to you
You write after the prompt
verb
From “New Oxford American Dictionary”
>
(An interactive session)
>
and repeat
In RStudio you can press Tab to autocomplete what you are typing and get superpowers!
You can also repeat and edit previous commands using the arrow keys
You can clear the whole line using Escape
Write the number after the > prompt. Do not write the > itself
[1] 42
The grey part is what we write, the blue part is the computer’s response
Writing Numbers, with decimals
Most countries use , as the decimal separator
In the USA they use . to separate the integer and decimal parts
In general either convention works, but in R numbers are written with a dot .
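A minimal sketch of what this means at the R prompt. The names `price` and `two_numbers` are made up for this illustration. In R a comma separates arguments, so it cannot be used as a decimal separator.

```r
# Numbers are written with a dot as the decimal separator
price <- 3.75
price * 2

# A comma separates arguments: c(3,75) holds two numbers, not 3.75
two_numbers <- c(3,75)
length(two_numbers)
```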
Compare 520000000000 against 52000000000
Are they the same? Which one is bigger?
It is better to use exponential notation
\(52 \times 10^{10}\) versus \(52 \times 10^{9}\)
On the computer we write powers of 10 with the letter E
52E10 versus 52E9
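The comparison can be tried directly in R. This is only a sketch; the names `big` and `small` are hypothetical and used just to hold the two long numbers.

```r
# The two long numbers, hard to tell apart by eye
big   <- 520000000000
small <- 52000000000

# The same values written in exponential notation
big == 52E10
small == 52E9

# Now the comparison is easy to read
52E10 > 52E9

# The first number is ten times the second
52E10 / 52E9
```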
See more at https://en.wikipedia.org/wiki/Metric_prefix
There are different names for the same number, and different numbers for the same name
The short names are mostly used in the USA; the long names are used in most other countries.
See more at https://en.wikipedia.org/wiki/Billion
Order matters
“Parentheses, Exponents, Multiplication and Division, Addition and Subtraction”
(Please Excuse My Dear Aunt Sally)
Compare 2-3-4 versus 2-(3-4)
[1] -5
[1] 3
Compare 2/3/4 versus 2/(3/4)
\[\frac{\frac{2}{3}}{4}=\frac{2}{3}\cdot\frac{1}{4}=\frac{2}{12}\] \[\frac{2}{\frac{3}{4}}=\frac{2}{1}\cdot\frac{4}{3}=\frac{8}{3}\]
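The same comparisons can be checked at the prompt. A small sketch; the variable names are placeholders chosen for this example and do not come from the slides.

```r
# Subtraction goes from left to right
sub_left   <- 2 - 3 - 4      # same as (2 - 3) - 4
sub_parens <- 2 - (3 - 4)    # the parentheses are evaluated first

# Division also goes from left to right
div_left   <- 2 / 3 / 4      # same as (2 / 3) / 4, that is 2/12
div_parens <- 2 / (3 / 4)    # that is 8/3

# Exponents come before multiplication
pow_first <- 2 * 3^2         # same as 2 * 9
```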
Use the language correctly
This is important in computing, in science, and in life.
If we can calculate
[1] 100
How do we calculate \(\sqrt{100}\)?
[1] 10
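One possible way to write this in R, assuming the 100 above comes from squaring 10 (the original command is not shown here, so `10^2` is a guess). The variable names are for illustration only.

```r
# A square, and the function that undoes it
squared <- 10^2
root <- sqrt(squared)

# A square root is the same as raising to the power one half
root_again <- 100^0.5
```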
If we can calculate
[1] 100
How do we calculate \(\log_{10}(100)\)?
[1] 2
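A sketch of one answer, again assuming the 100 was produced by `10^2`. R has `log10()` for base-10 logarithms, and `log()` accepts a `base` argument; the variable names below are hypothetical.

```r
# A power of ten, and the function that recovers the exponent
power <- 10^2
exponent <- log10(power)

# The general logarithm function with an explicit base gives the same result
exponent_again <- log(power, base = 10)
```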
log(x) : natural logarithm
exp(x) : exponential
abs(x) : absolute value
sign(x) : sign, one of -1, 0 or 1
floor(x) : largest integer not above x
ceiling(x) : smallest integer not below x
round(x) : integer closest to x
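These functions can be tried on a single value to see how they differ. A minimal sketch; the value -2.7 and the variable names are arbitrary choices for this example.

```r
x <- -2.7

size      <- abs(x)       # distance from zero
direction <- sign(x)      # negative numbers give -1
below     <- floor(x)     # rounds down, away from zero for negatives
above     <- ceiling(x)   # rounds up, towards zero for negatives
nearest   <- round(x)     # closest integer

# exp() and log() undo each other
back <- exp(log(5))
```

Note that for negative numbers floor() moves away from zero, which is easy to get wrong.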