November 14th, 2016
We have data, and we want to say something about it
We use functions to analyze our data
Each function has a name, zero or more inputs, and one output
Inputs always go inside round parentheses ( )
Some inputs are optional: they have default values
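As a small illustration of optional inputs, the sketch below calls head() with and without its optional input n. It assumes the built-in rivers vector that ships with R; the variable names are only illustrative.

```r
# head() has an optional input n; when it is omitted, the default (6) is used
first_default <- head(rivers)         # n not given, default applies
first_three   <- head(rivers, n = 3)  # n given explicitly

print(length(first_default))
print(length(first_three))
```

Naming the input (n = 3) makes the call easier to read than relying on its position.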
To learn what a function does, read its manual page with
help(function_name)
or, in the short version,
?function_name
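Besides the manual page, the inputs of a function and their default values can also be inspected from the console. The sketch below uses formals() on plot.default (the function that plot() uses for plain numeric vectors) as one possible example.

```r
# List the names of the inputs that plot.default accepts
input_names <- names(formals(plot.default))
print(input_names)

# The default value of the optional input 'type' ("p" means points)
default_type <- formals(plot.default)$type
print(default_type)

# The full manual page is available with help(plot.default) or ?plot.default
```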
There are many functions. You have to explore and learn them
length()
min(), max(), range()
head(), tail()
summary()
table()
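A quick way to get a feeling for these functions is to try them on one vector. The sketch below uses the built-in rivers data, which is an assumption of this example (any numeric vector would do).

```r
# Basic description of a numeric vector
n_rivers   <- length(rivers)  # how many elements
shortest   <- min(rivers)     # smallest value
longest    <- max(rivers)     # largest value
both_ends  <- range(rivers)   # smallest and largest together

print(n_rivers)
print(both_ends)

# First and last elements (six of each by default)
print(head(rivers))
print(tail(rivers))
```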
length(state.region)
[1] 50
summary(state.region)
    Northeast         South North Central          West
            9            16            12            13
table(state.region)
state.region
    Northeast         South North Central          West
            9            16            12            13
length(state.abb)
[1] 50
summary(state.abb)
   Length     Class      Mode
       50 character character
table(state.abb)
state.abb
AK AL AR AZ CA CO CT DE FL GA HI IA ID IL IN KS KY LA MA MD ME MI MN MO MS
 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
MT NC ND NE NH NJ NM NV NY OH OK OR PA RI SC SD TN TX UT VA VT WA WI WV WY
 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
length(state.area)
[1] 50
summary(state.area)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   1214   37317   56222   72368   83234  589757
table(state.area)
state.area
  1214   2057   5009   6450   7836   8257   9304   9609  10577  24181
     1      1      1      1      1      1      1      1      1      1
 31055  33215  36291  40395  40815  41222  42244  45333  47716  48523
     1      1      1      1      1      1      1      1      1      1
 49576  51609  52586  53104  56154  56290  56400  58216  58560  58876
     1      1      1      1      1      1      1      1      1      1
 68192  69686  69919  70665  77047  77227  82264  83557  84068  84916
     1      1      1      1      1      1      1      1      1      1
 96981  97914 104247 110540 113909 121666 147138 158693 267339 589757
     1      1      1      1      1      1      1      1      1      1
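The same functions give different output for each vector because the vectors hold different kinds of data. The sketch below checks this with class(), a base R function not used on the slides so far.

```r
# Each built-in vector has a different class, which changes what summary() reports
region_class <- class(state.region)  # factor: summary() counts each level
abb_class    <- class(state.abb)     # character: summary() reports length, class and mode
area_class   <- class(state.area)    # numeric: summary() reports quartiles and the mean
print(c(region_class, abb_class, area_class))

# table() counts repeated values; every state area is different, so all counts are 1
all_unique <- all(table(state.area) == 1)
print(all_unique)
```

This suggests that table() is most informative for data with repeated values, such as factors, and summary() for numeric data.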
Sometimes the best way to tell the story of the data is with a graphic
plot(state.area)
plot(rivers)
Each element is drawn at its position in the vector (its index) on the x axis; its value gives the height on the y axis
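To make the role of the x axis explicit, the sketch below builds the index by hand and passes it to plot(). The name idx is only illustrative; the result should look the same as plot(rivers).

```r
# seq_along() gives the positions 1, 2, 3, ... of the elements of a vector
idx <- seq_along(rivers)

# Plot value against position, which is what plot(rivers) does implicitly
plot(idx, rivers, xlab = "Index")
```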
The previous graphics used numeric data. What about factors?
plot(state.region)
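For a factor, plot() draws a bar chart with one bar per level, whose height is the number of elements in that level. A roughly equivalent two-step sketch, counting first and drawing second:

```r
# Count how many states fall in each region
region_counts <- table(state.region)
print(region_counts)

# Draw the counts as bars; the y-axis label is an addition for this example
barplot(region_counts, ylab = "Number of states")
```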
plot(rivers)
hist(rivers)
hist(rivers, col="grey")
hist(rivers, col="grey", nclass = 30)
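A histogram counts how many values fall into each interval (bin). The sketch below asks hist() for the bins without drawing them, using plot = FALSE, so the numbers behind the figure can be inspected. Note that the requested number of bins is only a suggestion; R picks "pretty" break points near it.

```r
# Compute the histogram without drawing it
h <- hist(rivers, breaks = 30, plot = FALSE)

print(h$breaks)  # the limits of the bins chosen by R
print(h$counts)  # how many rivers fall in each bin

# Every river is counted in exactly one bin
total_counted <- sum(h$counts)
print(total_counted)
```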
plot(rivers)
plot(rivers, col="red")
plot(rivers, cex=2)
plot(rivers, cex=0.5)
plot(rivers, pch=16)
plot(rivers, pch=".")
plot(rivers, type = "l")
plot(rivers, type = "o")
plot(rivers, type = "l", xlim=c(1,50))
plot(rivers, type = "o", xlim=c(100,141))
plot(rivers, type = "b")
plot(rivers, type = "n")
plot(rivers, main = "Length of Rivers", sub = "141 samples", ylab="Length [miles]")
plot(state.x77[,"Area"], ylab="Area [sq mi]")
points(state.x77[,"Population"]*10, pch=2)
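The first command, plot(), defines the scale of the axes. Later commands such as points() and lines() draw on the same axes, and values outside the scale are not shown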
plot(state.x77[,"Area"], type="l", ylab="Area [sq mi]")
lines(state.x77[,"Population"]*10, col="red")
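Because the axes are fixed by the first command, one possible way to make sure both series fit is to compute a common range first and pass it as ylim. This is a sketch, not the only approach; the names area, pop10 and y_limits are illustrative.

```r
# The two series to draw on the same axes
area  <- state.x77[, "Area"]
pop10 <- state.x77[, "Population"] * 10

# A y range wide enough for both series
y_limits <- range(c(area, pop10))

# The first command sets the scale; lines() then adds the second series
plot(area, type = "l", ylim = y_limits, ylab = "Area [sq mi]")
lines(pop10, col = "red")
```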