November 7th, 2016
At the end of the exam you sent me a file
How can we verify that we have the same file?
How can we be sure that nobody changed it?
How to be sure without showing the content of the file?
An answer to these questions is given by digital signatures
They are not digital pictures of a handwritten signature
Instead they are a unique number that identifies the exact document
This number is called a digest. It is produced by a cryptographic hash function
Go to http://onlinemd5.com/ or any other service you find on Google
The evaluation is done on your computer. The file is not sent over the internet
You can take the file you attached, get its digest and compare it with the one I created
If they are the same we are sure that I have your file
And we do not need to show the content
Imagine you are working on a project
You can get the MD5 digest and publish it
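The same comparison can also be sketched in R itself instead of on a website. This is only an illustration: it assumes the tools package that ships with R, and the file names and contents are invented for the example.

```r
# Create two hypothetical files with identical content and one modified copy
original <- tempfile(fileext = ".txt")
copy     <- tempfile(fileext = ".txt")
modified <- tempfile(fileext = ".txt")
writeLines("My exam answers", original)
writeLines("My exam answers", copy)
writeLines("My exam answers!", modified)

# md5sum() returns the MD5 digest of each file as a hexadecimal string
digest_original <- unname(tools::md5sum(original))
digest_copy     <- unname(tools::md5sum(copy))
digest_modified <- unname(tools::md5sum(modified))

# Same content gives the same digest; a one-character change gives a different one
same_file    <- digest_original == digest_copy
changed_file <- digest_original != digest_modified
print(same_file)
print(changed_file)
```

Only the digests are compared, so neither side has to show the content of the file.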
We have seen that R has several data types
There are many others. For example, functions
In Math and Informatics, a function is a “black box”
A rule to transform the input elements into an output
The same input should always produce the same output
Notice that there may be more than one input element
Functions have one name and several inputs
Inputs are always inside parentheses ( )
Some inputs can be optional. They have default values
list.files(path = ".", pattern = NULL, all.files = FALSE, full.names = FALSE, recursive = FALSE, ignore.case = FALSE, include.dirs = FALSE, no.. = FALSE)
Here all inputs are optional. The default values are shown
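A small sketch of what default values mean in practice. The folder and file names are invented for the example; only list.files() and its path and pattern inputs come from the signature above.

```r
# Build a hypothetical folder with three files so the result is predictable
folder <- tempfile("demo_folder")
dir.create(folder)
file.create(file.path(folder, c("a.txt", "b.csv", "c.txt")))

# Only 'path' is given; every other input keeps its default value
all_files <- list.files(path = folder)
print(all_files)

# Overriding one optional input: keep only the names ending in ".txt"
txt_files <- list.files(path = folder, pattern = "\\.txt$")
print(txt_files)
```

Inputs that are not mentioned in the call silently take the value shown after the = sign in the signature.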
help(topic, package = NULL, lib.loc = NULL, verbose = getOption("verbose"), try.all.packages = getOption("help.try.all.packages"), help_type = getOption("help_type"))
Here topic is a mandatory input. The rest are optional
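The difference between mandatory and optional inputs can be illustrated with a tiny function of our own. The function power and its inputs base and exponent are hypothetical names made up for this sketch; they are not part of R.

```r
# 'power' is a hypothetical function: 'base' is mandatory, 'exponent' is optional
power <- function(base, exponent = 2) {
  base^exponent
}

power(3)                # exponent takes its default value, 2
power(3, exponent = 3)  # the default value is overridden

# Leaving out the mandatory input is an error when the function needs it
result <- try(power(), silent = TRUE)
inherits(result, "try-error")
```

An input is mandatory simply because the function gives it no default value.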
help(function_name)
or in short version
?function_name
dir()
getwd()
setwd(dir)
c(...)
factor(...)
list(...)
data.frame(...)
We have data, we want to tell something about them
How can we summarize all the values in a few numbers?
Let’s use the vector rivers.
length(rivers)
nrow(state.x77)
dim(state.x77)
table(state.division)
length(rivers)
[1] 141
nrow(state.x77)
[1] 50
dim(state.x77)
[1] 50 8
table(state.region)
state.region
    Northeast         South North Central          West 
            9            16            12            13 
If you have to describe the vector v with a single number x, which would it be?
If we have to replace each v[i] with a single number, which number is “the best”?
Better to choose the one that is the “least wrong”
How can x be wrong?
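One possible way to make “wrong” concrete is to measure how far x is from each v[i] and add everything up. The two scoring rules below are assumptions chosen for illustration, not something fixed by the question, and the function names are invented.

```r
v <- rivers  # built-in vector of river lengths

# Two hypothetical ways to score a candidate x (lower means "less wrong")
total_squared_error  <- function(x) sum((v - x)^2)
total_absolute_error <- function(x) sum(abs(v - x))

# Two natural candidates for the single number
candidates <- c(mean = mean(v), median = median(v))

# Score each candidate under each rule
sapply(candidates, total_squared_error)
sapply(candidates, total_absolute_error)
```

Under the squared rule the mean scores lower, and under the absolute rule the median scores lower, so “the best” number depends on how we decide to measure the error.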