The disks store a huge amount of data
To organize it we use files
To organize the files we use folders, also called directories
You probably know about computer folders
They are an example of a hierarchical structure
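The hierarchy can be seen by splitting a file location into its levels. This is a minimal sketch; the path `/home/student/course/homework.txt` is a made-up example, not a real location.

```python
from pathlib import PurePosixPath

# Hypothetical location of a file inside nested folders.
path = PurePosixPath("/home/student/course/homework.txt")

# Each part is one level of the hierarchy, from the root down to the file.
levels = path.parts
print(levels)

# The folder that contains the file, and the folder that contains that folder.
print(path.parent)
print(path.parent.parent)
```

Every folder has exactly one parent folder, which is what makes the structure a hierarchy.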
Key idea:
Like the main memory, a file is just a list of bytes
The meaning of the file depends on the context
Usually, the name of the file suggests a context
For example, an MP3 file is probably audio
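To see that a file is just a list of bytes, we can write a few bytes to a file and read them back in two different contexts. This is only a sketch: the four byte values and the temporary file are chosen for illustration.

```python
import os
import tempfile

# Write four bytes to a temporary file (the file name is arbitrary).
data = bytes([72, 105, 33, 10])
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(data)
    file_name = handle.name

# Read the file back as raw bytes: just numbers between 0 and 255.
with open(file_name, "rb") as handle:
    raw = handle.read()
os.remove(file_name)

as_numbers = list(raw)         # one context: plain numbers
as_text = raw.decode("ascii")  # another context: letters
print(as_numbers)
print(repr(as_text))
```

The same four bytes are the numbers 72, 105, 33, 10 in one context and the text "Hi!" followed by a line break in another.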
Besides the data itself, files have metadata
That is, data about the data. For example
You should learn how to read them
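As an illustration, the sketch below asks the operating system for two pieces of metadata: the size of a file and the time it was last modified. These two are my choice of example; the temporary file exists only to make the code self-contained.

```python
import os
import tempfile
import time

# Create a small temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as handle:
    handle.write(b"hello")
    file_name = handle.name

info = os.stat(file_name)  # ask the operating system for the metadata
size_in_bytes = info.st_size
modified = time.ctime(info.st_mtime)
os.remove(file_name)

print("size:", size_in_bytes, "bytes")
print("last modified:", modified)
```

Note that none of this information is stored inside the five bytes of the file itself.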
The names of the files are “words”:
a series of letters, numbers and some symbols
Technically, a filename is a string or list of characters
On most systems, the maximum length of a filename is 255 characters
You can use `A-Z`, `a-z`, `0-9`, `.`, `-`, and `_`
You cannot use any of these symbols: `/`, `:`, `+`, `|`, `<`, `*`, `>`, `"` and `'`
You can use spaces and non-English letters (like ǧ or ñ), but I recommend not using them, because they may cause problems
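The naming rules above can be checked automatically. The sketch below is one possible way to do it; the function names `unrecommended_characters` and `is_recommended` are hypothetical, and the check only looks at characters, not at the length of the name.

```python
import string

# Characters allowed by the rules above: letters, digits, ".", "-" and "_".
RECOMMENDED = set(string.ascii_letters + string.digits + ".-_")

def unrecommended_characters(filename):
    """Return the characters in filename that fall outside the recommended set."""
    return sorted(set(filename) - RECOMMENDED)

def is_recommended(filename):
    """True if the name is non-empty and uses only recommended characters."""
    return filename != "" and not unrecommended_characters(filename)

print(is_recommended("homework_01.txt"))
print(unrecommended_characters("my homework: final.txt"))
```

The second call reports the space and the colon, so that name would be better written as something like `my_homework-final.txt`.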
On some systems lowercase and UPPERCASE letters are not equivalent
For example, HOMEWORK.txt and homework.txt are different files
Be careful. Be systematic and consistent
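A small check can find names that differ only in upper and lower case, which are the ones likely to cause trouble when files move between systems. This is a sketch; `case_collisions` is a hypothetical helper and the list of names is invented.

```python
from collections import defaultdict

def case_collisions(filenames):
    """Group names that differ only in upper/lower case."""
    groups = defaultdict(list)
    for name in filenames:
        groups[name.lower()].append(name)
    return [names for names in groups.values() if len(names) > 1]

names = ["HOMEWORK.txt", "homework.txt", "notes.txt"]
same = "HOMEWORK.txt" == "homework.txt"  # compared letter by letter
print(same)
print(case_collisions(names))
```

Writing all names in lowercase is one simple way to avoid the problem.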
If the filename includes `.`, the text after the last `.` is called the extension
In Microsoft Windows® extensions are usually 3 letters
For example
It is a suggestion on how to interpret the file
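The extension can be extracted from a filename and used as a hint. In this sketch the `extension` function and the small `HINTS` table are hypothetical; real systems know many more extensions.

```python
from pathlib import PurePath

# Illustrative hints only, not a complete list.
HINTS = {"mp3": "audio", "txt": "text", "docx": "word processor document"}

def extension(filename):
    """Return the text after the last dot, or "" if there is no extension."""
    return PurePath(filename).suffix.lstrip(".")

def probable_content(filename):
    """Suggest what the file may contain, based only on its name."""
    return HINTS.get(extension(filename).lower(), "unknown")

print(extension("homework.txt"))
print(extension("archive.tar.gz"))
print(probable_content("song.MP3"))
print(probable_content("README"))
```

The hint can be wrong: renaming `song.mp3` to `song.txt` changes the suggestion but not a single byte of the file.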
It is useful to separate computer files into two groups: text files and binary files
It is very hard to understand a binary file without a computer
Often it can only be read by the program that made it
Most of these programs are proprietary
If the company goes out of business, you lose your data
New versions of the program may not read the old files
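Text files, in contrast, are free to read with any program and never become obsolete
Microsoft Word documents (doc or docx) are a typical example of binary files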
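One rough way to tell the two groups apart is to look at the bytes themselves. The sketch below is a heuristic of my own, not a standard test: it treats data as text when every byte is a printable English symbol, a tab, or a line break. The name `looks_like_ascii_text` is hypothetical.

```python
def looks_like_ascii_text(data):
    """Heuristic: True if every byte is printable ASCII or common whitespace."""
    allowed_controls = {9, 10, 13}  # tab, line feed, carriage return
    return all(32 <= byte <= 126 or byte in allowed_controls for byte in data)

text_bytes = b"Dear reader,\nthis is plain text.\n"
other_bytes = bytes([0, 159, 200, 255])  # arbitrary values outside the text range

print(looks_like_ascii_text(text_bytes))
print(looks_like_ascii_text(other_bytes))
```

Text in other languages would fail this simple check, so it should be read as an illustration of the idea rather than a reliable detector.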
The natural way to represent a text document is to encode each letter with a single byte
There is a basic standard for English, called ASCII
|   | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100 | 110 | 120 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 |   | ( | 2 | < | F | P | Z | d | n | x |
| 1 |   | ) | 3 | = | G | Q | [ | e | o | y |
| 2 | (space) | * | 4 | > | H | R | \ | f | p | z |
| 3 | ! | + | 5 | ? | I | S | ] | g | q | { |
| 4 | " | , | 6 | @ | J | T | ^ | h | r | \| |
| 5 | # | - | 7 | A | K | U | _ | i | s | } |
| 6 | $ | . | 8 | B | L | V | ` | j | t | ~ |
| 7 | % | / | 9 | C | M | W | a | k | u |   |
| 8 | & | 0 | : | D | N | X | b | l | v |   |
| 9 | ' | 1 | ; | E | O | Y | c | m | w |   |
Each number from 0 to 127 is either a symbol or a special signal
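The correspondence between symbols and numbers can be checked directly. This short sketch uses Python's built-in `ord` and `chr`; the sample phrase is arbitrary.

```python
# From symbol to number and back.
code_of_A = ord("A")
symbol_97 = chr(97)
print(code_of_A, symbol_97)

# A whole phrase becomes a list of numbers, one byte per letter.
codes = list("R is fun".encode("ascii"))
print(codes)

# The same numbers turn back into the same phrase.
decoded = bytes(codes).decode("ascii")
print(decoded)

# Number 10 is one of the special signals: it means "start a new line".
print(repr(chr(10)))
```

Spaces also have a number (32), so they take one byte like any other symbol.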
Non-English languages use numbers between 128 and 255 for symbols like “Ç”, “Ö”, “É”, “Ñ”
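As an example, the sketch below uses the Latin-1 encoding, which is one of several standards that assign the numbers 128 to 255. Choosing Latin-1 is my assumption for illustration; other encodings give different numbers to the same letters.

```python
# One byte per letter in Latin-1, all of them above 127.
latin1_codes = {letter: letter.encode("latin-1")[0] for letter in "ÇÖÉÑ"}
print(latin1_codes)

# The same letter in UTF-8, another common encoding, needs two bytes.
utf8_length = len("Ñ".encode("utf-8"))
print(utf8_length)

# Plain ASCII has no number for it at all.
try:
    "Ñ".encode("ascii")
    fits_in_ascii = True
except UnicodeEncodeError:
    fits_in_ascii = False
print(fits_in_ascii)
```

This is why a text file with non-English letters can look wrong when it is opened with the wrong encoding: the bytes are the same, but the context changed.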
https://en.wikipedia.org/wiki/Binary_file
https://en.wikipedia.org/wiki/Text_file
https://en.wikipedia.org/wiki/Directory_(computing)
https://en.wikipedia.org/wiki/List_of_file_formats
For this course we will use the latest versions of R and RStudio. These two tools work together. Install R first, then install RStudio.
These videos may help you
Fill in the survey at the course homepage
Visit dry-lab.org/blog/2020/cmb1 and fill in the survey