There are several ways to plot in R
In this class we show the basic R graphic commands
We will use them only for data exploration
Making these plots look nicer takes more time
We will not study that in this course, because we will use a better system
# A tibble: 117 x 10
answer_date id english_level sex birthdate birthplace height_cm
<date> <chr> <chr> <chr> <date> <chr> <dbl>
1 2018-09-17 3e50… I can speak … Male 1993-02-01 -/Turkey 179
2 2018-09-17 479d… I can unders… Fema… 1998-05-21 Kahramanm… 168
3 2018-09-17 39df… I can read a… Fema… 1998-01-18 Batman/Tu… NA
4 2018-09-17 d2b0… I can read a… Male 1998-08-29 Antalya/T… 170
5 2018-09-17 f22b… I can read a… Fema… 1998-05-03 Izmir/Tur… 162
6 2018-09-17 849c… İngilizce bi… Fema… 1995-10-09 Yalova/Tu… 167
7 2018-09-17 8381… I can speak … Fema… 1997-09-19 Adıyaman/… 174
8 2018-09-17 b0dd… I can read a… Male 1997-11-27 Bursa/Tur… 180
9 2018-09-17 2972… I can read a… Fema… 1999-01-02 Istanbul/… 162
10 2018-09-17 72c0… I can read a… Fema… 1998-10-02 Istanbul/… 172
# … with 107 more rows, and 3 more variables: weight_kg <dbl>,
# handedness <chr>, hand_span <dbl>
“A picture is worth a thousand words”
Sometimes the best way to understand the data is a graphic
Each value has a different position on the horizontal axis
The vector’s index is a number from 1 to length(vector)
The vertical axis represents the value of the element
So if vector[3] contains the value 170, we will have a point at the coordinates (3, 170)
Notice the broken line when there are missing values
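A minimal sketch of the index plot described above. The values are the ten heights visible in the table shown earlier; the variable name `height` is only an illustration.

```r
# First ten heights (cm) from the table above; the third one is missing
height <- c(179, 168, NA, 170, 162, 167, 174, 180, 162, 172)

# One point per element: x is the index (1 to length(height)), y is the value
plot(height)

# Drawn with lines, the NA at position 3 leaves a gap in the line
plot(height, type = "l")
```

Here `height[4]` is 170, so the first plot has a point at the coordinates (4, 170), and no point at all for index 3.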
The type depends on the story you want to tell
Lines are mostly used to tell a story of change through time
Using type="o" (over) makes it easier to see the individual points on the line
If you do not specify a type, the default is type="p" (points)
For example, the number of new COVID-19 cases each day
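The plot types mentioned above can be compared with a short sketch. The vector `new_cases` is hypothetical and its values are made up for illustration; they are not real case counts.

```r
# Made-up daily counts, for illustration only (not real data)
new_cases <- c(3, 5, 8, 13, 12, 18, 25, 31, 29, 40)

plot(new_cases)               # no type given: points, same as type = "p"
plot(new_cases, type = "l")   # lines only: emphasizes change through time
plot(new_cases, type = "o")   # lines with the points drawn over them
```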
Use xlim for the horizontal range and ylim for the vertical range
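As a small sketch, both arguments take a vector with the lower and upper limit of the axis. The limits chosen here are arbitrary, and `height` holds the ten heights visible in the table shown earlier.

```r
# First ten heights (cm) from the table above
height <- c(179, 168, NA, 170, 162, 167, 174, 180, 162, 172)

# The observed range helps to choose sensible limits
range(height, na.rm = TRUE)

# Show only the first five elements, on a vertical scale from 150 to 200
plot(height, xlim = c(1, 5), ylim = c(150, 200))
```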
Numeric vectors are shown element by element
barplot() works well with table()
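A possible illustration of this combination: `table()` counts how many times each value appears, and `barplot()` draws one bar per count. The vector `hand` and its values are made up for this sketch.

```r
# Made-up answers, for illustration only
hand <- c("Right", "Left", "Right", "Right", "Right", "Left", "Right", "Right")

# Count how many times each answer appears
counts <- table(hand)
counts

# One bar per category, with height equal to the count
barplot(counts)
```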
This can also be written as
plot() can handle factors
When the vector is a factor, plot() does all the hard work
Level order is important here
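The effect of level order can be sketched as follows. The vector `answers` and its values are hypothetical. By default `factor()` sorts the levels alphabetically, and the `levels` argument sets a different order.

```r
# Made-up answers, for illustration only
answers <- c("Low", "High", "Medium", "Low", "Medium", "Medium")

# Default: levels are sorted alphabetically (High, Low, Medium)
f_default <- factor(answers)
plot(f_default)

# Explicit level order: bars appear as Low, Medium, High
f_ordered <- factor(answers, levels = c("Low", "Medium", "High"))
plot(f_ordered)
```

In both cases `plot()` counts the values and draws the bars, but only the second plot shows the categories in a meaningful order.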
In a numeric vector usually all values are different
We have to group them in “similar” sets
Numeric data can be grouped into classes
Histogram bars are not separated
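A short sketch of this grouping using `hist()`, which chooses the classes by itself unless told otherwise. The vector `height` holds the ten heights visible in the table shown earlier.

```r
# First ten heights (cm) from the table above
height <- c(179, 168, NA, 170, 162, 167, 174, 180, 162, 172)

# hist() groups the values into classes and draws one bar per class;
# missing values are ignored
h <- hist(height)

h$breaks   # limits of the classes
h$counts   # number of values in each class
```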
A boxplot is a graphical version of summary().
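To compare the two views, this sketch prints the summary and then draws the boxplot of the same vector. The vector `height` holds the ten heights visible in the table shown earlier.

```r
# First ten heights (cm) from the table above
height <- c(179, 168, NA, 170, 162, 167, 174, 180, 162, 172)

# Minimum, quartiles, median, mean, maximum, and the number of NAs
summary(height)

# The box goes from the first to the third quartile,
# and the thick line inside the box is the median
b <- boxplot(height)
b$stats    # the five numbers used to draw the box and whiskers
```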
plot() shows a graphic of a vector
barplot() works well for small vectors
barplot(table()) for factors, or plot() the vector
hist() gives a better view of large numeric vectors
boxplot() is another way to see large vectors
barplot() and boxplot()
Forget the $ and you will plot the whole data frame instead of a vector
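The difference can be seen in a small sketch. The data frame `dframe` and all its values are made up for illustration.

```r
# Made-up data frame, for illustration only
dframe <- data.frame(
  height = c(162, 167, 170, 174, 180),
  weight = c(55, 60, 68, 70, 82),
  span   = c(18, 19, 20, 21, 22)
)

plot(dframe$height)      # one column: a plot of a single vector
plot(dframe)             # no $: one scatterplot for every pair of columns

boxplot(dframe$height)   # one column: a single box
boxplot(dframe)          # no $: one box per column
```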
If we ask to plot two numeric vectors, we get the first on the horizontal axis and the second on the vertical axis
Instead of
we can write
or even
plot() accepts one vector, two vectors, or a formula
plot(y ~ x) looks like plot(x, y)
plot(y ~ x, data=dframe) looks like plot(dframe$x, dframe$y)
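The three ways of writing the same scatterplot can be sketched like this. The values in `dframe` are made up for illustration; the names `dframe`, `x` and `y` follow the ones used above.

```r
# Made-up data, for illustration only
dframe <- data.frame(
  x = c(162, 167, 170, 174, 180),
  y = c(55, 60, 68, 70, 82)
)

# Two vectors: the first on the horizontal axis, the second on the vertical axis
plot(dframe$x, dframe$y)

# Formula: read y ~ x as "y depending on x"; y goes on the vertical axis
plot(dframe$y ~ dframe$x)

# Formula with data=: the column names are looked up inside dframe
plot(y ~ x, data = dframe)
```

The points are the same in the three plots. Only the axis labels change, because they are taken from the expressions used in the call.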