December 7, 2018
quiz2
There is a file quiz2.sh in your folder. Edit it with nano:
nano quiz2.sh
For each of the following questions, write the corresponding command after the question.
If you like, you can open a second connection to the server.
Show the first 5 lines of world_2007.txt
awk 'NR<=5' world_2007.txt
Afghanistan 31889923 Asia 43.828 974.5803384
Albania 3600523 Europe 76.423 5937.029526
Algeria 33333216 Africa 72.301 6223.367465
Angola 12420476 Africa 42.731 4797.231267
Argentina 40301927 Americas 75.32 12779.37964
Show lines 45 to 50 of world_2007.txt
awk 'NR<=50 && NR>=45' world_2007.txt
France 61083916 Europe 80.657 30470.0167
Gabon 1454867 Africa 56.735 13206.48452
Gambia 1688359 Africa 59.448 752.7497265
Germany 82400996 Europe 79.406 32170.37442
Ghana 22873338 Africa 60.022 1327.60891
Greece 10706290 Europe 79.483 27538.41188
awk '$1=="Turkey"' world_2007.txt
Turkey 71158647 Europe 71.777 8458.276384
Count the lines of world_2007.txt
awk 'END {print NR}' world_2007.txt
142
Write the command to find how many lines of world_2007.txt contain the exact word “Rep”, but not “Republic”
awk '/Rep[^u]/' world_2007.txt
Congo,_Dem._Rep. 64606759 Africa 46.462 277.5518587
Congo,_Rep. 3800610 Africa 55.322 3632.557798
Korea,_Dem._Rep. 23301725 Asia 67.297 1593.06548
Korea,_Rep. 49044790 Asia 78.623 23348.13973
Yemen,_Rep. 22211743 Asia 62.698 2280.769906
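The command above prints the matching lines. To get the number of lines instead, awk can count them and print the total at the end. This is a minimal sketch, not the quiz's official answer. It assumes the abbreviation always appears as "Rep." followed by a period, as in the output above. The variable names sample and count are made up for the example, and the three sample lines are copied from the outputs shown earlier.

```shell
# Three sample lines: two with "Rep." and one with "Republic"
sample='Congo,_Rep. 3800610 Africa 55.322 3632.557798
Czech_Republic 10228744 Europe 76.486 22833.30851
Yemen,_Rep. 22211743 Asia 62.698 2280.769906'

# Count the lines that contain "Rep." (the dot is escaped so it matches a literal period)
# n+0 makes awk print 0 instead of an empty line when nothing matches
count=$(printf '%s\n' "$sample" | awk '/Rep\./ {n++} END {print n+0}')
echo "$count"
```

On the real file the same awk program would be run as awk '/Rep\./ {n++} END {print n+0}' world_2007.txt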
awk '$3=="Africa" && $2==135031164' world_2007.txt
Nigeria 135031164 Africa 46.859 2013.977305
awk '$3=="Europe" {print $2,$1}' world_2007.txt
3600523 Albania
8199783 Austria
10392226 Belgium
4552198 Bosnia_and_Herzegovina
7322858 Bulgaria
4493312 Croatia
10228744 Czech_Republic
5468120 Denmark
5238460 Finland
61083916 France
82400996 Germany
10706290 Greece
9956108 Hungary
301931 Iceland
4109086 Ireland
58147733 Italy
684736 Montenegro
16570613 Netherlands
4627926 Norway
38518241 Poland
10642836 Portugal
22276056 Romania
10150265 Serbia
5447502 Slovak_Republic
2009245 Slovenia
40448191 Spain
9031088 Sweden
7554661 Switzerland
71158647 Turkey
60776238 United_Kingdom
Sort the output of the last command, from biggest to smallest. Use | and sort
awk '$3=="Europe" {print $2,$1}' world_2007.txt | sort -nr
82400996 Germany
71158647 Turkey
61083916 France
60776238 United_Kingdom
58147733 Italy
40448191 Spain
38518241 Poland
22276056 Romania
16570613 Netherlands
10706290 Greece
10642836 Portugal
10392226 Belgium
10228744 Czech_Republic
10150265 Serbia
9956108 Hungary
9031088 Sweden
8199783 Austria
7554661 Switzerland
7322858 Bulgaria
5468120 Denmark
5447502 Slovak_Republic
5238460 Finland
4627926 Norway
4552198 Bosnia_and_Herzegovina
4493312 Croatia
4109086 Ireland
3600523 Albania
2009245 Slovenia
684736 Montenegro
301931 Iceland
Take the sorted output of the last command and pipe it into an awk command that prints the row number, the country and the population when the country is “Turkey”
awk '$3=="Europe" {print $2,$1}' world_2007.txt | sort -nr | awk '/Turkey/ {print NR,$2,$1}'
2 Turkey 71158647
Repeat the last command, replacing “Population” with “GDP per capita”
awk '$3=="Europe" {print $5,$1}' world_2007.txt | sort -nr | awk '/Turkey/ {print NR,$2,$1}'
28 Turkey 8458.276384
Repeat the last command, replacing “GDP per capita” with “Life Expectancy”
awk '$3=="Europe" {print $4,$1}' world_2007.txt | sort -nr | awk '/Turkey/ {print NR,$2,$1}'
30 Turkey 71.777
Write an awk command that counts how many lines have “Europe” in the third field
awk '$3=="Europe" {n++} END {print n}' world_2007.txt
30
Quizzes are important if you want to succeed
AWK has the following built-in arithmetic functions:
int(expr) | Truncate to integer. |
rand() | Return a random number N, between 0 and 1, such that 0 ≤ N < 1. |
srand([expr]) | Use expr as the new seed for the random number generator. If no expr is provided, use the time of day. Return the previous seed for the random number generator. |
atan2(y, x) | Return the arctangent of y/x in radians. |
cos(expr) | Return the cosine of expr, which is in radians. |
sin(expr) | Return the sine of expr, which is in radians. |
exp(expr) | The exponential function. |
log(expr) | The natural logarithm function. |
sqrt(expr) | Return the square root of expr. |
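A quick way to try some of these functions is a BEGIN block, which runs without any input file. This is only an illustrative sketch; the variable name values is made up for the example.

```shell
# Call several arithmetic functions in one print
# atan2(0, -1) is a common way to get pi
values=$(awk 'BEGIN { print int(3.9), sqrt(16), exp(0), log(1), atan2(0, -1) }')
echo "$values"
```

With the default number format this prints 3 4 1 0 3.14159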
Print seven random numbers from 0 to 100, inclusive:
awk 'BEGIN { for (i = 1; i <= 7; i++) print int(101 * rand()) }'
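In some awk implementations, gawk for example, rand() gives the same sequence on every run unless the generator is seeded. A possible variant calls srand() once before the loop. This is a sketch, not part of the original slide; the variable name numbers is made up for the example.

```shell
# srand() with no argument seeds the generator from the time of day
# It is called once, before the first rand()
numbers=$(awk 'BEGIN { srand(); for (i = 1; i <= 7; i++) print int(101 * rand()) }')
echo "$numbers"
```

Since the seed is the time of day, two runs started within the same second can still print the same numbers.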
AWK also has built-in string functions, including:
tolower(str) | Return a copy of the string str, with all the uppercase characters in str translated to their corresponding lowercase counterparts. |
toupper(str) | Return a copy of the string str, with all the lowercase characters in str translated to their corresponding uppercase counterparts. |
length([s]) | Return the length of the string s, or the length of $0 if s is not supplied. |
substr(s, i [, n]) | Return the at most n-character substring of s starting at i. If n is omitted, use the rest of s. |
Write an awk program that changes the first word to Title Case
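One possible answer combines toupper, tolower and substr: uppercase the first character of the first field, lowercase the rest, and join the two pieces. This is a sketch under the assumption that "first word" means the first field ($1). The input line and the variable names line and titled are made up for the example.

```shell
# Made-up input line with odd capitalisation
line='aFGHANISTAN 31889923 asia'

# substr($1, 1, 1) is the first character, substr($1, 2) is the rest of the word
# Writing two strings side by side concatenates them
titled=$(printf '%s\n' "$line" | awk '{ $1 = toupper(substr($1, 1, 1)) tolower(substr($1, 2)); print }')
echo "$titled"
```

This prints Afghanistan 31889923 asia. Only the first field changes. Assigning to $1 makes awk rebuild the line with single spaces between fields.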
Our Midterm and Quiz used data from 2007
The file /home/andres/population_total.csv has data for all years and all countries
Take a look by doing this:
head /home/andres/population_total.csv
We want to change the shape of this table
The output should be in three columns: country, year and population
The fields are separated by commas, so we need to use the -F option. Something like
awk -F ',' '{print $1, 1800, $2; print $1, 1801, $3; print $1, 1802, $4; }' /home/andres/population_total.csv
with one print command for every field
Can we do it smarter?
for loops
Like many other computer languages, awk can repeat the same commands several times
awk -F ',' '{for(i=2; i<=NF; i++) { print $1, 1798+i, $i } }' /home/andres/population_total.csv
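The command above computes the year as 1798+i, so it depends on the second field being the year 1800. If the first line of the file is a header that holds the years (an assumption, the head output is not shown here), the years can be read from that line instead, and the header is no longer printed as data. The sketch below uses a tiny made-up table so it can run anywhere. The names csv, long and year, the countries and the numbers are all hypothetical.

```shell
# Made-up sample in the assumed layout: header with years, then one line per country
csv='country,1800,1801,1802
Aland,10,11,12
Bland,20,21,22'

# First line: remember the year of each column, then skip to the next line
# Other lines: print country, year and value for every column from 2 to NF
long=$(printf '%s\n' "$csv" | awk -F ',' '
    NR == 1 { for (i = 2; i <= NF; i++) year[i] = $i; next }
            { for (i = 2; i <= NF; i++) print $1, year[i], $i }')
echo "$long"
```

Each input line with three values becomes three output lines, for example Aland 1800 10, Aland 1801 11 and Aland 1802 12.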
for loops have four parts
The general form of a for loop looks like this:
for(A; B; C) { D }
A, B and C are separated by ; (semicolon)
D is enclosed in {}
A, C and D are normal awk commands or assignments
B is a TRUE/FALSE condition
A runs once, before the loop starts
The D part is repeated while B is true, and C runs after each repetition of D
B must become FALSE at some point, otherwise the loop never finishes
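To see the four parts at work, here is a small loop together with the same loop written as a while loop, which makes the order of A, B, D and C explicit. This is an illustrative sketch. The while form is not part of the slides, and the variable name result is made up for the example.

```shell
result=$(awk 'BEGIN {
    # A is i = 1, B is i <= 3, C is i++, D is the print
    for (i = 1; i <= 3; i++) { print "for", i }

    # The same steps spelled out: A once, then test B, run D, run C, and test B again
    i = 1
    while (i <= 3) { print "while", i; i++ }
}')
echo "$result"
```

Both loops print the numbers 1 to 3. After the third pass i is 4, B is FALSE and the loop stops.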