- Programs in awk consist of condition–action pairs.
- Conditions are either true or false.
- Actions are blocks of commands, wrapped in {}.
- An action without a condition always runs.
- A condition without an action runs the default action { print $0 }.
November 28, 2018
awk 'statements' file1 file2 ...
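The three shapes of a condition–action pair can be tried on a small input. This is only a sketch: the three-line input and the shell variable names (input, only_condition, only_action, both) are made up for illustration.

```shell
# Hypothetical three-line input, built with printf so the example is self-contained.
input=$(printf 'one\ntwo words\nthree more words\n')

# Condition without an action: the default { print $0 } runs when NF > 1.
only_condition=$(printf '%s\n' "$input" | awk 'NF > 1')

# Action without a condition: runs for every record.
only_action=$(printf '%s\n' "$input" | awk '{ print NF }')

# Condition and action together: print the first field of records with more than 2 fields.
both=$(printf '%s\n' "$input" | awk 'NF > 2 { print $1 }')

printf '%s\n' "$only_condition" "$only_action" "$both"
```

The first program prints the last two lines, the second prints 1, 2 and 3, and the third prints only the word three.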
There are many automatic variables in awk, such as:
- $1, $2, and so on are the fields of each record
- $0 is the complete input record
- NF is the Number of Fields in the current record
- NR is the Number of the current Record
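A quick way to see these variables is to print them for every record. The two-line input below is invented for the example. The commas in print separate the values with a space.

```shell
# Hypothetical input with 2 fields on the first line and 3 on the second.
out=$(printf 'alpha beta\ngamma delta epsilon\n' |
  awk '{ print NR, NF, $1, $0 }')

printf '%s\n' "$out"
```

Each output line shows the record number, the number of fields, the first field, and then the whole record.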
This program is equivalent to the cat command
awk '{ print $0 }'
It copies the standard input to its standard output
It also works with one or more files
awk '{ print $0 }' file1 file2 ...
If we want to use several commands in an action, we have to separate them with ;
For example, this program prints the number of fields (words) in each line, followed by the line itself
awk '{ print NF; print $0 }'
The output has 2 lines for each line in the input
So far we have seen only one command: print
There are several more commands, starting with assignments
Example
x = 1
The symbol = is used to assign a new value to a variable
variable = value
Variables are created automatically when you assign them
Variables can be numeric or text
To create a value we can use the basic arithmetic operators
- + : addition, sum
- - : subtraction, difference
- * : multiplication, product
- / : division, quotient
- % : remainder, modulo
- ^ : exponentiation, power
a = 4*x^2 - 2*x + 1; b = n / 2; c = n % 2;
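The assignments above can be checked with concrete numbers. In this sketch the values x = 3 and n = 7 are arbitrary choices, read from the two fields of a one-line input.

```shell
# x and n come from the fields of a hypothetical one-line input.
out=$(echo '3 7' | awk '{
  x = $1; n = $2
  a = 4*x^2 - 2*x + 1   # 4*9 - 6 + 1
  b = n / 2             # division is not truncated
  c = n % 2             # remainder of the division by 2
  print a, b, c
}')

printf '%s\n' "$out"
```

With these inputs the program prints 31 3.5 1.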
It is very common that we do this
x = x + y
The value of x is incremented by y
There is a shorter and faster command for this
x += y
- x += increment: add increment to the value of x.
- x -= decrement: subtract decrement from the value of x.
- x *= coefficient: multiply the value of x by coefficient.
- x /= divisor: divide the value of x by divisor.
- x %= modulus: set x to its remainder by modulus.
- x ^= power: raise x to the power power.
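To follow how each of these operators changes a variable, we can apply them one after another and print the value after each step. The starting value 10 and the operands are arbitrary choices for this sketch.

```shell
# Start from a hypothetical value of 10 and apply each operator in turn.
out=$(echo '10' | awk '{
  x = $1
  x += 5; print x   # 10 + 5
  x -= 3; print x   # 15 - 3
  x *= 2; print x   # 12 * 2
  x /= 4; print x   # 24 / 4
  x %= 4; print x   # remainder of 6 by 4
  x ^= 3; print x   # 2 to the power 3
}')

printf '%s\n' "$out"
```

The printed values are 15, 12, 24, 6, 2 and 8, one per line.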
To assign a text constant, it must be inside ""
name = "Andres"; surname = "Aravena"
The only operation valid for text variables is concatenation
full_name = name " " surname
We just put the text variables or constant together. There is no symbol for concatenation
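As a small illustration of concatenation, the sketch below reads a name and a surname from the two fields of a one-line input and joins them in two different orders. The input line is only an example.

```shell
# Concatenation is written by placing values next to each other.
out=$(echo 'Andres Aravena' | awk '{
  name = $1; surname = $2
  full_name = name " " surname   # variable, constant, variable
  print full_name
  print surname ", " name        # concatenation also works inside print
}')

printf '%s\n' "$out"
```

The first line of output is Andres Aravena and the second is Aravena, Andres.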
Text and numbers can be mixed when it makes sense
For example, if a=1 and b="2", then
- a + b is equal to 3: b is used as a number, since + is a number operation
- a b is equal to "12": a is used as a text, since concatenation is a text operation
The result depends on the operations
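The two cases can be verified directly. In this sketch echo supplies a single empty record so that the action runs exactly once.

```shell
# One empty input line, so the action runs a single time.
out=$(echo | awk '{
  a = 1; b = "2"
  print a + b   # numeric context: b is converted to the number 2
  print a b     # text context: a is converted to the text "1"
}')

printf '%s\n' "$out"
```

The output is 3 on the first line and 12 on the second.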
If we use the value of a variable that has never been used before, the result is an empty text ""
Thus, we can do this
all = all $0
and the variable all will collect all the text of the file (joined without line breaks, since $0 does not include the end of line)
If the empty text ""
is used in a numeric context, then its value is 0
Thus, we can do this
n += NF
After processing the whole file, the variable n will contain the number of words in the file
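Both accumulators can be tried together on a small input. This sketch uses an invented four-line input, including one blank line, and prints the results in an END action, which runs after all records have been read.

```shell
# Neither n nor all is initialised: they start as 0 and "" respectively.
out=$(printf 'one two\nthree\n\nfour five six\n' | awk '
  { n += NF; all = all $0 }
  END { print n; print all }
')

printf '%s\n' "$out"
```

The word count is 6 (2 + 1 + 0 + 3), and the collected text is one twothreefour five six, with the lines glued together because nothing is inserted between them.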
Conditions can use:
- comparisons: ==, !=, <, >, <=, >=
- the special conditions BEGIN and END
- logical operators: &&, || and !
- a == b: a is equal to b. Comparison uses ==, assignment uses =
- a != b: a is not equal to b
- a < b: a is less than b
- a > b: a is greater than b
- a <= b: a is less than or equal to b
- a >= b: a is greater than or equal to b
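Comparisons are most often applied to fields. The sketch below uses a made-up table of names and scores. It also combines two comparisons with &&, which is true only when both sides are true.

```shell
# Hypothetical data: a name in field 1 and a number in field 2.
data=$(printf 'ana 12\nbob 7\ncarl 30\ndina 12\n')

# Numeric comparisons on the second field.
ge=$(printf '%s\n' "$data" | awk '$2 >= 12 { print $1 }')
eq=$(printf '%s\n' "$data" | awk '$2 == 12 { print $1 }')

# Text comparison on the first field.
ne=$(printf '%s\n' "$data" | awk '$1 != "bob" { print $1 }')

# Two comparisons joined with &&: strictly between 7 and 30.
between=$(printf '%s\n' "$data" | awk '$2 > 7 && $2 < 30 { print $1 }')

printf '%s\n' "$ge" "$eq" "$ne" "$between"
```

Here ge and ne both select ana, carl and dina, while eq and between select ana and dina.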
We write regular expressions surrounded by //
- /regex/ is true if any part of the record matches regex
- $2 ~ /regex/ is true if the second field matches regex
In general you can use ~ (tilde) to see if any variable matches a regular expression
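The difference between matching the whole record and matching one field shows up when the pattern appears in different places. The fruit list below is invented for this sketch.

```shell
# Hypothetical data: a fruit in field 1 and a colour in field 2.
data=$(printf 'apple red\nbanana yellow\ncherry red\nred-currant black\n')

# /red/ alone is tested against the whole record.
any=$(printf '%s\n' "$data" | awk '/red/')

# With ~ the pattern is tested against the second field only.
second=$(printf '%s\n' "$data" | awk '$2 ~ /red/')

# ^ anchors the pattern to the start of the tested text.
starts_b=$(printf '%s\n' "$data" | awk '$1 ~ /^b/ { print $1 }')

printf '%s\n' "$any" "$second" "$starts_b"
```

The record-wide test also selects red-currant black, while the field test keeps only the two lines whose colour is red.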
BEGIN and END
These special conditions are each true only once in every program
- BEGIN is true before reading any record
- END is true after reading all records
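A typical use is to print a header before the data and a summary after it. This is a minimal sketch with an invented three-line input.

```shell
# BEGIN runs once before the first record, END once after the last one.
out=$(printf 'a\nb\nc\n' | awk '
  BEGIN { print "start" }
  { print NR ": " $0 }
  END { print "records read:", NR }
')

printf '%s\n' "$out"
```

In the END action NR still holds the number of the last record read, so it gives the total number of records.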
Print every line that has at least one field:
awk 'NF > 0' data
This is an easy way to delete blank lines from a file
Print the total number of bytes used by files:
ls -l | awk '{ x += $5 } END { print "total bytes: " x }'
Print the total number of kilobytes used by files:
ls -l | awk '{ x += $5 } END { print "total Kilobytes:", x / 1024 }'
Print a sorted list of the countries:
awk -F: '{ print $1 }' gapminder-2007.txt | sort
(this is like cut)
Count the lines in a file:
awk 'END { print NR }' gapminder-2007.txt
(this is like wc -l)
Print the even-numbered lines in the data file:
awk 'NR % 2 == 0' data
If we use NR % 2 == 1 instead, the program prints the odd-numbered lines