awk

Great resources:

This page is still a work in progress.

awk is a powerful line-by-line text processor.

Several flavours exist:

AWK - original from AT&T
NAWK - A newer, improved version from AT&T
GAWK - GNU AWK (from the Free Software Foundation)

This article will cover gawk.

The documentation from GNU Awk is really good!

General

Basic syntax

pattern { action }
pattern { action }
pattern { action }
...

Each line of input is called a record. A regex pattern matches a record if any part of the line matches; the action then processes that record.

/Hello/
# is equivalent to:
/Hello/ {print}
# is equivalent to:
/Hello/ {print $0}

Default behavior:

Pattern without an action: if it matches, the entire line is printed
Action without a pattern: the action runs for every line
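
As a quick check of both defaults, the sketch below feeds two made-up input lines through awk, first with only a pattern and then with only an action:

```shell
# Pattern only: matching lines are printed unchanged
printf 'Hello world\nBye world\n' | awk '/Hello/'
# prints: Hello world

# Action only: the action runs for every line
printf 'Hello world\nBye world\n' | awk '{ print $1 }'
# prints: Hello, then Bye (one per line)
```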

Separation

By default, records are split on whitespace (runs of spaces and tabs); each resulting piece is called a field.

echo "hello world" | awk '{print $2}'
# prints: `world` ($0 = whole line, $1 = first column considering separation, ...)

Changing the separator:

echo "one|two|three" | awk -F'|' '{print $2}'
# or
echo "one|two|three" | awk 'BEGIN {FS="|"} {print $2}'

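Note: A separator longer than one character is treated as a regular expression (regex), so regex metacharacters in it must be escaped (via \; e.g. . -> \.) or wrapped in a bracket expression. A single character other than a space, such as | above, is used literally.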
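
A minimal sketch of regex separators, using made-up input strings. The second command uses the bracket expression [.] for a literal dot, which is an alternative to backslash escaping that sidesteps shell and awk quoting issues:

```shell
# Separator: a comma or semicolon followed by optional spaces
echo "one, two;three" | awk -F'[,;] *' '{print $2}'
# prints: two

# Separator: two literal dots
echo "1.5..2.5" | awk -F'[.][.]' '{print $2}'
# prints: 2.5
```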

FS variable works in scripts as well:

# test_sep1.awk
# BEGIN block (actions before processing)
BEGIN {
    FS = "|"
}

# Main block
{
    print $2
}

# Shown for illustration; can be omitted since empty:
END {
}

Execute (i.e. run the awk file):

echo "one|two|three" | awk -f test_sep1.awk
# or
awk -f test_sep1.awk input.txt

Variables

FS: Field separator (default: whitespace)
OFS: Output field separator (default: space)
RS: Record separator (default: newline)
ORS: Output record separator (default: newline)
NR: Number of records processed so far
NF: Number of fields in the current record
$0: The entire current record
$1, $2, …: The individual fields of the current record
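
The variables above can be tried out on inline input. This is a small sketch with made-up sample lines, not an exhaustive tour:

```shell
# OFS is placed between the comma-separated arguments of print
printf 'a b c\nd e\n' | awk 'BEGIN { OFS = "-" } { print NR, NF, $1 }'
# prints:
# 1-3-a
# 2-2-d

# ORS replaces the newline that print appends after each record
printf 'a\nb\n' | awk 'BEGIN { ORS = ";" } { print }'
# prints: a;b;
```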

TODO: add explanations for output parts

RegEx matching

match(string, regexp [, array])

-> array is filled with the matched groups (the array argument is a gawk extension):

array[0] is the whole match,
array[1] is the first group,
...

Example:

echo "one tw_#_o three" | awk 'match($2, /\w+(_#_)(\w+)/, ary) { print ary[1] }'
#              pattern --------^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^ action
# prints: `_#_`

# what would also work:
echo "one tw_#_o three" | awk 'match($2, /\w+(_#_)(\w+)/, ary) { some_var = ary[1]; print some_var }'
# or:
echo "one tw_#_o three" | awk 'match($2, /\w+(_#_)(\w+)/, ary) { $2 = ary[1]; print $2 }'
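
match also sets the built-in variables RSTART (start position of the match, 1-based) and RLENGTH (its length). The sketch below uses them with substr as a portable alternative for awk versions that lack the array argument. It extracts only the whole match, not individual groups:

```shell
# $2 is "tw_#_o"; the match "_#_" starts at character 3 and is 3 characters long
echo "one tw_#_o three" | awk 'match($2, /_#_/) { print RSTART, RLENGTH, substr($2, RSTART, RLENGTH) }'
# prints: 3 3 _#_
```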

Substitution/Replacement

gsub(regexp, replacement [, target])

-> target: the string to search and modify (default: $0)

echo "one tw_#_o three" | awk '{ gsub(/_#_/, "", $2); print $2 }'         # No pattern given, so the action runs on every line
# prints `two`

gsub always modifies the target in place (its return value is the number of replacements made); to keep the original value, copy it to a variable first:

echo "one tw_#_o three" | awk '{ new_var = $2; gsub(/_#_/, "", new_var); print new_var }'
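
Two related details, sketched on a made-up input string: the return value of gsub, and sub, which is the counterpart that replaces only the first match:

```shell
# gsub returns how many replacements it made (target defaults to $0)
echo "a-b-c" | awk '{ n = gsub(/-/, "+"); print n, $0 }'
# prints: 2 a+b+c

# sub replaces only the first match
echo "a-b-c" | awk '{ sub(/-/, "+"); print }'
# prints: a+b-c
```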

Interesting functions and applications

Skip an uninteresting line

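The first rule skips lines starting with // (next jumps to the next record); the second rule prints everything else:

awk '/^\/\// {next} // { print }' ./someFile.txt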
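
To try the skip rule without a file, the same idea can be run on inline input. The two sample lines are made up, and the empty // pattern is dropped here because an action without a pattern already runs on every remaining line:

```shell
# Lines starting with // are skipped; everything else is printed
printf '// a comment\ncode line\n' | awk '/^\/\// { next } { print }'
# prints: code line
```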

Print text aligned

Prints columns 1 and 4 aligned: column 1 is left-justified and padded to 40 characters (the syntax is similar to C's printf):

awk '{ printf("%-40s%s\n", $1, $4) }'
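
A smaller sketch of the same format string on made-up input, with the width reduced to 8 so the padding is easy to see:

```shell
# %-8s left-justifies column 1 and pads it to 8 characters
printf 'alpha x y 1\nbe x y 22\n' | awk '{ printf("%-8s%s\n", $1, $4) }'
# prints:
# alpha   1
# be      22
```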

TODO

$ grep -R "CI_TYPE =" ../modules/* | grep -v dummy | awk 'match($1, /modules\/(.*)\/main\.tf/, ary) { $1 = ary[1]; gsub(/"/,"", $4); printf("%-40s %s %.5f\n", $1, $4, 5); }'

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License *.

Code (snippets) are licensed under an MIT License *.

* Unless stated otherwise
