Using AWK in Bash Scripts

In the past Python has been my go to language for quick scripts, however lately I’ve done a lot of projects where I’ve needed to use small Bash scripts.

I found that by adding a little bit of AWK to my Bash scripts I’ve been able to do something in one line of Bash/AWK that would of taken me multiple lines of Python.

AWK is named after it’s authors: Alfred Aho, Peter Weinberger, and Brian Kernighan, and it is an old school (1994) text extraction and reporting tool.

The nice thing about AWK is that you only need to learn a couple of commands to make it usefully.

A Bash/AWK Example

I was setting up a home automation system using a Raspberry Pi with Home Assistant. I was loading a lot of add-on components and I was worried that I might be overloading the Pi so I wanted to monitor the CPU idle time.

Home Assistant supports a command line sensor, so I just needed to get the idle time out of the iostat command:

The idle time value was the 6th item on the 4th line. The AWK code to get only the idle time is:

~$ iostat | awk '{if (NR==4) print $6}'
96.92

AWK logic need to be in single quotes and curly brackets groups together statements. This logic says: if the Number of Record (NR) variable is 4 print the 6th item.

Once I was able to get the idle time I was able to add a sensor in the HA (/config/configuration.yaml) that could be monitored and alarmed:

 sensor:

 platform: command_line 
 name: Idle Time
 command: "iostat | awk 'NR==4' | awk '{print $6}'"
 unit_of_measurement: "%" 

Some Useful AWK Statements

There is an enhanced version of AWK, GAWK (GNU AWK) that might already be loaded on your Linux system. If you are on a Raspberry Pi you can install GAWK by:

sudo apt-get install gawk

There are some excellent tutorials on AWK below are some of commands that I’ve found useful:

substr(string,position,length) – get part of a string:

An example of substr could be used to get the CPU temperature from the sensors utility:

~$ sensors | grep CPU | awk '{print substr($2,2,4)}'
 44.0

The substr() command looks at the 2nd item (+44.0°C), and starts at the 2nd character and it gets 4 characters.

print() with if() – print based on conditions:

The AWK print statement can be used with an if statement to show a filtered list.

An example of this would be to filter the ps (snapshot of the current processes) command, and print only lines with a time showing:

~$ # SHOW ALL PROCESSES
~$ ps -e 
   PID TTY          TIME CMD
     1 ?        00:00:03 systemd
     2 ?        00:00:00 kthreadd
     4 ?        00:00:00 kworker/0:0H
     6 ?        00:00:00 mm_percpu_wq
     7 ?        00:00:00 ksoftirqd/0
     8 ?        00:01:10 rcu_sched
...
~$  # SHOW ONLY PROCESSES WITH TIME
~$ ps -e | awk '{if ($3 != "00:00:00") print $0}'
   PID TTY          TIME CMD
     1 ?        00:00:03 systemd
     8 ?        00:01:10 rcu_sched
    10 ?        00:00:06 migration/0
    15 ?        00:00:03 migration/1
...

systime() / strftime() – get time and format time:

These time functions allow you to add time stamps and then do formatting on the date/time string. I found this useful in logging and charting projects. An example to add a time stamp to the sensor’s CPU temperature would be:

$ sensors | grep CPU | awk '{print strftime("%H:%M:%S ",systime()) $1 $2 }'
 11:06:18 CPU:+45.0°C

Final Comments

I’ve found that learning a little bit of AWK has really paid off.

AWK supports a lot of functionality and it can be used to create full on scripting applications with user inputs, file I/O, math functions and shell commands, but despite all this I’ll stick to Python if things get complex.