Doing math with awk


I use awk all the time, but generally only to pull a particular field out of data that I'm working with. Regardless of the separator used, awk makes it easy to extract just what you need. But there's a lot more to the language than this obvious feature. One of the services that awk can provide is the ability to do a range of mathematical calculations -- like cosines and square roots -- more easily than you might imagine.

First, try this. You can print the whole-number part of a number (that is, truncate it rather than round it) by doing something like this:

$ awk 'BEGIN{
> print int(12.789);
> }'
12

Obviously, you're not going to fall off your seat if you do this on the command line, but this same logic can come in very handy when you're writing a script.
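
Because int() truncates rather than rounds, negative numbers move toward zero, and rounding takes an extra step. A minimal sketch of both behaviors, using values chosen only for illustration:

```shell
awk 'BEGIN {
    print int(12.789)         # 12
    print int(-12.789)        # -12 (truncated toward zero, not floored to -13)
    print int(12.789 + 0.5)   # 13 (a common rounding idiom for positive numbers)
    printf "%.0f\n", 12.789   # 13 (printf rounds while formatting)
}'
```

The "+ 0.5" idiom only behaves as rounding for positive values; printf is the safer choice when the sign of the input is unknown.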

You can print the logarithm of a number just as easily.

$ awk 'BEGIN{
> print log(111)
> }'
4.70953
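
Note that log() returns the natural logarithm. For other bases, one common approach is to divide by the natural log of the base. A small sketch, with the sample values picked purely for illustration:

```shell
awk 'BEGIN {
    x = 1000
    printf "%.4f\n", log(x)             # natural log: 6.9078
    printf "%.4f\n", log(x) / log(10)   # base 10: 3.0000
    printf "%.4f\n", log(8) / log(2)    # base 2: 3.0000
}'
```

Using printf with a fixed precision hides the tiny floating-point error that the division can introduce.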

And how about printing the square root of a number?

$ awk 'BEGIN{
> print sqrt(25)
> }'
5
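
The sqrt() function combines with ordinary arithmetic, and awk's ^ operator can stand in for roots other than the square root. A brief sketch along those lines (the numbers are arbitrary examples):

```shell
awk 'BEGIN {
    printf "%.4f\n", sqrt(3 * 3 + 4 * 4)    # hypotenuse of a 3-4-5 triangle: 5.0000
    printf "%.4f\n", 27 ^ (1/3)             # cube root via exponentiation: 3.0000
}'
```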

You can also build awk scripts to perform the calculations on a file full of numbers as easily as doing a calculation on one.

For integers (saved in a file named getInt):

{
  print int($1);
}

That script will print the integer portion of the number at the beginning of every line in a file.

For square roots (saved in a file named getSqRt):

{
  print sqrt($1);
}

And, if the numbers aren't the first thing on each line, change $1 to $2 or $3, etc.

Running our scripts after looking at the numbers file:

$ cat numbers
12.345
54.321
4
25
$
$ awk -f getInt numbers
12
54
4
25
$ awk -f getSqRt numbers
3.51355
7.37028
2
5
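
The two calculations can also be combined in a single pass. The following is a sketch rather than one of the scripts above: it feeds the same four numbers through a pipe so that it runs without the numbers file, and uses printf to fix the number of decimal places.

```shell
printf '12.345\n54.321\n4\n25\n' |
    awk '{ printf "%s %d %.3f\n", $1, int($1), sqrt($1) }'
# 12.345 12 3.514
# 54.321 54 7.370
# 4 4 2.000
# 25 25 5.000
```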

And, if the separators aren't some form of white space, specify your field separator in a BEGIN statement as shown below:

$ cat getSqRt2
BEGIN {FS=":"}
{
  print sqrt($2);
}
$

In the lines below, we are running the script shown above on a file of numbers in which the number is the second field and the fields are separated with colons.

First, the numbers:

$ cat nums
top:12.345
bottom:54.321
right:4
left:25

Then the results:

$ awk -f getSqRt2 nums
3.51355
7.37028
2
5
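
For a one-off command, the field separator can also be set with awk's -F option instead of a BEGIN block. A minimal sketch, piping in the same colon-separated data so that it runs without the nums file, and printing the label next to each result:

```shell
printf 'top:12.345\nbottom:54.321\nright:4\nleft:25\n' |
    awk -F: '{ print $1, sqrt($2) }'
# top 3.51355
# bottom 7.37028
# right 2
# left 5
```
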

Here are awk's built-in math functions:

  • int(x) -- the integer part of x, truncated toward zero
  • sqrt(x) -- the positive square root of x
  • exp(x) -- the exponential of x (e ^ x) or an error if x is out of range
  • log(x) -- the natural logarithm of x (as long as x is positive)
  • sin(x) -- the sine of x, with x in radians
  • cos(x) -- the cosine of x, with x in radians
  • atan2(y, x) -- the arctangent of y / x, in radians
  • rand() -- a random number n, where 0 <= n < 1
  • srand(x) -- set the starting point for random numbers
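
The trigonometric functions work in radians, and awk has no built-in constant for pi, so a common trick is to derive it from atan2(). A short sketch exercising a few of the functions listed above (the variable name pi is just a convenience):

```shell
awk 'BEGIN {
    pi = atan2(0, -1)                    # awk has no built-in constant for pi
    printf "%.5f\n", pi                  # 3.14159
    printf "%.4f\n", cos(60 * pi / 180)  # cosine of 60 degrees: 0.5000
    printf "%.4f\n", sin(pi / 2)         # sine of 90 degrees: 1.0000
    printf "%.5f\n", exp(1)              # e: 2.71828
}'
```

Multiplying by pi / 180 converts degrees to radians before calling sin() or cos().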

Finally, here's an implementation of the "maybe" command that some Unix folks were laughing about some years ago. Built around an awk command, it randomly responds with a "yes" or a "no" -- kind of the equivalent of tossing a coin.

#!/bin/bash

ans=$(awk -v min=0 -v max=1 'BEGIN{srand(); print int(min+rand()*(max-min+1))}')

if [ "$ans" -eq 0 ]; then
    echo "no"
else
    echo "yes"
fi
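
The min/max formula in that script generalizes to any integer range. One caveat: srand() with no argument typically seeds from the time of day, so two runs within the same second can return the same answer. The sketch below is one way around that. The function name roll and the use of bash's $RANDOM as a seed are illustrative assumptions, not part of the original script.

```shell
#!/bin/bash
# roll MIN MAX: print a random integer between MIN and MAX inclusive.
# Seeding from $RANDOM (a bash feature) avoids repeated results within one second.
roll() {
    awk -v min="$1" -v max="$2" -v seed="$RANDOM" \
        'BEGIN { srand(seed); print int(min + rand() * (max - min + 1)) }'
}

roll 1 6     # a die roll
roll 0 1     # the coin toss behind "maybe"
```

Because rand() never returns exactly 1, adding 1 to the width of the range before truncating is what makes the upper bound reachable.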

The book "sed & awk" was one of the first O'Reilly books I ever read, and these tools played a key role in drawing me into the Unix world decades ago. But awk is more fully functional than I sometimes remember. I get too used to using it just to select columns from arbitrary data and sometimes forget how much processing it can do on its own.

This article is published as part of the IDG Contributor Network.