Getting system insights with dstat


Though you may not have heard of it, the Linux dstat command is an especially useful tool. It combines the insights of commands like vmstat, iostat and ifstat, which makes it one of the best and most versatile tools for examining system resources.

Chances are, you will have to install it on your server. Of course, a simple yum install dstat or similar command should have you ready in minutes.

The dstat command is actually a Python script. So, yes, you also need to have Python installed on your system to use it, and there are certain release constraints. Be aware that dstat requires Python 2.2 or later and a kernel version of 2.6.20 or better.

$ file /usr/bin/dstat
/usr/bin/dstat: Python script, ASCII text executable

You can check your version of Python quite easily using the -V option as shown below.

$ python -V
Python 2.7.9

And, of course, information on your operating system version is available with the uname command.

$ uname -r
3.14.35-28.38.amzn1.x86_64
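
If you would rather have the shell do the comparison, the kernel check can be sketched as follows. The version_ge helper is made up for illustration and assumes GNU sort with its -V (version sort) option; the 2.6.20 threshold is the requirement mentioned above.

```shell
# version_ge A B: succeeds when dotted version A is >= B.
# Hypothetical helper; assumes GNU sort, which supports -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Keep only the leading numeric part of the release string, e.g. 3.14.35.
kernel=$(uname -r | sed 's/[^0-9.].*//')

if version_ge "$kernel" 2.6.20; then
    echo "kernel $kernel is new enough for dstat"
else
    echo "kernel $kernel is older than 2.6.20"
fi
```

A version sort matters here because a plain string comparison would rank 2.6.9 above 2.6.20.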

If you run the dstat command without any arguments, the default output will look like what you see below. You might recognize many of the stats from other commands you use routinely. I suspect, though, that you might not be running it on a system with nearly as little load as the one on which I ran this command.

$ dstat
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
  0   0 100   0   0   0|  21B 1150B|   0     0 |   0     0 |   9    31
  0   0 100   0   0   0|   0     0 |  80B  916B|   0     0 |  11    13
  0   0 100   0   0   0|   0     0 |  40B  452B|   0     0 |   9    15
  0   0 100   0   0   0|   0     0 |  40B  452B|   0     0 |  13    19
  0   0 100   0   0   0|   0     0 |  40B  452B|   0     0 |   8    13 ^C

As you can see, the default dstat output shows a number of very useful stats covering CPU usage, disk activity, network traffic, memory paging, and system activity (interrupts and context switches). I like the way it uses two levels of titles to label the columns of data.

Instead of an ongoing display (notice that I exited with a control-C in the example above), you can view a limited set of measurements by providing two arguments, a delay in seconds and a count, as in the example below. When you do, an interesting thing happens. Each line will be written and then overwritten until the number of seconds you have specified has been reached. Below, we are asking for two 10-second averages. The first line shows the current numbers. The next two lines are the 10-second averages.

$ dstat 10 2
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
  0   0 100   0   0   0|  21B 1150B|   0     0 |   0     0 |   9    31
  0   0 100   0   0   0|   0   819B|  53B  419B|   0     0 |  11    16
  0   0 100   0   0   0|   0  3277B|  48B  362B|   0     0 |  11    16

This display corresponds to the -cdngy options. That is, it's the same output that you would get if you typed dstat -cdngy. What are all these letters? They are some of the many options the command provides.

The variety of data available for viewing with dstat is fairly extensive. To generate an easy-to-use listing of the available options, just type dstat -h (h for "help"). The listing starts something like this (truncated here):

$ dstat -h
Usage: dstat [-afv] [options..] [delay [count]]
Versatile tool for generating system resource statistics

Dstat options:
  -c, --cpu              enable cpu stats
     -C 0,3,total           include cpu0, cpu3 and total
  -d, --disk             enable disk stats
     -D total,hda           include hda and total
  -g, --page             enable page stats
  -i, --int              enable interrupt stats
     -I 5,eth2              include int5 and interrupt used by eth2
  -l, --load             enable load stats
  -m, --mem              enable memory stats
  -n, --net              enable network stats
     -N eth1,total          include eth1 and total
  -p, --proc             enable process stats
  -r, --io               enable io stats (I/O requests completed)
  -s, --swap             enable swap stats
     -S swap1,total         include swap1 and total
  -t, --time             enable time/date output
  -T, --epoch            enable time counter (seconds since epoch)
  -y, --sys              enable system stats

As you can see from this list, "cdngy" represents CPU, disk, network, paging, and system stats. You can also add time and date fields with the t option. Viewing CPU and disk stats along with date/time values, for example, requires using the t, c, and d options. And you can display these fields in any order you like.

$ dstat -tcd
----system---- ----total-cpu-usage---- -dsk/total-
  date/time   |usr sys idl wai hiq siq| read  writ
16-02 20:14:09|  0   0 100   0   0   0|  21B 1152B
16-02 20:14:10|  0   0 100   0   0   0|   0     0
16-02 20:14:11|  0   0 100   0   0   0|   0     0
16-02 20:14:12|  0   0  99   1   0   0|   0    24k
16-02 20:14:13|  0   0 100   0   0   0|   0     0
16-02 20:14:14|  0   0 100   0   0   0|   0     0 ^C 

So, here we have time, CPU, and disk figures. The date/time figure is a little odd as it's in day-month format. The number 16-02 means February 16th. And the time reflects the time zone that is set on your system.
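
If the day-month order trips you up, the stamp is easy to flip with shell parameter expansion. The helper below is a sketch with a made-up name; it only rearranges the two fields and does no validation.

```shell
# Hypothetical helper: convert dstat's day-month stamp (e.g. 16-02) to month-day.
day_month_to_month_day() {
    local day=${1%%-*}     # text before the hyphen
    local month=${1##*-}   # text after the hyphen
    printf '%s-%s\n' "$month" "$day"
}

day_month_to_month_day 16-02    # prints 02-16
```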

There's quite a lot of detail in this fairly compact display. What are we looking at here?

The usr, sys, and idl columns show the percentage of CPU time spent in each of these states: running user processes, running system processes, and twiddling its thumbs (basically sitting idle and waiting for something to do).

The wai column shows time spent waiting. This is different from idle time in that the CPU is specifically waiting for I/O to complete. It might be waiting for data being fetched from the disk.

The hiq and siq figures represent time spent servicing two different types of interrupts: hardware interrupts (hiq) and software interrupts (siq). Think of interrupts as the kind of thing that happens when the CPU is being nudged because the person sitting at the keyboard has just pressed a key.

You can also elect to write your data in CSV format, which is extremely handy if you want to import it into a spreadsheet. To do this, just use the --output option and follow it with a file name. You'll still get to see the output displayed in the normal format while the same data in CSV format is accumulated in your specified data file.

$ dstat --output stats.`date +%m-%d-%y`
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
  0   0 100   0   0   0|  21B 1140B|   0     0 |   0     0 |   9    31
  0   0 100   0   0   0|   0     0 |  80B  916B|   0     0 |  13    19
  0   1  99   0   0   0|   0     0 |  40B  452B|   0     0 |  10    11
  0   0 100   0   0   0|   0     0 |  40B  452B|   0     0 |  10    11
  0   0 100   0   0   0|   0    64k|  40B  452B|   0     0 |  18    26
  0   1  99   0   0   0|   0     0 | 116B  506B|   0     0 |  13    14
  0   0 100   0   0   0|   0     0 |  40B  452B|   0     0 |  10    15 ^C

Now, let's look at the file we just generated.

$ ls stats*
stats.02-15-16
$ cat stats.02-15-16
"Dstat 0.7.0 CSV output"
"Author:","Dag Wieers <dag@wieers.com>",,,,"URL:","http://dag.wieers.com/home-made/dstat/"
"Host:","ip-172-30-0-28",,,,"User:","ec2-user"
"Cmdline:","dstat --output stats.02-15-16",,,,"Date:","15 Feb 2016 22:46:24 UTC"

"total cpu usage",,,,,,"dsk/total",,"net/total",,"paging",,"system",
"usr","sys","idl","wai","hiq","siq","read","writ","recv","send","in","out","int","csw"
0.012,0.007,99.977,0.003,0.000,0.000,20.941,1140.225,0.0,0.0,0.0,0.0,8.951,30.699
0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,80.0,916.0,0.0,0.0,13.0,19.0
0.0,0.990,99.010,0.0,0.0,0.0,0.0,0.0,40.0,452.0,0.0,0.0,10.0,11.0
0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,40.0,452.0,0.0,0.0,10.0,11.0
0.0,0.0,100.0,0.0,0.0,0.0,0.0,65536.0,40.0,452.0,0.0,0.0,18.0,26.0
0.0,0.990,99.010,0.0,0.0,0.0,0.0,0.0,116.0,506.0,0.0,0.0,13.0,14.0
0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,40.0,452.0,0.0,0.0,10.0,15.0
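
You don't need a spreadsheet for quick questions about this file. As a rough sketch, the awk-based function below averages one named column. It assumes the layout shown above (a few metadata lines, a row of quoted column names, then numeric rows), and dstat_csv_avg is a made-up name. Where a name such as read appears under more than one group, it uses the first match.

```shell
# dstat_csv_avg FILE COLUMN
# Hypothetical helper: average one column of a file written by dstat --output.
dstat_csv_avg() {
    awk -F, -v want="\"$2\"" '
        # Until the row of quoted column names is found, look for the wanted name.
        col == 0 {
            for (i = 1; i <= NF; i++) if (!col && $i == want) col = i
            next
        }
        # Every later non-empty row is a data sample.
        NF > 0 { sum += $col; n++ }
        # Fail if the column was never found or there were no samples.
        END { if (n) printf "%.3f\n", sum / n; else exit 1 }
    ' "$1"
}
```

For the file above, you would call it as dstat_csv_avg stats.02-15-16 idl to get the average idle percentage across the samples.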

The dstat command makes it easy to select just about any data you want to see. The command below looks at IO requests and swap stats. The options aren't always obvious (r for "io requests"), but you will probably get used to the options that seem the most useful to you (or assign them to aliases with commands like alias IOswap="dstat -rs 5").
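
An alias always expands to the same text, so the 5 is baked in. If you want the interval to be adjustable, a shell function in your ~/.bashrc is a close cousin. The sketch below uses a made-up name, ioswap, and defaults to 5 seconds when no argument is given.

```shell
# Hypothetical helper for ~/.bashrc: ioswap [delay [count]]
ioswap() {
    # Default to 5-second averages; pass a count through only if one was given.
    dstat -rs "${1:-5}" ${2:+"$2"}
}
```

Typing ioswap alone runs dstat -rs 5, while ioswap 10 3 runs dstat -rs 10 3.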

$ dstat -rs 5
--io/total- ----swap---
 read  writ| used  free
0.00  0.12 |   0     0
   0     0 |   0     0
   0  0.80 |   0     0
   0     0 |   0     0

The 5 in the command above is specifying that we want to see 5-second averages. As in a previous example, a second number would represent the number of iterations that you want to see.

In the command below, we look at disk and network activity, interrupts, context switches, and CPU usage. Each line of output represents 5 seconds; while it is accumulating, dstat redraws the line with 4 intermediate displays before settling on the final figures.

$ dstat -dnyc 5
--dsk/xvda- --net/eth0- ---system-- ----total-cpu-usage----
 read  writ| recv  send| int   csw |usr sys idl wai hiq siq
  10B  575B|   0     0 |   9    31 |  0   0 100   0   0   0
   0  6554B|  48B  433B|  10    14 |  0   0 100   0   0   0
   0     0 |  50B  336B|  10    15 |  0   0 100   0   0   0
   0  6554B|  40B  308B|  11    15 |  0   0 100   0   0   0
   0     0 |  55B  336B|   9    11 |  0   0 100   0   0   0
   0   819B|  40B  308B|  13    20 |  0   0 100   0   0   0
   0  3277B|  40B  335B|   9    13 |  0   0 100   0   0   0
   0  4915B|  64B  358B|  14    21 |  0   0 100   0   0   0
   0     0 |  55B  336B|  10    12 |  0   0 100   0   0   0
   0     0 |  40B  322B|  11    18 |  0   0 100   0   0   0

If you really get into dstat, it's probably useful to know that the basic command can be augmented by adding plugins. In fact, you can even write your own plugins if you're so inclined. An example plugin (helloworld) is being run in the example below. This one was part of the basic dstat installation.

$ dstat --helloworld
plugin-title
  counter
Hello world!
Hello world!
Hello world!^C

OK, so helloworld might not add tremendous insight to your troubleshooting and performance measurements, but it might be a starting point if you need even more from dstat than you get out of a basic install.

This article is published as part of the IDG Contributor Network.