Grappling with disk space

When you find yourself suddenly faced with a disk or file system that is 100% full or close to it, the first thing you generally want to do is alleviate the "out of space" condition. However, finding the source of the disk space crunch may take some time. Let's look at some handy commands for figuring out what you can do quickly to keep the system operational.

Strategies that might prove useful for finding the files behind a disk space crunch include:

  • finding recently added and recently grown files
  • finding especially big files and directories
  • looking for large log files, especially older logs that may have been rolled over (e.g., messages.3) and can be easily deleted
  • compressing large files that are not currently in use (tar files, etc.)

Both newly added files and those recently modified will have a recent mtime (modification time) stamp. If you track down this type of file, you might be able to figure out why disk space has been used up recently.

# cd /filesystem
# find . -mtime -1 -ls

This command will tell you what files have been added or changed in the previous 24 hours. With the -ls option instead of the more common -print, it gives you a long listing so that you can see the file size and owner. This command might point you at some newly created directories that look benign enough, but a quick "du -sk" of their contents might show that they are a significant contributor to your disk space woes.
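The pattern above can be sketched end to end on a throwaway directory. The paths here are illustrative, and the dd command simply stands in for whatever recently wrote a pile of data:

```shell
# Build a scratch tree with a freshly written 2 MB file in it,
# mimicking a directory that quietly grew overnight.
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p planning/PSQL_Data
dd if=/dev/zero of=planning/PSQL_Data/dump.dat bs=1024 count=2048 2>/dev/null

# Anything modified in the last 24 hours shows up here,
# with size and owner thanks to -ls.
find . -mtime -1 -ls

# Follow up on a suspicious directory with a quick total.
du -sk planning/PSQL_Data
```

On a real filesystem you would cd to the mount point instead of a scratch directory; the find and du invocations are the same.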

20416175  1 drwxr-xr-x   7 jdoe   users    512 Sep 24 09:36 planning/PSQL_Data
22710997  1 drwxr-xr-x   4 jdoe   users    512 Sep 24 09:36 planning/PSQL_Data/fy2009

If you check for new files periodically, you might like the -newer argument, which finds all files that are newer than a reference file. For example, the command below will find all files that have been created or modified since the timestamp file was last changed.

# find . -newer /opt/appl/timestamp
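A minimal sketch of that workflow: keep a marker file, look for anything newer, then reset the marker for the next pass. The /opt/appl/timestamp path in the article is just a placeholder; any writable path works:

```shell
# Scratch directory so the example is self-contained.
workdir=$(mktemp -d)
cd "$workdir"

touch timestamp                    # establish the reference point
sleep 1                            # ensure the next file gets a later mtime
echo "new report" > report.txt

# Capture everything newer than the marker. Note the output also
# includes "." itself, since creating the file updated the
# directory's mtime.
changed=$(find . -newer timestamp)
echo "$changed"

touch timestamp                    # reset the marker for the next check
```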

You can also look for especially large files by providing a file size as an argument.

# find . -size +1000000 -ls
76435     0 -rw-r--r--   1 datahog  users 19493236306 Sep 24 13:42 biglog

This prints files larger than 1,000,000 512-byte blocks (roughly 500 MB). The same command with -size +1000000c would look for files of 1,000,000 or more characters (bytes), i.e., much smaller files.
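The unit difference is easy to demonstrate on a 1 MB test file, since 1 MB clears the byte threshold but falls far short of the block threshold:

```shell
# Scratch file of exactly 1,048,576 bytes (2,048 512-byte blocks).
workdir=$(mktemp -d)
cd "$workdir"
dd if=/dev/zero of=medium.dat bs=1024 count=1024 2>/dev/null

find . -size +1000000c    # "c" = bytes: matches medium.dat
find . -size +1000000     # bare number = 512-byte blocks: no match
```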

You can look for directories and get a rough idea of how many files they contain by using the -type d argument. Directories that contain relatively few files will have used only one 512-byte block to store their contents, like these:

# find . -type d -ls | more
746555    1 drwxr-xr-x   2 jdoe   users    512 Jun 25 14:01 ./newdata
7101801   1 drwxr-xr-x   2 jdoe   users    512 Sep  5 14:11 ./planning
8611619   1 drwxr-xr-x   2 jdoe   users    512 Jul  9 14:42 ./config

Directories with considerably more files will have allocated additional blocks, like the directories in the listing below:

  6356    2 drwxr-xr-x   2 jdoe   users   1536 May 28  2003 ./data1
11672239  2 drwxr-xr-x  23 jdoe   users   2048 Jun 11 15:17 ./data2
16223365 37 drwxr-xr-x   5 jdoe   users  37888 Jul 14 14:55 ./data3
16233599 27 drwxr-xr-x   5 jdoe   users  27648 Aug 14 14:55 ./data4
16370153 29 drwxr-xr-x   5 jdoe   users  29696 Sep 14 14:55 ./data5

The largest of the directories shown above (./data3), for example, is using seventy-four 512-byte blocks. This is an unusually large directory file.

The directory size gives you an idea of the number of files a directory contains, but not how much space its contents consume. You could use a command such as the one below to determine how much disk space each directory holds. You will, however, see a line for every directory and subdirectory, so you can end up with a lot of data to look through.

# find . -type d | xargs du -sk 
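One common way to tame that flood of output, sketched here on a scratch tree, is to let du walk the tree itself and sort the per-directory totals numerically so the biggest offenders float to the top:

```shell
# Scratch tree: one small directory, one holding a 4 MB file.
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p small big
dd if=/dev/zero of=big/blob.dat bs=1024 count=4096 2>/dev/null
echo "tiny" > small/note.txt

# Per-directory totals in KB, largest first; head trims deep trees.
du -k . | sort -rn | head
```

The first line is always the grand total for ".", so the real culprits start on the second line.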

If your users extract files from tar files and never bother repackaging the files into their original compressed format, you might be able to save a lot of disk space simply by running a command like the one below. The -name argument in this case is used to make sure we don't compress executables or data files that might be required in their current uncompressed form. It's generally "safe" to compress tar files, since the likelihood of repercussions is very slim.

# find . -name "*.tar" -size +100 | xargs gzip
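Before letting xargs gzip loose on a real filesystem, it's prudent to preview the candidate list first; the same find expression with the default -print action shows exactly what would be compressed. A self-contained sketch:

```shell
# Two throwaway tar files: one over the 100-block (-size +100)
# threshold, one under it.
workdir=$(mktemp -d)
cd "$workdir"
dd if=/dev/zero of=archive.tar bs=512 count=200 2>/dev/null   # 200 blocks
dd if=/dev/zero of=small.tar   bs=512 count=10  2>/dev/null   # 10 blocks

find . -name "*.tar" -size +100 -print        # preview: only archive.tar
find . -name "*.tar" -size +100 | xargs gzip  # compress the big one
```

Afterward archive.tar becomes archive.tar.gz, while small.tar is left alone.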
