Profiling directory contents with Unix commands


One way to judge how useful a directory is, or to get a quick sense of what it holds, is to collect statistics on its files: how many there are, how old and how large they are, and what permissions they carry. In this post, we'll look at some commands for producing those stats.

Find the oldest file

The -printf option of the (GNU) find command lets you control exactly what find prints for each file it matches. In the command below, for example, we print each file's modification time in a year-month-day format ahead of its name, so that sorting the output orders the files by age.

By ordering files on the date/time information and then selecting only the first file listed, we display the oldest file in the directory and below.

$ sudo find -type f -printf '%T+ %p\n' | sort | head -n 1
2015-03-04+05:16:34.0000000000 ./.bash_logout

Without a starting point, find begins its search in the current directory. To search somewhere else, add that directory to the find command.

$ sudo find /var/log -type f -printf '%T+ %p\n' | sort | head -1
2013-03-14+09:39:31.0000000000 /var/log/mail/statistics

As you can see, we're printing each file's date/time stamp in a format that sorts correctly as plain text and then keeping the line with the earliest year-month-day+time value. The %T+ in the command above produces that date/time format from the file's last modification time, %p prints the file's name including its path, and \n adds a linefeed.
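To see why a plain sort is enough here, the following sketch builds a small scratch directory with known time stamps and picks out the oldest file. It assumes GNU find (for -printf) and GNU touch (for -d); the directory and file names are made up for illustration.

```shell
# Sketch only: assumes GNU find (-printf) and GNU touch (-d).
demo=$(mktemp -d)                                 # hypothetical scratch directory
touch -d '2013-03-14 09:39:31' "$demo/old.log"    # made-up file names and dates
touch -d '2014-06-01 12:00:00' "$demo/mid.log"
touch -d '2015-08-30 16:40:41' "$demo/new.log"

# %T+ prints YYYY-MM-DD+HH:MM:SS.fraction, so sorting the text sorts by date.
oldest=$(find "$demo" -type f -printf '%T+ %p\n' | sort | head -n 1)
echo "$oldest"

rm -rf "$demo"                                    # clean up the scratch directory
```

Because the most significant part of the date (the year) comes first and every field has a fixed width, alphabetical order and chronological order are the same.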

Find the newest file

Finding the newest file requires nearly the same command, except that we then display the last file in the list that find creates rather than the first one.

$ sudo find -type f -printf '%T+ %p\n' | sort | tail -1
2015-08-30+16:40:41.8810055290 ./.bash_history
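An equivalent approach, sketched below, reverses the sort and keeps the first line instead of the last. The example also strips the time stamp so that only the path remains. It assumes GNU find and GNU touch, and the file names are invented for the demonstration.

```shell
# Sketch only: assumes GNU find (-printf) and GNU touch (-d).
demo=$(mktemp -d)                                 # hypothetical scratch directory
touch -d '2014-06-01 12:00:00' "$demo/older"
touch -d '2015-08-30 16:40:41' "$demo/newer"

# sort -r puts the most recent time stamp on the first line.
newest=$(find "$demo" -type f -printf '%T+ %p\n' | sort -r | head -n 1)

# Drop everything up to the first space to keep only the path.
newest_path=${newest#* }
echo "$newest_path"

rm -rf "$demo"
```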

Find the largest file

$ find -type f -printf '%s\t%P\n' | sort -n | tail -1
6730    simple

And, of course, a very similar command lets you find the smallest file.

$ find -type f -printf '%s\t%P\n' | sort -n | head -1
0       junk
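In these commands, %s prints each file's size in bytes and %P prints its path with the starting directory removed, while sort -n sorts numerically rather than alphabetically. The sketch below repeats both commands on a scratch directory and, as an illustrative extra that is not part of the commands above, totals the sizes with awk. It assumes GNU find; the file names are hypothetical.

```shell
# Sketch only: assumes GNU find (-printf).
demo=$(mktemp -d)                     # hypothetical scratch directory
printf 'abc' > "$demo/small"          # 3 bytes
printf 'abcdefghij' > "$demo/big"     # 10 bytes
: > "$demo/empty"                     # 0 bytes

largest=$(find "$demo" -type f -printf '%s\t%P\n' | sort -n | tail -n 1)
smallest=$(find "$demo" -type f -printf '%s\t%P\n' | sort -n | head -n 1)

# Extra: add up every size; "sum + 0" prints 0 if no files were found.
total=$(find "$demo" -type f -printf '%s\n' | awk '{ sum += $1 } END { print sum + 0 }')

echo "largest:  $largest"
echo "smallest: $smallest"
echo "total:    $total bytes"

rm -rf "$demo"
```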

Counting files

Counting the files in a given directory can be accomplished by tacking a wc command on to the end of your find command.

$ sudo find /opt/tools -type f | wc -l
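One caveat: wc -l counts lines, not files, so a file name that contains a newline is counted more than once. A variation, sketched here under the assumption that GNU find is available, prints a single character per file and counts characters instead. The scratch directory and file names are made up.

```shell
# Sketch only: assumes GNU find (-printf).
demo=$(mktemp -d)                               # hypothetical scratch directory
touch "$demo/plain.txt"
touch "$demo/$(printf 'two\nlines').txt"        # one file with a newline in its name

by_lines=$(find "$demo" -type f | wc -l)              # counts output lines
by_dots=$(find "$demo" -type f -printf '.' | wc -c)   # counts one byte per file

echo "wc -l says $by_lines, one-dot-per-file says $by_dots"

rm -rf "$demo"
```

For directories with ordinary file names, both counts agree, and the simpler wc -l form is fine.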

Counting files by year (last updates)

For some folders, counting the number of files last updated in each year can tell you something important about that directory's use. In the command below, we use awk's field separator option (-F) with a hyphen as the separator, so that the first field of each output line from find is just the year.

$ sudo find /etc -type f -printf '%T+ %p\n' | awk -F- '{print $1}' | sort | uniq -c
      1 2004
      1 2007
      8 2010
      7 2011
     10 2012
     89 2013
    122 2014
    217 2015
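If you only need the year, GNU find can print it directly with %TY, which makes the awk step unnecessary. The following is a minimal sketch of that variation on a scratch directory with invented files and dates; it assumes GNU find and GNU touch.

```shell
# Sketch only: assumes GNU find (-printf) and GNU touch (-d).
demo=$(mktemp -d)                     # hypothetical scratch directory
touch -d '2013-05-01' "$demo/a"
touch -d '2015-01-15' "$demo/b"
touch -d '2015-07-04' "$demo/c"

# %TY prints just the four-digit year of the last modification time.
by_year=$(find "$demo" -type f -printf '%TY\n' | sort | uniq -c)
echo "$by_year"

rm -rf "$demo"
```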

Looking for files based on permissions

The find command's permissions specifier is a bit tricky to use. It provides options to look for exact matches, partial matches, and matches for particular classes of users (i.e., user, group, or other).

Finding exact matches

Use the -perm mode form of the command (for example, -perm 640) to find files whose permissions match the specified mode exactly.

$ find . -perm 640 -ls
399259    8 -rw-r-----   1 ec2-user ec2-user     4690 Sep 13 21:14 ./.bash_history
413209    4 -rw-r-----   1 ec2-user ec2-user      139 Aug 30 15:01 ./.bashrc
410793    4 -rw-r-----   1 ec2-user ec2-user      176 Mar  4  2015 ./.bash_profile
410794    4 -rw-r-----   1 ec2-user ec2-user       18 Mar  4  2015 ./.bash_logout

You can find files by exact permissions and also do the opposite: find files that don't match that specific pattern. So, if this command finds all files with all permissions bits set:

$ sudo find / -type f -perm 0777

This one finds files that don't have all permissions bits set.

$ sudo find / -type f ! -perm 777

Use the -perm /mode form of the command (for example, -perm /644) to find files that have ANY of the specified permissions bits set. In the example below, we match files with any execute bit set.

$ find . -type f -perm /111 -ls
412716    4 -rwx------   1 ec2-user ec2-user       44 Aug 22 19:13 ./maybe.awk
412714    4 -rwx------   1 ec2-user ec2-user      153 Aug 22 19:14 ./maybe
413205    4 -------rwx   1 ec2-user ec2-user       32 Aug 30 00:41 ./exp0
412718    4 -rwx------   1 ec2-user ec2-user       49 Aug 30 00:49 ./exp2
412721    4 -rwx------   1 ec2-user ec2-user      436 Aug 29 21:42 ./try1
412251    8 -rwxrwxr-x   1 ec2-user ec2-user     6730 Aug  9 18:43 ./simple
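The differences between these forms are easiest to see side by side. The sketch below sets known modes on four made-up files and counts what each form matches. It also includes the -perm -mode form, which matches only files that have ALL of the listed bits set. It assumes GNU find.

```shell
# Sketch only: assumes GNU find; file names and modes are invented.
demo=$(mktemp -d)                     # hypothetical scratch directory
touch "$demo/notes" "$demo/script" "$demo/tool" "$demo/readme"
chmod 640 "$demo/notes"
chmod 700 "$demo/script"
chmod 755 "$demo/tool"
chmod 644 "$demo/readme"

exact=$(find "$demo" -type f -perm 640 -printf '%P\n')               # mode is exactly 640
not_exact=$(find "$demo" -type f ! -perm 640 | wc -l)                # mode is anything else
any_exec=$(find "$demo" -type f -perm /111 -printf '%P\n' | sort)    # any execute bit set
all_exec=$(find "$demo" -type f -perm -111 -printf '%P\n')           # all three execute bits set

echo "exact 640:   $exact"
echo "not 640:     $not_exact files"
echo "any execute: $any_exec"
echo "all execute: $all_exec"

rm -rf "$demo"
```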

Finding files with world execute

In the command below, we're looking for files that have the execute bit set for the world. The "o" is the symbolic form for matching the world, the same notation often used in chmod commands, where u=user, g=group, and o=other.

$ find . -type f -perm /o=x -ls
412251    8 -rwxrwxr-x   1 ec2-user ec2-user     6730 Aug  9 18:43 ./simple
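The symbolic form is shorthand for an octal mask: o=x corresponds to the octal bit 001. As a small check, assuming GNU find and using two hypothetical files, both spellings should select the same file.

```shell
# Sketch only: assumes GNU find; file names are invented.
demo=$(mktemp -d)                     # hypothetical scratch directory
touch "$demo/private" "$demo/shared"
chmod 700 "$demo/private"             # no execute bit for the world
chmod 775 "$demo/shared"              # world execute bit set

symbolic=$(find "$demo" -type f -perm /o=x -printf '%P\n')
octal=$(find "$demo" -type f -perm /001 -printf '%P\n')

echo "symbolic: $symbolic"
echo "octal:    $octal"

rm -rf "$demo"
```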

Displaying owner and group

Going back to the -printf options, we can also count files by owner or group using the %u and %g parameters. In the examples below, we count the files in /var first by the user and then by the group associated with each file. It's not surprising that root owns most of the files in /var.

Counting by owner ...

$ sudo find /var -type f -printf '%u\n' | sort | uniq -c
      1 daemon
      1 ec2-user
     97 mysql
      1 ntp
    631 root
      1 shs
      1 smmsp

Counting by group ...

$ sudo find /var -type f -printf '%g\n' | sort | uniq -c
      1 daemon
      2 mail
     97 mysql
      1 ntp
    626 root
      2 smmsp
      4 utmp
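The two parameters can also be combined in one format string to count each owner and group pairing. This is a sketch of that variation rather than one of the commands above. It assumes GNU find and runs on a scratch directory, so every file belongs to whoever runs it.

```shell
# Sketch only: assumes GNU find; the scratch files all belong to the current user.
demo=$(mktemp -d)                     # hypothetical scratch directory
touch "$demo/a" "$demo/b" "$demo/c"

# %u:%g prints owner and group together, so uniq -c counts each pairing.
pairs=$(find "$demo" -type f -printf '%u:%g\n' | sort | uniq -c)
echo "$pairs"

rm -rf "$demo"
```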

Finding files with SGID (set group ID) bits set

$ sudo find / -perm 2644 -ls
  1646    0 -rw-r-Sr--   1 root     root        2614 Sep 14 01:35 /tmp/oops
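Note that -perm 2644 is an exact match, so it reports only files whose mode is exactly 2644. To find every file with the SGID bit set, whatever its other permissions, the -perm -2000 form can be used instead. The sketch below compares the two on made-up files. It assumes GNU find and a user who is allowed to set the SGID bit on their own files.

```shell
# Sketch only: assumes GNU find; file names and modes are invented.
demo=$(mktemp -d)                     # hypothetical scratch directory
touch "$demo/plain" "$demo/sgid_data" "$demo/sgid_tool"
chmod 644  "$demo/plain"              # no SGID bit
chmod 2644 "$demo/sgid_data"          # SGID plus rw-r--r--
chmod 2755 "$demo/sgid_tool"          # SGID plus rwxr-xr-x

exact=$(find "$demo" -type f -perm 2644 -printf '%P\n')              # mode exactly 2644
any_sgid=$(find "$demo" -type f -perm -2000 -printf '%P\n' | sort)   # SGID bit set, rest ignored

echo "exact 2644: $exact"
echo "any SGID:   $any_sgid"

rm -rf "$demo"
```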

Put together, these commands give you a lot of information about the files within a directory, such as who owns them, which are the oldest and newest, and which have permissions that you might want to know about.
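As one way of combining them, the commands can be wrapped in a small shell function. This is only a sketch: profile_dir is a hypothetical name, the choice of stats is arbitrary, and it assumes GNU find.

```shell
# Sketch only: profile_dir is a hypothetical helper; assumes GNU find (-printf).
profile_dir() {
    dir=${1:-.}                       # default to the current directory
    echo "files:   $(find "$dir" -type f | wc -l)"
    echo "oldest:  $(find "$dir" -type f -printf '%T+ %p\n' | sort | head -n 1)"
    echo "newest:  $(find "$dir" -type f -printf '%T+ %p\n' | sort | tail -n 1)"
    echo "largest: $(find "$dir" -type f -printf '%s\t%p\n' | sort -n | tail -n 1)"
    echo "owners:"
    find "$dir" -type f -printf '%u\n' | sort | uniq -c
}
```

After defining the function in your shell, you could profile a directory like this (with sudo-level access if the directory requires it):

$ profile_dir /var/log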
