EZ file extraction on Unix

The only thing that's really hard about extracting from archives on Unix systems is remembering all of the commands and the required options.

When you have ten or more possible archive types and could encounter any of them, you just might run out of space in your frontal cortex and need to resort to searching online for some examples of the correct syntax to use. One way to get past this problem is to stuff all the commands you might ever need into a script and make sure the script is clever enough to use the right command for any archive that you are likely to ask it to handle.

In the script presented below, we're going to do three things:

  1. Make sure that the person running the script supplies the archive file's name;
  2. Make sure the specified archive actually exists; and
  3. Select the appropriate extraction command and options for the archive's type, and run that command to extract the archive's contents.

Make sure a file name is provided

If the person running the script doesn't provide a file name as an argument, we prompt for one.

if [ $# -ne 1 ]; then
  echo -n "file> "
  read -r file
else
  file=$1
fi
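
If you'd rather keep this step in a function, here is a minimal sketch of the same idea. The name get_archive_name is my own invention for illustration and is not part of the script; the sketch also sends the prompt to standard error so that the function's output is only the file name, which is a choice of mine rather than something the script does.

```shell
#!/bin/bash
# get_archive_name is a hypothetical helper, not part of the original script.
# It prints its first argument if one was given; otherwise it prompts on
# standard error and reads a name from standard input.
get_archive_name() {
  if [ "$#" -ge 1 ] && [ -n "$1" ]; then
    printf '%s\n' "$1"
  else
    printf 'file> ' >&2
    IFS= read -r name || return 1
    printf '%s\n' "$name"
  fi
}

# The quotes keep a name containing spaces in one piece.
get_archive_name "my backup.tar.gz"
```

In the script itself, you would capture the result with something like file=$(get_archive_name "$@").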

Check that the specified file exists

If the file doesn't exist (or isn't a file), issue an error and exit with a non-zero return code.

if [ ! -f "$file" ]; then
  echo "No such file: $file"
  exit 1
fi
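
If you want the error to say a little more about what went wrong, the check can be split up. This is only a sketch; the function name check_archive, the extra messages and the use of standard error are my own assumptions, not something the script requires.

```shell
#!/bin/bash
# check_archive is a hypothetical helper, not part of the original script.
# It returns 0 if its argument is a readable regular file, and otherwise
# prints a message on standard error and returns 1.
check_archive() {
  if [ ! -e "$1" ]; then
    echo "No such file: $1" >&2
    return 1
  elif [ ! -f "$1" ]; then
    echo "Not a regular file: $1" >&2
    return 1
  elif [ ! -r "$1" ]; then
    echo "Cannot read: $1" >&2
    return 1
  fi
}

# Example call; the name is quoted so that embedded spaces are harmless.
check_archive "no such archive.tar" || echo "check failed as expected"
```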

Determine file type by evaluating file extensions

Use a case statement to match the file name against a list of patterns. For example, *.tar will match any file name ending in .tar (a regular tar file) and *.bz2 will match any name ending in .bz2 (a bzip2 file).

case $file in
  *.bz2)     bunzip2 "$file";;
  *.gz)      gunzip "$file";;
  …
esac
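
One thing to keep in mind is that a case statement runs only the first branch whose pattern matches, so a name like backup.tar.gz matches *.gz as well as *.tar.gz, and whichever pattern is listed first wins. The sketch below illustrates this; the function name archive_type and the labels it prints are made up for the example.

```shell
#!/bin/bash
# archive_type is a hypothetical helper used only to show pattern order.
# The compound extensions come first so that they are tried before the
# simple ones that would also match.
archive_type() {
  case "$1" in
    *.tar.gz|*.tgz)   echo "gzipped tar" ;;
    *.tar.bz2|*.tbz2) echo "bzip2 tar" ;;
    *.gz)             echo "gzip" ;;
    *.bz2)            echo "bzip2" ;;
    *.tar)            echo "tar" ;;
    *)                echo "unknown"; return 1 ;;
  esac
}

archive_type backup.tar.gz   # prints: gzipped tar
archive_type notes.txt.gz    # prints: gzip
```

If the *.gz branch were listed ahead of *.tar.gz, the first call would print "gzip" instead.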

The script

Here's the script in one piece for easy copying.

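The case statement runs only the first branch whose pattern matches, so the compound extensions (such as *.tar.gz) are listed ahead of the simple ones (such as *.gz). Otherwise, a gzipped tar file would merely be gunzipped and never unpacked. Within each of those two groups, the possible file extensions are roughly in alphabetical order. The appropriate extract command is then run depending on the file extension of the archive file you use it for.

#!/bin/bash

# Prompt for the archive name unless exactly one argument was given
if [ $# -ne 1 ]; then
  echo -n "file> "
  read -r file
else
  file=$1
fi

# Give up if the archive doesn't exist or isn't a regular file
if [ ! -f "$file" ]; then
  echo "No such file: $file"
  exit 1
fi

# Compound extensions come first; the first matching pattern wins
case $file in
  *.tar.7z)  7z e "$file";;
  *.tar.bz2) tar xjf "$file";;
  *.tar.gz)  tar xzf "$file";;
  *.bz2)     bunzip2 "$file";;
  *.gz)      gunzip "$file";;
  *.rar)     rar x "$file";;
  *.tar)     tar xf "$file";;
  *.tbz2)    tar xjf "$file";;
  *.tgz)     tar xzf "$file";;
  *.zip)     unzip "$file";;
  *.Z)       uncompress "$file";;
  *)         echo "$file is of unknown file type"; exit 1;;
esac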

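You might consider adding a confirmation of a successful extraction at the end -- something like this, placed immediately after the case statement so that $? still holds the exit status of the extract command:

if [ $? -eq 0 ]; then
  echo "$file extraction complete"
fi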

Another option is to add the verbose options to the commands where they exist (e.g., tar xvf). I prefer not to do this as the file lists are generally too much to look at.

If the extraction fails (e.g., if the archive file is corrupt or misnamed), the extraction command selected will report the problem itself. If the command isn't installed at all, the shell will generate an error like “./extractFile: line 22: rar: command not found”. Either way, you don't need to add a warning to indicate when an extraction fails.

Trim this script down if you don't have all the extraction commands shown. Otherwise, you'll get errors when you try to extract from an archive whose extraction command isn't installed. For example, you might not have 7z on your system. If that's the case, remove that line from the script or comment it out.
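
An alternative to trimming is to have the script check for each command before running it. The sketch below uses the shell's standard command -v for this; the function names have_cmd and run_extractor are hypothetical, and the return code of 127 simply mirrors what the shell itself returns when a command is not found.

```shell
#!/bin/bash
# have_cmd and run_extractor are hypothetical helpers, not part of the
# original script.

# Succeeds if the named command can be found by the shell.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Runs the given command line only if its first word is available.
run_extractor() {
  if ! have_cmd "$1"; then
    echo "$1 is not installed; cannot extract" >&2
    return 127
  fi
  "$@"
}

# A branch in the case statement might then look like this:
#   *.rar)  run_extractor rar x "$file";;
if have_cmd tar; then
  echo "tar is available"
fi
```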

You can also add lines to the script if any archive file types that you use have been omitted. And it's always a good idea to verify that a script works properly before depending on it when you're in the middle of some important work with your customers breathing down your neck.
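
One way to do that verification is a quick smoke test: build a small archive in a scratch directory, extract it, and compare the result with the original. This is a rough sketch only. The function name smoke_test is made up, it covers just the tar.gz case, and it assumes tar, mktemp and cmp are installed. The command to test is passed in as arguments, with plain tar xzf standing in for the script here.

```shell
#!/bin/bash
# smoke_test is a hypothetical helper, not part of the original script.
# Usage: smoke_test COMMAND [ARGS...]
# It creates sample.tar.gz in a scratch directory, runs COMMAND on it from
# that directory, and succeeds only if sample.txt comes back unchanged.
smoke_test() {
  work=$(mktemp -d) || return 1
  mkdir "$work/src" "$work/out"
  echo "hello" > "$work/src/sample.txt"
  tar czf "$work/out/sample.tar.gz" -C "$work/src" sample.txt
  ( cd "$work/out" && "$@" sample.tar.gz ) >/dev/null 2>&1
  if cmp -s "$work/src/sample.txt" "$work/out/sample.txt"; then
    result=0
  else
    result=1
  fi
  rm -rf "$work"
  return "$result"
}

# Replace "tar xzf" with the full path to your saved copy of the script.
if smoke_test tar xzf; then
  echo "tar.gz extraction OK"
else
  echo "tar.gz extraction FAILED"
fi
```

Because the command runs from inside the scratch directory, give the full path to the script rather than a relative one.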

This article is published as part of the IDG Contributor Network.
