Data and Image Compression

Compression is a way of encoding digital data so it takes up less storage space and requires less network bandwidth to be transmitted. There are two basic types of compression: lossy methods, in which some data is lost when the files are decompressed, and lossless methods, in which no data is lost when the files are restored to their original format.

As long as bandwidth is expensive and people are impatient, data compression is here to stay. Simply put, compression methods crunch data - text, graphics, audio or video - into a computer-decipherable shorthand that's 10% to 99% of its original size. The data takes up less storage space and requires less bandwidth to be transmitted over the Internet. In addition, many methods can squeeze multiple files into a single file called an archive.

There are two basic methods for compressing data: lossy and lossless. Lossless techniques compress data without destroying or losing anything during the process. When a compressed document is decompressed, the result is bit-for-bit identical to the original. Lossy techniques let the file be compressed even smaller, but some data is lost forever.
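
The bit-for-bit guarantee of lossless compression can be checked directly. The sketch below uses Python's standard zlib module (which implements DEFLATE, the method commonly used inside ZIP archives) as a stand-in for any lossless compressor; the helper name compressed_fraction and the sample data are made up for illustration. It also shows that how far data shrinks depends on how repetitive it is.

```python
import random
import zlib


def compressed_fraction(data: bytes) -> float:
    """Return the compressed size as a fraction of the original size."""
    packed = zlib.compress(data, 9)
    # Lossless: decompressing must give back exactly the original bytes.
    assert zlib.decompress(packed) == data
    return len(packed) / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable
repetitive = b"the defendant said " * 500
# Random bytes of the same length have almost no redundancy to exploit.
noise = bytes(rng.getrandbits(8) for _ in range(len(repetitive)))

print(f"repetitive text: {compressed_fraction(repetitive):.1%} of original")
print(f"random bytes:    {compressed_fraction(noise):.1%} of original")
```

The repetitive text shrinks to a small fraction of its size, while the random bytes barely shrink at all, and in both cases the restored data matches the input exactly.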

Compression Standards

Joint Photographic Experts Group (JPEG): A still-image, lossy compression method that uses discrete cosine transform equations to compress images at a ratio of up to 20-to-1, without noticeable loss.

Lempel-Ziv-Welch (LZW): This algorithm, used in formats including Graphics Interchange Format (GIF) and Tagged Image File Format (TIFF), builds a lookup table of the data sequences it has already seen and replaces each repeated sequence with a shorter code. The table itself isn't stored in the compressed file; the decoding program rebuilds it from the stream of codes as it reads them.

Moving Picture Experts Group (MPEG): A lossy compression method for video. MPEG-1 is used for CD-ROMs and video CDs. MPEG-2 compresses video for regular and high-definition television.

MPEG Audio Layer 3 (MP3): An audio compression technology, part of the MPEG-1 standard, that uses perceptual audio coding to compress CD-quality audio by a factor of about 12.

Fractal: A lossy compression method for color images that's well suited for natural objects, with compression ratios up to 100-to-1.

PKZip: A popular lossless compression shareware program from PKWare Inc. The program uses an algorithm and a data library to encode or archive multiple data files. PKUnzip decompresses the files to their original states.

Wavelet: This form of lossy compression uses a mathematical function that can compress images to a greater extent than other methods - sometimes to only one-fourth the size of a similar image compressed with JPEG.

Windows Media Technology: This Microsoft Corp. product delivers better sound quality than MP3 for same-size files, as well as near-DVD-quality video.

- Lee Copeland and Russell Kay


"Lossy compression makes a trade-off: You give up accuracy for higher compression," says Steve Hoffenberg, director of product management at Sound Vision Inc., a digital imaging firm in Framingham, Mass. "If you're compressing bank records, you want to be sure they're identical before and after. But with an image or audio or video files, it's generally not crucial to restore every bit of data."

Lossless Compression

Lossless data-compression programs search through documents for redundant or repetitive data and then encode it. For example, a text document may contain 200 empty spaces, 100 instances of the word "in" and 50 instances of the words "the defendant said." The program searches for these repetitive words, phrases and spaces and then replaces each one with an abbreviated bit pattern or numerical symbol, which it stores in a "dictionary."

When the file is decompressed, the bit patterns are decoded and the data is restored. No data is lost or altered.
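
One concrete instance of this dictionary idea is the Lempel-Ziv-Welch (LZW) algorithm. The sketch below is a minimal teaching version, not the code any particular product uses: the function names lzw_compress and lzw_decompress are made up for illustration, and the output is a plain list of integer codes rather than a packed bit stream. Note that the decoder rebuilds the dictionary from the codes themselves, so the dictionary never has to be stored alongside the data.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Encode bytes as a list of integer codes using a growing dictionary."""
    # Start with one entry per possible byte value (codes 0..255).
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            # Keep extending the match while it is already in the dictionary.
            current = candidate
        else:
            # Emit the longest known match and remember the new, longer one.
            codes.append(table[current])
            table[candidate] = next_code
            next_code += 1
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the original bytes, recreating the dictionary on the fly."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # The code refers to the entry the encoder had only just created.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        out.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(out)


text = b"the defendant said " * 50
codes = lzw_compress(text)
assert lzw_decompress(codes) == text  # nothing lost or altered
print(len(text), "bytes ->", len(codes), "codes")
```

The repeated phrase is quickly absorbed into the dictionary, so later copies are represented by a handful of codes instead of 19 separate characters.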

There are many data compression programs, such as StuffIt (for Macintosh computers) by Aladdin Systems Inc. in Watsonville, Calif.; WinZip (for Windows) by WinZip Computing Inc. in Mansfield, Conn.; and PKZip by PKWare Inc. in Brown Deer, Wis. PKZip is the most popular program for DOS and Windows compression.

Jim Peterson, PKZip engineering manager at PKWare, says Internet growth makes data compression important.

"It's important to compress data, because storage space comes at a price and bandwidth comes at a price," says Peterson. Typical data documents such as Microsoft Excel spreadsheets or PowerPoint presentations can be squeezed to half their original size, but documents with a high degree of repetition and numbers can be compressed to 20% of their original size.

Lossy Compression

For graphics, video and audio signals, lossy compression is most commonly used. Audio and video can be compressed to 5% of their original size using lossy compression, and at that level the data loss is seldom detectable to the human eye or ear. When the loss is visible, it typically shows up as softness: A lossy image may be less sharp than the original, making blades of grass appear blurred.
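
The accuracy-for-size trade-off can be illustrated with a toy example. The sketch below is not how JPEG or MP3 work internally; it simply throws away the low-order bits of 16-bit audio-style samples, which is an assumption chosen to keep the example short. The function names quantize and restore and the sine-wave test signal are made up for illustration.

```python
import math


def quantize(samples: list[int], keep_bits: int) -> list[int]:
    """Keep only the top `keep_bits` bits of each signed 16-bit sample."""
    shift = 16 - keep_bits
    return [s >> shift for s in samples]


def restore(levels: list[int], keep_bits: int) -> list[int]:
    """Approximate the original samples from the quantized levels."""
    shift = 16 - keep_bits
    # Place each value in the middle of the range its level stands for.
    half = (1 << shift) >> 1
    return [(q << shift) + half for q in levels]


# One cycle of a sine wave as signed 16-bit samples.
original = [round(30000 * math.sin(2 * math.pi * n / 64)) for n in range(64)]

levels = quantize(original, keep_bits=8)   # half as many bits per sample
restored = restore(levels, keep_bits=8)

max_error = max(abs(a - b) for a, b in zip(original, restored))
print("identical after restore:", restored == original)
print("largest error per sample:", max_error, "out of a range of 65536")
```

Each sample now needs 8 bits instead of 16, but the discarded bits cannot be recovered: the restored signal is close to the original, not identical to it.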

Another factor, says Carl Garland, an analyst at Current Analysis Inc. in Sterling, Va., is that the physical copper wire network can't handle uncompressed audio and video signals well. "The last mile, from the switch location to the customer's house or business office, is handed off to [a] twisted-copper pair, and there are a lot of bandwidth constraints, because data is transmitted at different speeds," says Garland.

For example, for a spreadsheet, it doesn't matter which data packets arrive first, he says. But for a phone call or video stream, rapidly changing transmission speeds would distort the sound quality or image.

Copyright © 2000 IDG Communications, Inc.
