bzip2 - Linux


Overview

bzip2 is a command-line file compression tool that uses the Burrows–Wheeler block-sorting algorithm and Huffman coding. It is particularly effective on text files and other highly redundant data, and it usually achieves better compression ratios than gzip, at the cost of slower compression. For this reason, bzip2 is widely used for archiving, backing up data, and reducing the size of files before transmission.

Syntax

The basic usage of bzip2 is:

bzip2 [options] [files]
  • [files] refers to one or more files that you want to compress. If no file is specified, bzip2 compresses standard input and writes the result to standard output.
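
The no-file case can be sketched as a short round trip. This is a minimal illustration rather than a recommended workflow; the scratch directory and the name greeting.txt.bz2 are made up for the example. Note that bzip2 declines to write compressed data to a terminal, so the output is redirected to a file.

```shell
# Work in a scratch directory so nothing outside it is touched.
workdir=$(mktemp -d)
cd "$workdir"

# With no file argument, bzip2 reads standard input and writes
# the compressed stream to standard output.
printf 'hello from stdin\n' | bzip2 > greeting.txt.bz2

# Round trip: -d decompresses, -c sends the result to standard output.
roundtrip=$(bzip2 -dc greeting.txt.bz2)
echo "$roundtrip"
```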

Decompression

To decompress files, use:

bzip2 -d [options] [files]

or

bunzip2 [options] [files]
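
As a small sketch of the two forms side by side, the following compresses two identical files and restores one with each command. The file names a.txt and b.txt are placeholders, and the example assumes bunzip2 is installed alongside bzip2, which is the usual case.

```shell
# Scratch directory with two identical sample files (hypothetical names).
workdir=$(mktemp -d)
cd "$workdir"
printf 'sample line\n' > a.txt
cp a.txt b.txt

# Compress both; the originals are replaced by a.txt.bz2 and b.txt.bz2.
bzip2 a.txt b.txt

# Restore one file with each form of the decompression command.
bzip2 -d a.txt.bz2
bunzip2 b.txt.bz2

# Both forms should yield the same content.
cmp a.txt b.txt && echo "identical"
```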

Options/Flags

  • -d, --decompress: Decompress the specified file(s).
  • -z, --compress: Force compression regardless of the name the program was invoked under (for example, bunzip2). Compression is the default action for bzip2.
  • -k, --keep: Keep (do not delete) the input files during compression or decompression.
  • -f, --force: Overwrite existing output files.
  • -t, --test: Check the integrity of the compressed file.
  • -v, --verbose: Verbose mode; show the compression ratio for each file processed.
  • -q, --quiet: Suppress non-essential warning messages.
  • -c, --stdout: Write output to standard output; the input files are left unchanged.
  • -s, --small: Reduce memory usage. Useful on systems with limited memory, but compression and decompression take longer.
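
Short options can be combined in a single argument. The sketch below exercises -k, -v, -c, -d and -s together; numbers.txt and copy.bz2 are hypothetical names, and seq is assumed to be available (it is part of GNU coreutils on most Linux systems).

```shell
# Scratch directory with a sample input file (hypothetical name).
workdir=$(mktemp -d)
cd "$workdir"
seq 1 20000 > numbers.txt

# -k keeps numbers.txt; -v reports the compression ratio on stderr.
bzip2 -kv numbers.txt

# -c writes to standard output, so the redirect chooses the output name
# and the input file is not removed.
bzip2 -c numbers.txt > copy.bz2

# -d decompresses, -c streams the result, -s lowers memory use.
line_count=$(bzip2 -dcs numbers.txt.bz2 | wc -l)
echo "lines recovered: $line_count"
```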

Examples

  1. Simple Compression:

    bzip2 filename.txt
    

    This command compresses filename.txt to filename.txt.bz2 and deletes the original file.

  2. Decompressing a File:

    bzip2 -d filename.txt.bz2
    

    This command decompresses the file to filename.txt and deletes filename.txt.bz2.

  3. Compress Multiple Files:

    bzip2 *.txt
    

    Compresses every .txt file in the current directory into its own .bz2 file.

  4. Compress and Keep Original Files:

    bzip2 -k photos.tar
    

    Compresses photos.tar and retains the original.

  5. Test Integrity of a Compressed File:

    bzip2 -t archive.bz2
    

    Checks if the archive is intact and can be successfully decompressed.
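
In scripts, the integrity test is most useful through its exit status, which is zero only when the file passes. The following sketch checks one intact file and one deliberately truncated copy; the file names are invented for the example, and the truncation with head -c is just one easy way to simulate damage.

```shell
# Scratch directory with one good file and one damaged copy.
workdir=$(mktemp -d)
cd "$workdir"
seq 1 1000 > data.txt
bzip2 -k data.txt
head -c 20 data.txt.bz2 > truncated.bz2   # keep only the first 20 bytes

# bzip2 -t exits non-zero when a file fails the test, so it can
# drive an if statement directly. Error text is discarded here.
report=$(
    for f in data.txt.bz2 truncated.bz2; do
        if bzip2 -t "$f" 2>/dev/null; then
            echo "$f: OK"
        else
            echo "$f: FAILED"
        fi
    done
)
echo "$report"
```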

Common Issues

  • Memory Usage: On systems with limited memory, use the -s option when compressing or decompressing very large files, and expect the operation to take longer.
  • File Overwrite: By default, bzip2 refuses to overwrite an existing output file, so a stale .bz2 file from an earlier run can be left in place without being updated. Use -f to force overwriting.
  • Slow Compression: bzip2 is generally slower than tools like gzip, especially on larger files; plan operations accordingly.
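
The overwrite behaviour is easy to reproduce. In this sketch (notes.txt is a placeholder name), the second compression attempt is refused because notes.txt.bz2 already exists, and the stale output is replaced only once -f is added.

```shell
# Scratch directory with a file that is compressed, then edited.
workdir=$(mktemp -d)
cd "$workdir"
printf 'version 1\n' > notes.txt
bzip2 -k notes.txt                 # creates notes.txt.bz2, keeps notes.txt
printf 'version 2\n' > notes.txt   # the source changes afterwards

# Without -f, bzip2 refuses because notes.txt.bz2 already exists.
if bzip2 -k notes.txt 2>/dev/null; then
    echo "output was replaced without -f (not expected)"
else
    echo "refused: existing notes.txt.bz2 left untouched"
fi
before=$(bzip2 -dc notes.txt.bz2)
echo "before -f: $before"

# With -f, the existing output file is overwritten.
bzip2 -kf notes.txt
after=$(bzip2 -dc notes.txt.bz2)
echo "after -f: $after"
```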

Integration

bzip2 becomes extremely powerful when combined with other UNIX tools. For instance, to back up and compress a directory:

tar -cvf - /path/to/directory | bzip2 > backup.tar.bz2

This uses tar to bundle the directory and pipes it directly through bzip2 for compression.
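
The reverse direction follows the same pattern: decompress to standard output and let tar read the stream. The sketch below builds a tiny directory, archives it through a tar-to-bzip2 pipeline, then lists and restores it. The directory names project and restore are hypothetical, and -C is assumed to be supported by the installed tar (it is in GNU tar and bsdtar).

```shell
# Scratch directory with a small tree to archive (hypothetical names).
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p project/docs
printf 'readme\n' > project/docs/README

# Bundle with tar, compress the stream with bzip2.
tar -cf - project | bzip2 > backup.tar.bz2

# List the archive contents without extracting anything.
listing=$(bzip2 -dc backup.tar.bz2 | tar -tf -)
echo "$listing"

# Restore into a separate directory.
mkdir restore
bzip2 -dc backup.tar.bz2 | tar -xf - -C restore
```

GNU tar can also invoke bzip2 itself through its -j option (for example, tar -cjf), which avoids the explicit pipe; the pipeline form shown here makes the role of each tool visible.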

Related Commands

  • gzip: Another popular compression tool focused on speed.
  • xz: Compression tool using the LZMA/LZMA2 algorithms; it typically achieves higher compression ratios than bzip2.
  • tar: Often used with bzip2 for creating compressed archives of directories.

Explore more in the official bzip2 documentation.