pv - Linux


Overview

pv (Pipe Viewer) is a terminal-based tool for Unix and Unix-like operating systems that monitors the progress of data through a pipeline. It displays how much data has passed through, the current throughput, the elapsed time, and an estimated time to completion (ETA). pv is particularly useful for tracking large files or streams as they move from a source to a destination, for example in a shell script.

Syntax

The basic syntax for the pv command is:

pv [OPTIONS] [FILE]...
  • [OPTIONS]: A set of options to modify behavior of pv.
  • [FILE]...: One or more files to be processed, copied in turn to standard output. If no file is specified, pv reads from standard input.
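
As a rough illustration of the two input modes, the sketch below first passes files as arguments and then feeds the same data through standard input. The file names are placeholders, and the fallback to cat when pv is not installed is an assumption added so the snippet runs anywhere.

```shell
#!/bin/sh
# Use pv when it is installed; otherwise fall back to cat, which moves the
# same bytes without any progress display (assumption for portability).
if command -v pv >/dev/null 2>&1; then PV=pv; else PV=cat; fi

workdir=$(mktemp -d)
printf 'first\n'  > "$workdir/part1.txt"   # placeholder input files
printf 'second\n' > "$workdir/part2.txt"

# File arguments: each file is copied in turn to standard output.
"$PV" "$workdir/part1.txt" "$workdir/part2.txt" > "$workdir/from_args.txt"

# No file arguments: data is read from standard input instead.
cat "$workdir/part1.txt" "$workdir/part2.txt" | "$PV" > "$workdir/from_stdin.txt"
```

Both invocations should produce identical output files; only the way pv receives its input differs.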

Options/Flags

  • -p, --progress: Show progress bar.
  • -t, --timer: Show time elapsed.
  • -e, --eta: Show estimated time of arrival (completion).
  • -r, --rate: Show data transfer rate in bytes per second.
  • -a, --average-rate: Show average rate.
  • -b, --bytes: Show number of bytes transferred.
  • -s <size>, --size <size>: Set the total size of the data stream so that progress and ETA can be calculated. The size is given in bytes and may carry a suffix such as K, M, G, or T (for example, 500G).
  • -n, --numeric: Output percentages, not a visual representation.
  • -q, --quiet: Produce no output at all. Mainly useful together with -L (--rate-limit), when pv is used only to limit the transfer rate.
  • -c, --cursor: Use cursor positioning escape sequences instead of just using carriage returns.
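
The display flags can be combined into a single argument. The following is a minimal sketch, assuming a hypothetical helper named copy_with_progress and a fallback to cat when pv is missing; neither is part of pv itself.

```shell
#!/bin/sh
# Hypothetical helper: copy SRC to DST, showing progress when pv is available.
copy_with_progress() {
    src=$1
    dst=$2
    if command -v pv >/dev/null 2>&1; then
        # -p bar, -t elapsed time, -e ETA, -r current rate, -b byte count.
        # pv can read the size of a regular file itself, so -s is not needed.
        pv -pterb "$src" > "$dst"
    else
        # Fallback so the script still works without pv installed.
        cat "$src" > "$dst"
    fi
}
```

A call such as copy_with_progress largefile.iso copy_of_largefile.iso would then show all five indicators on one status line.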

Examples

  1. Simple Progress Monitoring: Display the progress of copying a large file:
    pv largefile.iso > copy_of_largefile.iso
    
  2. Monitoring decompression of a gzip-compressed file (unzip cannot read an archive from standard input, so gunzip is used here):
    pv file.gz | gunzip > file
    
  3. Using pv in combination with dd to show progress while cloning a disk (500G stands for the size of the source disk):
    dd if=/dev/sda | pv -s 500G | dd of=/dev/sdb
    

Common Issues

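  • Misestimation of progress: When pv reads from a pipe it cannot determine the total size itself. Without -s it shows no percentage or ETA, and with an inaccurate -s it misreports both. Specify the size as accurately as possible.
  • Output clutter: pv writes its display to standard error, which can clutter logs or interleave with other output in scripts. Use -q to suppress the display, -n for plain numeric percentages, or -c (typically together with -N to name each instance) when several pv instances run in the same pipeline.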
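
One way to handle both issues in a script is sketched below: the size is measured up front and passed with -s, and -n turns the display into plain percentages that are captured in a log file. The function name, the log file, and the cat fallback are illustrative assumptions.

```shell
#!/bin/sh
# Hypothetical helper: gzip INPUT to OUTPUT, logging percentages to LOG.
compress_with_log() {
    input=$1
    output=$2
    log=$3
    # Byte count of the input; tr strips padding that some wc versions add.
    size=$(wc -c < "$input" | tr -d ' ')
    if command -v pv >/dev/null 2>&1; then
        # cat stands in for any producer whose size pv cannot detect itself.
        # -n prints one integer percentage per line on standard error.
        cat "$input" | pv -n -s "$size" 2> "$log" | gzip > "$output"
    else
        # Fallback without progress reporting (assumption for portability).
        : > "$log"
        gzip -c < "$input" > "$output"
    fi
}
```

Because the percentages go to the log file rather than the terminal, the data stream and any other script output stay clean.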

Integration

pv is often used in backup scripts and large data migrations to provide visibility into operations that are otherwise opaque, such as those run with dd, tar, or custom ssh commands. Here’s an example of using pv within a backup script:

#!/bin/bash
# du -sb (GNU du) prints the directory's total size in bytes; awk keeps the number.
tar cf - /path/to/directory | pv -s "$(du -sb /path/to/directory | awk '{print $1}')" | gzip > backup.tar.gz

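In this script, pv shows a progress bar for the tar stream, using the size reported by du. Because tar adds headers and padding to the data, that size is an estimate, and the bar may not finish at exactly 100%.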
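
Restoring such an archive can be monitored in a similar way. In the sketch below, pv reads the archive file directly, so it can determine the total size itself and no -s option is needed. The function name restore_backup and its arguments are hypothetical, and the fallback without pv is an assumption for systems where it is not installed.

```shell
#!/bin/sh
# Hypothetical helper: extract a gzip-compressed tar ARCHIVE into DEST.
restore_backup() {
    archive=$1
    dest=$2
    mkdir -p "$dest" || return 1
    if command -v pv >/dev/null 2>&1; then
        # Progress is measured against the size of the compressed archive.
        pv "$archive" | gzip -dc | tar xf - -C "$dest"
    else
        # Fallback without a progress display.
        gzip -dc < "$archive" | tar xf - -C "$dest"
    fi
}
```

For example, restore_backup backup.tar.gz /path/to/restore would unpack the archive while showing how much of the compressed file has been read.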

Related commands:

  • dd: Used for converting and copying files.
  • tar: Used for archiving multiple files into one.
  • zip/unzip: Used for compression and decompression.

For further information, see the pv man page (man pv) or the official documentation.