comm - macOS


Overview

The comm command on macOS compares two sorted files line by line. It prints three tab-separated columns: lines found only in the first file, lines found only in the second file, and lines common to both files. It is useful in data analysis and system administration for quickly finding what two lists share and where they differ.

Syntax

The basic syntax of the comm command is as follows:

comm [options] file1 file2

Both file1 and file2 are required arguments, and both must be sorted in the same order. If either file is given as "-", comm reads that input from standard input.
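A minimal sketch of the standard-input form described above. The file name b.txt and the sample data are made up for illustration; the piped input is sorted first because comm expects sorted input.

```shell
# Hypothetical sample data in a temporary directory (already sorted).
tmpdir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$tmpdir/b.txt"

# "-" in place of file1 makes comm read that input from standard input.
# -12 suppresses both "unique" columns, leaving only the shared lines.
stdin_result=$(printf 'cherry\napple\ndate\n' | sort | comm -12 - "$tmpdir/b.txt")
printf '%s\n' "$stdin_result"
```

With this sample data the shared lines are apple and cherry.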

Options/Flags

  • -1: Suppress the output of the first column (lines unique to file1).
  • -2: Suppress the output of the second column (lines unique to file2).
  • -3: Suppress the output of the third column (lines common to both files).
  • -i: Compare lines case-insensitively.

Combining -1, -2, and -3 lets you isolate the specific differences or similarities you are interested in.

Examples

  1. Basic Difference Comparison:

    comm file1.txt file2.txt
    

    This command will display three columns: unique to file1.txt, unique to file2.txt, and shared lines.

  2. Find Unique Lines in First File:

    comm -23 file1.txt file2.txt
    

    This shows lines that are unique to file1.txt.

  3. Compare Files Ignoring Common Lines:

    comm -3 file1.txt file2.txt


    Outputs only the lines that are unique to each file, with no shared lines. To print only the lines common to both files instead, use comm -12 file1.txt file2.txt.
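To make the column layout concrete, here is a small worked sketch. The two sample files and their contents are invented for illustration and are created in a temporary directory.

```shell
# Hypothetical sample files; both are already sorted.
tmpdir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$tmpdir/file1.txt"
printf 'banana\ncherry\ndate\n'  > "$tmpdir/file2.txt"

# All three columns: column 2 is indented by one tab, column 3 by two tabs.
all_columns=$(comm "$tmpdir/file1.txt" "$tmpdir/file2.txt")
printf '%s\n' "$all_columns"

# Only the lines unique to file1.txt (columns 2 and 3 suppressed).
only_first=$(comm -23 "$tmpdir/file1.txt" "$tmpdir/file2.txt")
printf '%s\n' "$only_first"

# Lines unique to either file; column 2 keeps its single leading tab.
either_side=$(comm -3 "$tmpdir/file1.txt" "$tmpdir/file2.txt")
printf '%s\n' "$either_side"
```

Here apple appears only in file1.txt and date only in file2.txt, so the second command prints apple alone, and the third prints apple followed by a tab-indented date.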

Common Issues

  • Unsorted Files: comm requires both input files to be sorted in the same order. If they are not, lines present in both files can be misreported as unique. Sort each file before passing it to comm.

    sort unsorted1.txt > sorted1.txt
    sort unsorted2.txt > sorted2.txt
    comm sorted1.txt sorted2.txt
    
  • Locale Settings Affect Sorting: Collation order varies by locale, so files sorted under one locale may not look sorted to comm under another. Use the same locale for both steps, for example by running export LC_ALL=C before sorting and comparing.
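As a sketch of the locale advice above, the locale can also be set per command instead of exported for the whole session. The file names and mixed-case sample data are made up for illustration.

```shell
# Hypothetical unsorted sample files with mixed-case entries.
tmpdir=$(mktemp -d)
printf 'banana\nApple\ncherry\n' > "$tmpdir/raw1.txt"
printf 'cherry\napple\nBanana\n' > "$tmpdir/raw2.txt"

# Use the same locale for sorting and for comparing.
# In the C locale, lines are ordered by byte value, so uppercase sorts first.
LC_ALL=C sort "$tmpdir/raw1.txt" > "$tmpdir/sorted1.txt"
LC_ALL=C sort "$tmpdir/raw2.txt" > "$tmpdir/sorted2.txt"

# Only exact matches count as common: "Apple" and "apple" are different lines.
locale_result=$(LC_ALL=C comm -12 "$tmpdir/sorted1.txt" "$tmpdir/sorted2.txt")
printf '%s\n' "$locale_result"
```

With this data the only line common to both files is cherry, because the comparison is case-sensitive by default.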

Integration

comm combines well with other commands in scripts. For example, to compare the output of two commands, sort each output into a file and then compare the files:

command1 | sort > output1.txt
command2 | sort > output2.txt
comm output1.txt output2.txt

This technique can be used to monitor changes in system configuration or data states over time.
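The monitoring idea can be sketched as a before/after snapshot comparison. The directory name watched and the .conf file names are hypothetical, and the change between snapshots is simulated here.

```shell
# Hypothetical directory standing in for a monitored configuration folder.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/watched"
touch "$tmpdir/watched/a.conf" "$tmpdir/watched/b.conf"

# First snapshot of the directory listing, sorted for comm.
ls "$tmpdir/watched" | sort > "$tmpdir/before.txt"

# Simulate a change between snapshots.
rm "$tmpdir/watched/a.conf"
touch "$tmpdir/watched/c.conf"

# Second snapshot.
ls "$tmpdir/watched" | sort > "$tmpdir/after.txt"

# -23 keeps entries only in the first snapshot (removed since then);
# -13 keeps entries only in the second snapshot (added since then).
removed=$(comm -23 "$tmpdir/before.txt" "$tmpdir/after.txt")
added=$(comm -13 "$tmpdir/before.txt" "$tmpdir/after.txt")
echo "removed: $removed"
echo "added: $added"
```

In bash or zsh, process substitution can avoid the intermediate files, as in comm <(command1 | sort) <(command2 | sort); that form is not available in plain POSIX sh.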

Related Commands

  • diff: Shows differences between files line by line.
  • sort: Sorts lines of text files.
  • uniq: Reports or omits repeated lines in a file.

For more details, run man comm to read the manual page that ships with macOS.