uniq - macOS

Overview

uniq filters adjacent duplicate lines from its input: it can collapse them, print only the repeated or only the non-repeated lines, or count occurrences. Because each line is compared only with the one before it, input is usually sorted first. It's commonly used in data analysis, log processing, and system administration.
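Because uniq compares neighbouring lines only, duplicates separated by other lines pass through untouched. The sketch below uses a made-up three-line sample (not from any real file) to show the difference that sorting makes:

```shell
# Hypothetical sample where the duplicate "apple" lines are not adjacent.
sample=$(printf 'apple\nbanana\napple\n')

# uniq alone compares neighbouring lines only, so both "apple" lines survive.
unsorted_result=$(printf '%s\n' "$sample" | uniq)

# Sorting first makes the duplicates adjacent, so uniq can collapse them.
sorted_result=$(printf '%s\n' "$sample" | sort | uniq)

printf 'uniq alone:\n%s\n\n' "$unsorted_result"
printf 'sort | uniq:\n%s\n' "$sorted_result"
```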

Syntax

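uniq [options] [input_file [output_file]]

With no input_file, uniq reads standard input; with no output_file, it writes to standard output.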

Options/Flags

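-c, --count: Prefix each output line with the number of times it occurred consecutively.
-d, --repeated: Print only one copy of each line that is repeated.
-i, --ignore-case: Ignore case when comparing lines.
-u, --unique: Print only lines that are not repeated.
-z, --zero-terminated: Treat input lines as zero-terminated instead of newline-terminated. This is a GNU uniq option; the BSD uniq that ships with macOS does not provide it.

The short forms are the portable choice; the long forms may not be available on older macOS releases.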

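Examples

The examples below assume file.txt is already sorted, so duplicate lines are adjacent.

Count how many times each line occurs: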

uniq -c file.txt
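To rank lines by frequency, a common pattern sorts the input, counts with uniq -c, and then sorts the counts in descending order. A minimal sketch, using made-up sample data in place of file.txt:

```shell
# Hypothetical sample input standing in for file.txt.
words=$(printf 'pear\napple\npear\nfig\npear\napple\n')

# sort groups identical lines, uniq -c prefixes each with its count,
# and sort -rn puts the most frequent lines first.
frequency=$(printf '%s\n' "$words" | sort | uniq -c | sort -rn)

# Pull out the top entry; awk splits on whitespace, so the padding that
# uniq -c puts in front of the count does not matter.
top_line=$(printf '%s\n' "$frequency" | awk 'NR == 1 { print $2 }')
top_count=$(printf '%s\n' "$frequency" | awk 'NR == 1 { print $1 }')

printf '%s\n' "$frequency"
```

The width of the count column differs between uniq implementations, which is why the sketch reads fields with awk rather than relying on fixed character positions.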

Print only duplicate lines:

uniq -d file.txt

Print lines that occur only once (with counts):

uniq -c file.txt | awk '$1 == 1'

Use uniq -u file.txt to print the same lines without the count column.
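Filtering uniq -c output with a plain text match such as grep 1 also catches counts like 10 or 21 and any line whose text contains a 1, so comparing the count field numerically is safer. The sketch below contrasts -d, -u, and a numeric count filter on a made-up sorted sample:

```shell
# Hypothetical sorted sample: "a" twice, "b" once, "c" three times.
sorted_input=$(printf 'a\na\nb\nc\nc\nc\n')

# -d prints one copy of each line that has adjacent repeats.
repeated=$(printf '%s\n' "$sorted_input" | uniq -d)

# -u prints only the lines that have no adjacent repeats.
singles=$(printf '%s\n' "$sorted_input" | uniq -u)

# Compare the count column numerically and re-print it without padding.
singles_with_count=$(printf '%s\n' "$sorted_input" | uniq -c | awk '$1 == 1 { print $1, $2 }')

printf 'repeated:\n%s\n' "$repeated"
printf 'singles:\n%s\n' "$singles"
printf 'singles with count:\n%s\n' "$singles_with_count"
```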

Ignore case when comparing lines:

uniq -i file.txt
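With -i, lines that differ only in letter case count as duplicates, and the first line of each run is the one that gets printed. A small sketch with made-up input:

```shell
# Hypothetical sample with the same word in three different cases.
mixed_case=$(printf 'Apple\napple\nAPPLE\nbanana\n')

# Without -i the three spellings are distinct lines, so all are kept.
case_sensitive=$(printf '%s\n' "$mixed_case" | uniq)

# With -i they compare equal; only the first line of the run is printed.
case_insensitive=$(printf '%s\n' "$mixed_case" | uniq -i)

printf 'default:\n%s\n\n' "$case_sensitive"
printf 'with -i:\n%s\n' "$case_insensitive"
```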

Common Issues

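Unsorted input: uniq compares adjacent lines only, so duplicates separated by other lines are not detected. Sort the input first.
Missing or unreadable file: Ensure the specified file exists and is readable.
Unsupported options: Option sets differ between BSD uniq (macOS) and GNU uniq. Check man uniq for the options available on your system.
Large files: uniq reads its input as a stream and keeps only the lines it is currently comparing in memory, so large files are rarely a problem for uniq itself. The preceding sort is usually the slow, resource-heavy step.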
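When the original line order must be preserved, sorting first is not an option. One common alternative, which is not part of uniq itself, is an awk filter that remembers the lines it has already printed. The sketch below uses made-up input and assumes the set of distinct lines fits in memory:

```shell
# Hypothetical unsorted sample with non-adjacent duplicates.
events=$(printf 'b\na\nb\nc\na\n')

# seen[$0]++ evaluates to 0 the first time a line appears, so !seen[$0]++
# is true only for first occurrences; awk's default action prints them.
first_seen=$(printf '%s\n' "$events" | awk '!seen[$0]++')

# sort -u yields the same set of lines, but in sorted order.
sorted_unique=$(printf '%s\n' "$events" | sort -u)

printf 'first occurrences, original order:\n%s\n\n' "$first_seen"
printf 'sort -u:\n%s\n' "$sorted_unique"
```

The awk approach stores every distinct line, so for inputs with a very large number of distinct lines the sort-based pipeline is likely the safer choice.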

Integration

Combine with grep: Filter input before removing duplicates (sort in between if matching lines may not be adjacent):

grep pattern file.txt | sort | uniq

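Pipe to xargs: Pass the deduplicated lines to another command as arguments (xargs splits on whitespace, so this simple form assumes lines contain no spaces):

uniq file.txt | xargs command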
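By default xargs splits its input on any whitespace and may pass many arguments to a single invocation. To run a command once per line, including lines that contain spaces, one option is to convert newlines to NUL bytes and use xargs -0 -n 1. In this sketch the sample names are made up and printf stands in for the real command:

```shell
# Hypothetical input: lines containing spaces, with an adjacent duplicate.
names=$(printf 'my report.txt\nmy report.txt\nnotes.txt\n')

# tr turns each newline into a NUL byte so xargs -0 treats every whole line
# as a single argument; -n 1 runs the command once per argument.
handled=$(printf '%s\n' "$names" | uniq | tr '\n' '\0' | xargs -0 -n 1 printf '[%s]\n')

printf '%s\n' "$handled"
```

The brackets in the output make it easy to confirm that each line arrived as one argument rather than being split at the space.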

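Related Commands

sort: Sort text input so duplicate lines become adjacent; sort -u sorts and removes duplicates in one step.
sed: Perform more complex text manipulations.
comm: Compare two sorted files and show the lines unique to each or common to both.
See the GNU uniq manual for additional documentation; on macOS, man uniq documents the BSD version installed with the system.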