uniq - macOS
Overview
uniq filters adjacent duplicate lines from its input: it can collapse them, count them, or print only the repeated or only the non-repeated lines. Because it compares each line only with the line before it, input is usually sorted first. It's commonly used in data analysis, text editing, and system administration.
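Since uniq compares each line only with its predecessor, duplicates that are not next to each other are left in place. A minimal sketch of the difference, using made-up sample data (the variable names are only for illustration):

```shell
# Sample input (made up): "apple" appears twice, but not on adjacent lines.
input=$(printf '%s\n' apple banana apple)

# Without sorting, uniq finds no adjacent duplicates and keeps all three lines.
unsorted=$(printf '%s\n' "$input" | uniq)

# Sorting first places the duplicates next to each other, so uniq collapses them.
sorted=$(printf '%s\n' "$input" | sort | uniq)

printf 'uniq alone:\n%s\n\nsort | uniq:\n%s\n' "$unsorted" "$sorted"
```

When the original line order does not matter, sort -u gives the same result as sort | uniq in a single step.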
Syntax
uniq [options] [input_file [output_file]]
Options/Flags
- -c, --count: Prefix each output line with the number of times it occurred consecutively.
- -d, --repeated: Print one copy of each line that is repeated.
- -i, --ignore-case: Ignore case when comparing lines.
- -u, --unique: Print only lines that are not repeated.
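- -z, --zero-terminated: Treat input lines as zero-terminated instead of newline-terminated. This is a GNU coreutils option; the uniq manual page on macOS does not list it, so check man uniq before relying on it.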
Examples
Count how many times each line occurs (adjacent occurrences only):
uniq -c file.txt
Print one copy of each repeated line:
uniq -d file.txt
Print lines that occur only once (with counts):
uniq -c file.txt | awk '$1 == 1'
Ignore case when comparing lines:
uniq -i file.txt
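To make the effect of each example concrete, here is a small sketch run against a throwaway file. The file contents and variable names are made up for illustration, and the input is already grouped so that identical lines are adjacent.

```shell
# Hypothetical sample file; its name and contents are made up for illustration.
sample=$(mktemp "${TMPDIR:-/tmp}/uniq-demo.XXXXXX")
printf '%s\n' error error warning info info info > "$sample"

# -c prefixes each run of identical adjacent lines with its length.
# The width of the count column differs between implementations,
# so awk normalizes the output to "count text".
counts=$(uniq -c "$sample" | awk '{print $1, $2}')

# -d prints one copy of each line that is repeated.
repeated=$(uniq -d "$sample")

# -u prints only the lines that are not repeated.
singles=$(uniq -u "$sample")

# -i compares case-insensitively; the first line of each run is the one kept.
folded=$(printf '%s\n' Apple apple APPLE | uniq -i)

printf '%s\n---\n%s\n---\n%s\n---\n%s\n' "$counts" "$repeated" "$singles" "$folded"
rm -f "$sample"
```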
Common Issues
- Improper file handling: Ensure the specified file exists and has correct permissions.
- Incorrect options: Refer to the available options above and use them appropriately.
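- Unsorted input: uniq only detects duplicates on adjacent lines. If identical lines are scattered through the file, sort the input first.
- Large files: uniq itself streams its input and only needs the current and previous line, so memory is rarely a concern. The expensive step on very large files is usually the sort that precedes it.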
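The points above can be combined into a small wrapper. This is only a sketch: count_lines is a hypothetical helper name, not part of uniq, and it assumes a single readable file argument.

```shell
# count_lines is a hypothetical helper, not a standard command.
count_lines() {
    # Fail early with a clear message if the file is missing or unreadable.
    if [ ! -r "$1" ]; then
        printf 'count_lines: cannot read %s\n' "$1" >&2
        return 1
    fi
    # sort groups identical lines, uniq -c streams over them and counts each
    # group, and the final sort -rn lists the most frequent lines first.
    sort "$1" | uniq -c | sort -rn
}
```

For example, count_lines file.txt | head -n 5 would show the five most frequent lines of file.txt.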
Integration
Combine with grep: Filter input before collapsing adjacent duplicates (add sort in between if the matching lines are not already grouped):
grep pattern file.txt | uniq
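Pipe to xargs: Run a command with the deduplicated lines as its arguments (xargs splits its input on whitespace by default and may pass several lines to one invocation):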
uniq file.txt | xargs command
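By default xargs splits its input on any whitespace, so a line that contains a space arrives as two arguments. One possible workaround, sketched here with made-up sample lines, is to convert newlines to NUL bytes and use xargs -0. This assumes no line contains a NUL byte; printf stands in for the real command.

```shell
# Made-up sample lines, one of which contains a space.
lines=$(printf '%s\n' 'my file.txt' 'my file.txt' 'notes.txt')

# Default xargs splits "my file.txt" into two arguments, so printf receives three.
split=$(printf '%s\n' "$lines" | uniq | xargs printf '[%s]\n')

# Converting newlines to NUL bytes and passing -0 keeps each line as one argument.
whole=$(printf '%s\n' "$lines" | uniq | tr '\n' '\0' | xargs -0 printf '[%s]\n')

printf 'default:\n%s\n\nwith -0:\n%s\n' "$split" "$whole"
```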
Related Commands
- sort: Sort text input before removing duplicates.
- sed: Perform more complex text manipulations.
- comm: Compare two sorted files and show the lines unique to each file or common to both.
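- man uniq: The manual page for the version installed on macOS. The GNU uniq manual documents the GNU coreutils version, whose options differ in places.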