uniq - Linux
Overview
The uniq command in Linux filters adjacent matching lines from its input, so it is typically used on sorted data. It can remove or report duplicates, count occurrences, and perform similar tasks. It is particularly useful for processing log files, cleaning up data, and analysis work where unique entries need to be identified.
Syntax
The basic syntax of the uniq command is:
uniq [OPTIONS] [INPUT [OUTPUT]]
- INPUT is the name of the input file. If no input file is specified, uniq reads from the standard input.
- OUTPUT is the name of the output file. If no output file is specified, uniq writes to the standard output.
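The three ways of supplying input and output described above can be sketched as follows. This is a minimal illustration, not part of the original text: the temporary directory and the file names in.txt and out.txt are assumptions made up for the example.

```shell
# Work in a throwaway directory so no real files are touched
# (in.txt and out.txt are hypothetical names).
tmpdir=$(mktemp -d)
printf 'a\na\nb\n' > "$tmpdir/in.txt"

# 1. INPUT given, result goes to standard output.
from_file=$(uniq "$tmpdir/in.txt")

# 2. No INPUT: uniq reads from standard input.
from_stdin=$(printf 'a\na\nb\n' | uniq)

# 3. INPUT and OUTPUT given: the result is written to the second file.
uniq "$tmpdir/in.txt" "$tmpdir/out.txt"
from_output=$(cat "$tmpdir/out.txt")

printf '%s\n' "$from_file"
# a
# b

rm -rf "$tmpdir"
```

Because the second positional argument is treated as OUTPUT, running uniq with two file names overwrites the second file rather than reading both.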
Options/Flags
- -c, --count: Prefix each output line with the number of occurrences.
- -d, --repeated: Only print duplicate lines, one for each group of identical lines.
- -i, --ignore-case: Ignore differences in case when comparing lines.
- -u, --unique: Only print lines that are not repeated.
- -z, --zero-terminated: Use a zero byte (ASCII NUL) as the line delimiter instead of the usual newline.
- -f N, --skip-fields=N: Skip the first N fields in each line before checking for uniqueness (a field is a string of non-blank characters separated by blanks).
- -s N, --skip-chars=N: Skip the first N characters in each line before checking for uniqueness.
- -w N, --check-chars=N: Compare no more than the first N characters of each line.
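The comparison options are easier to see on small inputs. The sketch below assumes GNU coreutils uniq (the -w option in particular is not available in every implementation), and the sample lines are invented for illustration.

```shell
# -f 1: ignore the first field (here a timestamp) when comparing.
skip_field=$(printf '10:01 disk full\n10:02 disk full\n10:03 login ok\n' | uniq -f 1)
printf '%s\n' "$skip_field"
# 10:01 disk full
# 10:03 login ok

# -s 1: ignore the first character of each line when comparing.
skip_char=$(printf 'Xapple\nYapple\nZpear\n' | uniq -s 1)
printf '%s\n' "$skip_char"
# Xapple
# Zpear

# -w 3: compare at most the first three characters (GNU uniq assumed).
check_chars=$(printf 'abc1\nabc2\nabd3\n' | uniq -w 3)
printf '%s\n' "$check_chars"
# abc1
# abd3

# -i: treat lines that differ only in case as equal.
ignore_case=$(printf 'Apple\napple\nAPPLE\nbanana\n' | uniq -i)
printf '%s\n' "$ignore_case"
# Apple
# banana
```

In each case uniq keeps the first line of a group, so the skipped or ignored part of the output comes from that first line.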
Examples
- Basic Usage: Collapse adjacent duplicate lines in a file.
  uniq myfile.txt
- Count Occurrences: Count how many times each line appears in a sorted file.
  uniq -c sorted_file.txt
- Find Duplicates: Print only the lines that repeat in a file.
  uniq -d sorted_file.txt
- Case Insensitive Comparison: Treat adjacent lines that differ only in case as duplicates.
  uniq -i unsorted_file.txt
- Print Only Unique Lines: Print only the lines that are not repeated.
  uniq -u sorted_file.txt
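To make the difference between these invocations concrete, here is a small sketch that runs them on the same already-sorted input. The fruit names are sample data invented for this example, and the input is piped in rather than read from sorted_file.txt.

```shell
# Already-sorted sample input (hypothetical data).
sample=$(printf 'apple\napple\nbanana\ncherry\ncherry\ncherry\n')

# Default: one line per group of adjacent duplicates.
collapsed=$(printf '%s\n' "$sample" | uniq)
# apple
# banana
# cherry

# -c: the same lines, each prefixed with its count.
# The counts are 2 apple, 1 banana, 3 cherry; the amount of
# leading padding before each count depends on the implementation.
counted=$(printf '%s\n' "$sample" | uniq -c)

# -d: only lines that occur more than once, printed once each.
repeated=$(printf '%s\n' "$sample" | uniq -d)
# apple
# cherry

# -u: only lines that occur exactly once.
singles=$(printf '%s\n' "$sample" | uniq -u)
# banana

printf '%s\n' "$counted"
```

Note that -d and -u together partition the output of plain uniq: every collapsed line appears in exactly one of the two.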
Common Issues
- Non-adjacent Duplicates: uniq only removes duplicates that are adjacent. To handle non-adjacent duplicates, sort the input with sort before using uniq.
  Example: sort myfile.txt | uniq
- Case Sensitivity: By default, uniq is case-sensitive. Use the -i option to ignore case.
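Both issues can be reproduced on a few lines of input. The following sketch uses invented sample data; sort -u and sort -f are standard sort options brought in for comparison and are not mentioned in the text above.

```shell
# The two "b" lines are not adjacent, so uniq alone leaves them in place.
adjacent_only=$(printf 'b\na\nb\n' | uniq)
printf '%s\n' "$adjacent_only"
# b
# a
# b

# Sorting first makes the duplicates adjacent.
sorted_first=$(printf 'b\na\nb\n' | sort | uniq)
printf '%s\n' "$sorted_first"
# a
# b

# sort -u gives the same result in one step when no uniq options are needed.
sort_unique=$(printf 'b\na\nb\n' | sort -u)

# For case-insensitive de-duplication, sort with -f so that lines differing
# only in case end up next to each other, then compare with uniq -i.
case_groups=$(printf 'Apple\nbanana\napple\n' | sort -f | uniq -i | wc -l | tr -d ' ')
printf '%s\n' "$case_groups"
# 2
```

Which spelling survives in the case-insensitive pipeline (Apple or apple) depends on how sort orders the tied lines, so the sketch only counts the resulting groups.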
Integration
- Sort, Unique, and Count: Chain sort and uniq to count how many times each distinct line occurs in a file:
  sort file.txt | uniq -c
- Piping with grep: Combine uniq with grep to find unique error logs:
  grep "Error" log.txt | sort | uniq
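The two pipelines above combine naturally into a frequency ranking. The sketch below is one possible arrangement, not a prescribed recipe: the log contents are invented, the file is created in a temporary directory, and the final sort -rn step is an addition that orders the counted lines from most to least frequent.

```shell
# Build a small hypothetical log file in a throwaway directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/log.txt" <<'EOF'
Error: disk full
Info: started
Error: timeout
Error: disk full
Info: stopped
Error: disk full
EOF

# Distinct error lines, as in the grep example above.
distinct=$(grep "Error" "$tmpdir/log.txt" | sort | uniq)
printf '%s\n' "$distinct"
# Error: disk full
# Error: timeout

# Count each distinct error line, then rank by count (highest first).
# "Error: disk full" occurs 3 times and "Error: timeout" once.
ranked=$(grep "Error" "$tmpdir/log.txt" | sort | uniq -c | sort -rn)
printf '%s\n' "$ranked"

rm -rf "$tmpdir"
```

The first sort groups identical lines so uniq -c can count them; the second sorts numerically on the count that uniq -c placed at the start of each line.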
Related Commands
- sort: Often used in conjunction with uniq to sort data before uniqueness operations.
- awk and sed: Useful for more advanced text manipulation tasks.
See the official GNU documentation for uniq for more detailed information and advanced usage scenarios.