sort - Linux
Overview
The sort command in Linux sorts lines of text from files or standard input. It can order lines alphabetically, numerically, or by month name. It is useful for organizing data, producing readable output, checking that a dataset is in the expected order, and preparing data for further processing.
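As a small illustration of the three orderings mentioned above, the sketch below sorts made-up input supplied inline with printf. It assumes the -M (month sort) flag is available, as it is in GNU sort, and sets LC_ALL=C only so that the results do not depend on the local locale.

```shell
# Alphabetical (byte order in the C locale).
alpha=$(printf 'pear\napple\nmango\n' | LC_ALL=C sort)

# Numeric: 9 sorts before 10 and 100, unlike a plain text comparison.
numeric=$(printf '100\n9\n10\n' | LC_ALL=C sort -n)

# Month names: Jan < Feb < Mar, rather than alphabetical Feb < Jan < Mar.
months=$(printf 'Mar\nJan\nFeb\n' | LC_ALL=C sort -M)

printf '%s\n' "$alpha" "$numeric" "$months"
```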
Syntax
The basic syntax of the sort command is as follows:
sort [OPTION]... [FILE]...
If no file is specified, or if the file is "-", sort reads from standard input.
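A minimal sketch of the two standard-input forms described above. The input words are made up for illustration, and LC_ALL=C is set only to keep the ordering independent of the local locale.

```shell
# No FILE operand: sort reads the lines arriving on standard input.
from_pipe=$(printf 'delta\nalpha\ncharlie\n' | LC_ALL=C sort)

# "-" names standard input explicitly; the result is the same.
from_dash=$(printf 'delta\nalpha\ncharlie\n' | LC_ALL=C sort -)

printf '%s\n' "$from_pipe"
```

The "-" form is mainly useful when standard input has to be combined with named files in a single invocation.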
Options/Flags
-b, --ignore-leading-blanks: Ignore leading blanks.
-d, --dictionary-order: Consider only blanks and alphanumeric characters.
-f, --ignore-case: Fold lowercase to uppercase characters for sorting.
-n, --numeric-sort: Compare according to string numerical value.
-r, --reverse: Reverse the result of comparisons.
-k, --key=KEYDEF: Sort via a key; KEYDEF gives location and type.
-m, --merge: Merge already sorted files; do not sort.
-o, --output=FILE: Write result to FILE instead of standard output.
-t, --field-separator=SEP: Use SEP instead of non-blank to blank transition.
-u, --unique: Suppress all but one of successive identical lines.
-c, --check, --check=diagnose-first: Check for sorted input; do not sort.
--help: Display a help message and exit.
--version: Output version information and exit.
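The -t, -k, -n, and -r options are often combined. The sketch below shows how they interact on a hypothetical colon-separated name:score file created in a temporary location; the data and variable names are illustrative only.

```shell
# Hypothetical sample data, one name:score record per line.
data=$(mktemp)
printf 'alice:30\nbob:4\ncarol:25\n' > "$data"

# Field 2 compared as text: "25" < "30" < "4".
as_text=$(LC_ALL=C sort -t: -k2,2 "$data")

# The n modifier compares field 2 by numeric value: 4 < 25 < 30.
as_number=$(LC_ALL=C sort -t: -k2,2n "$data")

# Adding r reverses that key's comparison: 30, 25, 4.
descending=$(LC_ALL=C sort -t: -k2,2nr "$data")

rm -f "$data"
printf '%s\n' "$descending"
```

Writing the key as -k2,2 rather than -k2 limits the comparison to field 2 alone; a bare -k2 would extend the key to the end of the line.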
Examples
- Simple Sort:
sort file.txt
- Numeric Sort:
sort -n file.txt
- Reverse Order Sort:
sort -r file.txt
- Sort and Save Output:
sort file.txt -o sorted_file.txt
- Sort on a Specific Key (field):
sort -k2,2 file.txt
- Dictionary Order and Unique Lines:
sort -d -u file.txt
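The -c and -m options from the list of flags have no example above, so here is a hedged sketch of both. The two temporary files and their contents are invented for illustration.

```shell
a=$(mktemp)
b=$(mktemp)
printf '1\n3\n5\n' > "$a"
printf '2\n4\n6\n' > "$b"

# -c sorts nothing; it only reports through the exit status (0 = sorted).
if sort -n -c "$a"; then a_sorted=yes; else a_sorted=no; fi

# Unsorted input makes -c exit non-zero; its diagnostic goes to stderr.
if printf '2\n1\n' | sort -n -c 2>/dev/null; then pipe_sorted=yes; else pipe_sorted=no; fi

# -m interleaves inputs that are already sorted, without re-sorting them.
merged=$(sort -n -m "$a" "$b" | paste -sd ' ' -)

rm -f "$a" "$b"
printf '%s %s %s\n' "$a_sorted" "$pipe_sorted" "$merged"
```

Because -c communicates through its exit status, it fits naturally into if statements and scripts that validate data before processing it.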
Common Issues
- Locale-specific sorting issues: Sort order depends on the current locale. Use LC_ALL=C sort file.txt for consistent, byte-wise results.
- Memory limits on large files: Consider splitting the file, sorting the pieces separately, and merging them with -m. The --batch-size option limits how many inputs are merged at once.
- Performance issues with large datasets: Use -S or --buffer-size to set the size of the main memory buffer.
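The locale and buffer-size points can be sketched as follows. The 1M buffer is an arbitrary illustrative value rather than a recommendation, and seq is assumed to be available to generate stand-in data.

```shell
# In the C locale, comparison is by byte value, so uppercase ASCII
# letters sort before lowercase ones regardless of the user's settings.
c_order=$(printf 'b\nA\na\nB\n' | LC_ALL=C sort | paste -sd ' ' -)

# -S caps the main memory buffer; seq produces 1000 down to 1 as input.
smallest=$(seq 1000 -1 1 | LC_ALL=C sort -n -S 1M | sed -n '1p')

printf '%s\n%s\n' "$c_order" "$smallest"
```

In other locales the first command may interleave upper and lower case instead, which is why pinning LC_ALL=C matters when output has to be reproducible across machines.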
Integration
The sort command can be integrated into pipelines for complex data processing:
cat file.txt | sort | uniq -c
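Here, the output of sort is piped to uniq -c, which counts how many times each distinct line occurs. uniq only collapses adjacent duplicates, so the input must be sorted first. sort is also often used before awk or sed for further processing.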
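A common extension of this pipeline sorts a second time to rank lines by frequency. The sketch below uses a hypothetical word list and passes the file to sort directly, which makes cat unnecessary; the final awk step only normalizes the padded output of uniq -c.

```shell
# Hypothetical input with repeated lines.
words=$(mktemp)
printf 'apple\npear\napple\nfig\napple\npear\n' > "$words"

# sort groups identical lines, uniq -c counts each group,
# and the second sort ranks the groups by count, highest first.
ranked=$(LC_ALL=C sort "$words" | uniq -c | LC_ALL=C sort -k1,1nr | awk '{print $2 "=" $1}')

rm -f "$words"
printf '%s\n' "$ranked"
```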
Related Commands
- uniq: Often paired with sort for removing duplicates.
- awk: For data extraction and reporting after sorting.
- sed: For stream editing after sorting data.
For more details, refer to the official documentation or type man sort in your terminal.