cut - Linux
Overview
The cut command in Linux extracts ("cuts out") selected sections of each line from a file or from input provided through a pipe, and prints them to standard output. It selects text either by field, using a delimiter character (TAB by default), or by byte or character position. This makes it particularly useful for pulling columns of data out of text files or command output in scripts and data-processing pipelines.
Syntax
The basic syntax of the cut command is:
cut OPTION... [FILE]...
Where OPTION specifies which parts of each line to output and FILE names one or more files to process. If no file is specified, or if the file name is -, standard input is used.
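The three ways of supplying input described above can be sketched as follows. The temporary file and its contents are made up for illustration; only the cut invocations matter.

```shell
# Create a small sample file (hypothetical contents, for illustration only).
tmp=$(mktemp)
printf 'alice:x:1000\nbob:x:1001\n' > "$tmp"

# 1. Read from a named file.
from_file=$(cut -d':' -f1 "$tmp")

# 2. Read from standard input through a pipe (no FILE argument).
from_pipe=$(cat "$tmp" | cut -d':' -f1)

# 3. Read from standard input explicitly, using "-" as the file name.
from_dash=$(cut -d':' -f1 - < "$tmp")

# All three variables hold the same two lines: alice, bob.
printf '%s\n' "$from_file"

rm -f "$tmp"
```

All three forms produce the same result, so the choice usually depends on whether cut sits at the start of a pipeline or in the middle of one.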
Options/Flags
Here are some commonly used options of the cut command:
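- -b, --bytes=LIST: Select only the bytes at the listed positions. For example, -b 1-5 extracts the first five bytes from each line.
- -c, --characters=LIST: Select only the characters at the listed positions. This is meant to count a multibyte character as one position, but support depends on the implementation; the GNU manual notes that -c currently behaves the same as -b.
- -d, --delimiter=DELIM: Use the DELIM character instead of TAB as the field delimiter.
- -f, --fields=LIST: Select only the listed fields, using the delimiter to split each line. Lines that contain no delimiter are printed in full unless -s is given.
- --complement: Invert the selection made by -b, -c, or -f, so that everything except the listed positions is printed.
- -s, --only-delimited: Do not print lines that contain no delimiter.
- --output-delimiter=STRING: Use STRING to separate the selected parts in the output. The default is the input delimiter.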
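As a rough illustration of the options listed above, the sketch below runs each one against a made-up comma-separated line. It assumes GNU cut, since --complement and --output-delimiter are GNU extensions and may be missing elsewhere.

```shell
# Made-up sample line, for illustration only.
line='one,two,three,four'

# -d and -f: select fields 1 and 3 of a comma-separated line.
fields=$(printf '%s\n' "$line" | cut -d',' -f1,3)             # one,three

# -b: select the first five bytes.
bytes=$(printf '%s\n' "$line" | cut -b 1-5)                   # one,t

# --complement (GNU): print everything except field 2.
rest=$(printf '%s\n' "$line" | cut -d',' --complement -f2)    # one,three,four

# --output-delimiter (GNU): join the selected fields with a custom string.
joined=$(printf '%s\n' "$line" | cut -d',' -f1,3 --output-delimiter=' | ')   # one | three

# -s: drop lines that contain no delimiter at all.
only_delimited=$(printf 'a,b\nno delimiter here\n' | cut -d',' -s -f1)       # a

printf '%s\n' "$fields" "$bytes" "$rest" "$joined" "$only_delimited"
```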
Examples
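- Extract the first column from a comma-separated file:
  cut -d',' -f1 data.csv
- Extract several fields (here the first, third, and sixth) from a colon-separated file:
  cut -d':' -f1,3,6 /etc/passwd
- Extract the characters at positions 3 through 5 of each line:
  cut -c3-5 details.txt
- Print every field except the second, skipping lines that contain no space delimiter:
  cut -d' ' --complement -s -f2 inventory.txt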
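The examples above use closed ranges and comma-separated lists. A LIST can also use open-ended ranges, sketched here on made-up input. Note that cut always prints the selected parts in input order, whatever order the LIST names them in.

```shell
# Made-up sample row, for illustration only.
row='a:b:c:d:e'

# N- : from field N to the end of the line.
from_third=$(printf '%s\n' "$row" | cut -d':' -f3-)      # c:d:e

# -M : from the first field up to field M.
up_to_second=$(printf '%s\n' "$row" | cut -d':' -f-2)    # a:b

# Single positions and ranges can be mixed in one LIST.
mixed=$(printf '%s\n' "$row" | cut -d':' -f1,3-4)        # a:c:d

# The order of the LIST does not reorder the output.
reordered=$(printf '%s\n' "$row" | cut -d':' -f3,1)      # a:c

# The same range syntax applies to -c and -b.
chars=$(printf 'abcdefgh\n' | cut -c3-5)                 # cde

printf '%s\n' "$from_third" "$up_to_second" "$mixed" "$reordered" "$chars"
```

When the output columns need to be reordered, a tool such as awk is usually a better fit.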
Common Issues
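- Lines without the delimiter: When a line does not contain the specified delimiter, cut -f prints the whole line unchanged rather than reporting an error, which can look as if no fields were selected. Check that -d matches the delimiter the input actually uses, and add -s to suppress lines that contain no delimiter.
- Multibyte character handling: -b counts bytes, so it can split a multibyte character (for example, a UTF-8 accented letter) in the middle. -c is intended to count whole characters, but this depends on the implementation and locale; some versions of cut treat -c exactly like -b, so test the behavior on your system before relying on it.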
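Both issues can be reproduced with a few lines of made-up input. The sketch below writes the two-byte UTF-8 encoding of "é" as octal escapes so it does not depend on the script's own encoding. It deliberately makes no claim about what -c returns, since that varies between implementations.

```shell
# Issue 1: a line without the delimiter is passed through in full.
passed_through=$(printf 'a,b\nno delimiter here\n' | cut -d',' -f1)
# passed_through now holds two lines: "a" and "no delimiter here".

# Adding -s suppresses the line that has no delimiter.
suppressed=$(printf 'a,b\nno delimiter here\n' | cut -d',' -s -f1)
# suppressed now holds only "a".

# Issue 2: -b counts bytes, so -b1 returns only half of a two-byte character.
# '\303\251' is "é" in UTF-8; wc -c counts the output bytes, newline included.
one_byte=$(printf '\303\251\n' | cut -b1 | wc -c | tr -d ' ')     # 2 (1 byte + newline)
two_bytes=$(printf '\303\251\n' | cut -b1-2 | wc -c | tr -d ' ')  # 3 (2 bytes + newline)

printf '%s\n' "$passed_through" "$suppressed" "$one_byte" "$two_bytes"
```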
Integration
cut can be combined with other commands to manipulate and analyze text data effectively:
ps aux | cut -d' ' -f1 | sort | uniq -c
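This pipeline counts the processes each user is running by cutting the first field (the user name) from ps output, sorting it, and counting the repeated entries. The ps header line (USER) is counted as well. Because cut treats every single space as a delimiter and ps pads its columns with runs of spaces, only the first field can be extracted reliably this way.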
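To reach later columns in space-padded output, one common approach is to squeeze runs of spaces with tr -s before calling cut. The sketch below uses made-up ps-like rows rather than real ps output, so that the result is predictable.

```shell
# Made-up rows resembling ps output, with runs of spaces between columns.
sample='root         1  0.0
alice      742  1.3'

# Without squeezing, field 2 is the empty string between the first two spaces.
raw=$(printf '%s\n' "$sample" | cut -d' ' -f2)

# tr -s ' ' collapses each run of spaces into one, so fields line up with columns.
squeezed=$(printf '%s\n' "$sample" | tr -s ' ' | cut -d' ' -f1,2)

printf '%s\n' "$squeezed"
```

With the sample rows, squeezed holds "root 1" and "alice 742", while raw is empty. For input with irregular whitespace, awk, which splits on runs of whitespace by default, is often simpler.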
Related Commands
- awk: Offers more complex text-manipulation capabilities.
- sed: A stream editor for transforming text line by line.
- grep: Searches for text patterns in a file or in command output.
Additional information can be found in the official GNU documentation: GNU Coreutils – Cut