Unix Commands for Large Files
This tutorial shows how to handle large files with Linux/Unix commands: viewing the first few lines, counting lines or bytes, and splitting a file into smaller pieces.
wc — word, line, character, and byte count
The wc utility displays the number of lines, words, and bytes in the file. It has the following options.
- -c: The number of bytes.
- -l: The number of lines.
- -w: The number of words.
The following are some examples:
- wc -c myfile.txt: counts the number of bytes.
- wc -l myfile.csv: counts the number of lines.
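As a minimal sketch of the options above, the following builds a tiny sample file and counts its lines, words, and bytes. The file name sample.txt and the variable names are illustrative assumptions, not part of the original examples.

```shell
# Create a small sample file in a temporary directory (sample.txt is a hypothetical name).
tmpdir=$(mktemp -d)
printf 'alpha beta\ngamma\ndelta epsilon zeta\n' > "$tmpdir/sample.txt"

# Reading from standard input makes wc print only the number, without the file name.
# tr removes the padding spaces that some wc implementations add.
line_count=$(wc -l < "$tmpdir/sample.txt" | tr -d ' ')
word_count=$(wc -w < "$tmpdir/sample.txt" | tr -d ' ')
byte_count=$(wc -c < "$tmpdir/sample.txt" | tr -d ' ')

echo "lines=$line_count words=$word_count bytes=$byte_count"
# prints: lines=3 words=6 bytes=36
```

Redirecting the file into wc (rather than passing it as an argument) is a convenient choice when the count is stored in a variable, because the output then contains no file name to strip.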
head — display first lines of a file
The head utility displays the first few lines or bytes of a file. A similar command, tail, displays the last lines. head has the following options.
- -n count: display the first count lines.
- -c count: display the first count bytes.
The following are some examples:
- head -n 10 myfile.csv: displays the first 10 lines.
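The sketch below tries head and tail on a generated 100-line file. The file name numbers.txt is an assumption made for illustration, and piping head into tail to pull out a middle range is one common idiom rather than something the options above require.

```shell
# Generate a file with the numbers 1 to 100, one per line (numbers.txt is a hypothetical name).
tmpdir=$(mktemp -d)
awk 'BEGIN { for (i = 1; i <= 100; i++) print i }' > "$tmpdir/numbers.txt"

# First three lines of the file.
first_three=$(head -n 3 "$tmpdir/numbers.txt")

# Last two lines of the file.
last_two=$(tail -n 2 "$tmpdir/numbers.txt")

# First four bytes: "1", newline, "2", newline.
first_bytes=$(head -c 4 "$tmpdir/numbers.txt")

# Lines 11 to 15: keep the first 15 lines, then the last 5 of those.
middle=$(head -n 15 "$tmpdir/numbers.txt" | tail -n 5)

echo "$first_three"
echo "$last_two"
echo "$middle"
```

Because head stops reading once it has printed the requested lines, peeking at the top of a very large file this way does not require reading the whole file.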
split — split a file into pieces
The split utility breaks the given file up into smaller files, by default 1000 lines each. It has the following options.
- -l count: split into files of count lines each.
- -b count[k|m]: split into files of count bytes each; with the k or m suffix, count is in kilobytes or megabytes.
By default, the file is split into lexically ordered files named with the prefix “x”. The following are some examples:
- split -l 10000 myfile.csv: splits the file into pieces of 10000 lines each, with names ‘xaa’, ‘xab’, etc.
- split -l 100 myfile.csv newfile: splits the file into pieces of 100 lines each, with names ‘newfileaa’, ‘newfileab’, etc.
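To see the naming scheme and the piece sizes in practice, here is a small sketch that splits a generated 250-line file by lines and by bytes, then reassembles it. The names data.csv, part_, chunk_, and rejoined.csv are illustrative assumptions.

```shell
# Work in a temporary directory so the pieces do not clutter the current one.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Generate a 250-line sample file (data.csv is a hypothetical name).
awk 'BEGIN { for (i = 1; i <= 250; i++) print "row" i }' > data.csv

# Split into pieces of 100 lines each, using the prefix "part_".
split -l 100 data.csv part_
ls part_*        # expected pieces: part_aa, part_ab, part_ac
wc -l part_*     # the last piece holds the remaining 50 lines

# Split by size instead: pieces of at most 1 kilobyte, using the prefix "chunk_".
split -b 1k data.csv chunk_

# The suffixes sort lexically, so the shell glob lists the pieces in order
# and cat can rebuild the original file.
cat part_* > rejoined.csv
cmp data.csv rejoined.csv && echo "pieces reassemble to the original"
```

Splitting with -l keeps every line intact, which usually suits CSV data; splitting with -b can cut a line in the middle, so it fits cases where only the piece size matters.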