Unix Commands for Large Files

This tutorial shows how to handle large files with Linux/Unix commands: counting lines or bytes, viewing the first few lines, and splitting a file into smaller pieces.

wc — word, line, character, and byte count

The wc utility displays the number of lines, words, and bytes in a file. With no options it prints all three; the following options select individual counts.

  • -c: The number of bytes.
  • -l: The number of lines.
  • -m: The number of characters (can differ from the byte count in multibyte encodings).
  • -w: The number of words.

The following are some examples:

  • wc -c myfile.txt: counts the number of bytes.
  • wc -l myfile.csv: counts the number of lines.
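
The counts above can be captured in shell variables, which is handy in scripts. The following is a minimal sketch, assuming a small throwaway file (the name sample.csv and the temporary directory are made up for illustration). Reading the file through standard input makes wc print only the number, without the file name, and tr strips the padding that some wc implementations add.

```shell
# Create a small sample file (hypothetical name) so the commands can be run as-is.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.csv"
printf 'id,name\n1,alice\n2,bob\n' > "$sample"

# Feeding the file on stdin suppresses the file name in wc's output;
# tr removes any leading spaces some implementations print.
line_count=$(wc -l < "$sample" | tr -d '[:space:]')
byte_count=$(wc -c < "$sample" | tr -d '[:space:]')
word_count=$(wc -w < "$sample" | tr -d '[:space:]')

echo "lines=$line_count bytes=$byte_count words=$word_count"
```

Note that wc -l counts newline characters, so a file whose last line lacks a trailing newline reports one line fewer than a text editor would show.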

head — display first lines of a file

The head utility displays the first few lines or bytes of a file; with no options it shows the first 10 lines. The similar command tail displays the last lines. head has the following options.

  • -n count: display the first count lines.
  • -c count: display the first count bytes.

The following are some examples:

  • head -n 10 myfile.csv: displays the first 10 lines.
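
As a rough illustration, head and tail can also be combined in a pipeline to look at a range of lines in the middle of a file without opening it in an editor. The sketch below assumes a generated 100-line file (sample.csv is a hypothetical name) and that seq is available, as it is on most Linux and macOS systems.

```shell
# Generate a 100-line sample file (hypothetical name); each line holds its own number.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.csv"
seq 1 100 > "$sample"

# First three lines and last two lines.
first_three=$(head -n 3 "$sample")
last_two=$(tail -n 2 "$sample")

# Lines 11-20: take the first 20 lines, then keep the last 10 of those.
middle=$(head -n 20 "$sample" | tail -n 10)

printf '%s\n' "$first_three"
printf '%s\n' "$last_two"
printf '%s\n' "$middle"
```

Because head stops reading once it has printed the requested lines, this stays fast even when the file is very large.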

split — split a file into pieces

By default, the split utility breaks the given file up into files of 1000 lines each. The following options change the size of the pieces.

  • -l count: split into files of count lines each.
  • -b count[k|m]: split into files of count bytes each; a k or m suffix means kilobytes or megabytes.

By default, the file is split into lexically ordered files named with the prefix “x”. The following are some examples:

  • split -l 10000 myfile.csv: splits the file into pieces of 10000 lines each, named ‘xaa’, ‘xab’, etc.
  • split -l 100 myfile.csv newfile: splits the file into pieces of 100 lines each, named ‘newfileaa’, ‘newfileab’, etc.
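
A quick way to check a split is to count the pieces and then concatenate them back together and compare the result with the original. The sketch below assumes a generated 250-line file and a made-up prefix part_ placed in a temporary directory; because the pieces are named in lexical order, a shell glob lists them in the right sequence for cat.

```shell
# Generate a 250-line sample file (hypothetical name) in a scratch directory.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.csv"
seq 1 250 > "$sample"

# Split into pieces of 100 lines each: part_aa, part_ab, part_ac.
split -l 100 "$sample" "$tmpdir/part_"

# Count the pieces and the lines in the last (shorter) piece.
piece_count=$(ls "$tmpdir"/part_* | wc -l | tr -d '[:space:]')
last_piece_lines=$(wc -l < "$tmpdir/part_ac" | tr -d '[:space:]')

# Reassemble the pieces in lexical order and compare with the original.
if cat "$tmpdir"/part_* | cmp - "$sample"; then
    roundtrip=ok
else
    roundtrip=mismatch
fi

echo "pieces=$piece_count last_piece_lines=$last_piece_lines roundtrip=$roundtrip"
```

When splitting CSV files by line, keep in mind that only the first piece contains the header row.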
