Key Takeaways
- The Linux grep command is a useful tool for string and pattern matching, allowing you to search through text files using various options.
- With grep, you can perform simple searches, recursive searches, search for whole words, use multiple search terms, count matches, add context, and even pipe the output to other commands for further manipulation.
The Linux grep command is a string and pattern matching utility that displays matching lines from one or more files. It also works with piped output from other commands. We show you how.
The Grep Command in Linux
The grep command is famous in Linux and Unix circles for three reasons. Firstly, it is tremendously useful. Secondly, the wealth of options can be overwhelming. Thirdly, it was written overnight to satisfy a particular need. The first two are bang on; the third is slightly off.
Ken Thompson had extracted the regular expression search capabilities from the ed editor (pronounced ee-dee) and created a little program, for his own use, to search through text files. His department head at Bell Labs, Doug McIlroy, approached Thompson and described the problem one of his colleagues, Lee McMahon, was facing.
McMahon was trying to identify the authors of the Federalist Papers through textual analysis. He needed a tool that could search for phrases and strings within text files. Thompson spent about an hour that evening making his tool a general utility that could be used by others and renamed it grep. He took the name from the ed command string g/re/p, which translates as “global regular expression print.”
You can watch Thompson talking to Brian Kernighan about the birth of grep.
Simple Searches with the grep Command
To search for a string within a file, pass the search term and the file name on the command line:
Matching lines are displayed. In this case, it is a single line. The matching text is highlighted. This is because on most distributions grep is aliased to:
alias grep='grep --colour=auto'
Let’s look at results where there are multiple lines that match. We’ll look for the word “Average” in an application log file. Because we can’t recall if the word is in lowercase in the log file, we’ll use the -i (ignore case) option:
grep -i Average geek-1.log
Every matching line is displayed, with the matching text highlighted in each one.
We can display the non-matching lines by using the -v (invert match) option.
grep -v Mem geek-1.log
There is no highlighting because these are the non-matching lines.
We can cause grep to be completely silent with the -q (quiet) option. The result is passed to the shell as a return value from grep. A result of zero means the string was found, and a result of one means it was not found. We can check the return code using the $? special parameter:
grep -q average geek-1.log
echo $?
grep -q howtogeek geek-1.log
echo $?
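In scripts, the exit status is usually tested directly rather than read from $?. The following is a minimal sketch under stated assumptions: the sample log contents and the has_term helper are invented for illustration and are not part of the article's geek-1.log.

```shell
# Hypothetical sample log; file name and contents are made up for this sketch.
tmpdir=$(mktemp -d)
printf '%s\n' 'MemTotal: 2035260 kB' 'Load average: 0.42' > "$tmpdir/sample.log"

# has_term FILE TERM: succeed silently if TERM appears anywhere in FILE.
has_term() {
    grep -q -- "$2" "$1"
}

# The if statement branches on the exit status of grep -q (0 = found).
if has_term "$tmpdir/sample.log" "average"; then
    found_average=yes
else
    found_average=no
fi

if has_term "$tmpdir/sample.log" "howtogeek"; then
    found_howtogeek=yes
else
    found_howtogeek=no
fi

echo "average: $found_average, howtogeek: $found_howtogeek"
rm -r "$tmpdir"
```

grep also returns a status of two when an error occurs, such as an unreadable file, so a script that needs to tell "not found" from "could not search" should check for that value separately.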
Recursive Searches Using grep
To search through nested directories and subdirectories, use the -r (recursive) option. Note that you provide a path on the command line rather than a file name. Here we’re searching in the current directory “.” and any subdirectories:
grep -r -i memfree .
The output includes the directory and filename of each matching line.
We can make grep follow symbolic links by using the -R (recursive dereference) option. We’ve got a symbolic link in this directory, called logs-folder. It points to /home/dave/logs.
ls -l logs-folder
Let’s repeat our last search with the -R (recursive dereference) option:
grep -R -i memfree .
The symbolic link is followed and the directory it points to is searched by grep too.
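The difference between the two options can be reproduced in a throwaway directory. This sketch assumes GNU grep, where -r skips symbolic links met during recursion; other grep implementations may behave differently. The directory layout and file contents are hypothetical.

```shell
# Build a small tree: project/ holds one log plus a symlink to a sibling logs/ directory.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/project" "$tmpdir/logs"
printf 'MemFree: 100 kB\n' > "$tmpdir/project/a.log"
printf 'MemFree: 200 kB\n' > "$tmpdir/logs/b.log"
ln -s "$tmpdir/logs" "$tmpdir/project/logs-folder"

# -r: the symlinked directory is not followed (GNU grep behaviour).
plain_count=$(cd "$tmpdir/project" && grep -r -i memfree . | wc -l | tr -d ' ')

# -R: the symlink is followed, so the file behind it is searched too.
deref_count=$(cd "$tmpdir/project" && grep -R -i memfree . | wc -l | tr -d ' ')

echo "-r found $plain_count line(s), -R found $deref_count line(s)"
rm -r "$tmpdir"
```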
Searching for Whole Words with the grep Command
By default, grep will match a line if the search target appears anywhere in that line, including inside another string. Look at this example. We’re going to search for the word “free.”
grep -i free geek-1.log
The results are lines that have the string “free” in them, but they’re not separate words. They’re part of the string “MemFree.”
To force grep to match separate “words” only, use the -w (word regexp) option.
grep -w -i free geek-1.log
echo $?
This time there are no results because the search term “free” does not appear in the file as a separate word.
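To see a case where -w does find something, here is a small sketch using invented sample lines (not the article's log file). Only the line in which “free” stands alone as a word survives the -w filter.

```shell
# Hypothetical sample data: "free" appears inside words and once as a separate word.
tmpdir=$(mktemp -d)
printf '%s\n' 'MemFree: 1024 kB' 'free space: 20 GB' 'Freedom' > "$tmpdir/sample.log"

# Without -w, every line containing the substring matches.
substring_matches=$(grep -i free "$tmpdir/sample.log")

# With -w, the match must be bounded by non-word characters or line edges.
word_matches=$(grep -w -i free "$tmpdir/sample.log")

printf 'substring matches:\n%s\n\nword matches:\n%s\n' "$substring_matches" "$word_matches"
rm -r "$tmpdir"
```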
Using Multiple Search Terms
The -E (extended regexp) option allows you to search for multiple words. (The -E option replaces the deprecated egrep version of grep.)
This command searches for two search terms, “average” and “memfree.”
grep -E -w -i "average|memfree" geek-1.log
All of the matching lines are displayed for each of the search terms.
The search terms don’t have to be whole words, although they can be.
The -e (patterns) option allows you to use multiple search terms on the command line. We’re making use of the regular expression bracket feature to create a search pattern. It tells grep to match any one of the characters contained within the brackets “[].” This means grep will match either “kB” or “KB” as it searches.
Both strings are matched, and, in fact, some lines contain both strings.
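As a rough sketch of the -e option and a bracket expression, the following uses a made-up three-line log. The file contents and the particular patterns are assumptions chosen for illustration.

```shell
# Hypothetical sample log with both spellings of the unit.
tmpdir=$(mktemp -d)
printf '%s\n' 'MemTotal: 2035260 kB' 'SwapFree: 1048572 KB' 'Load average: 0.42' > "$tmpdir/sample.log"

# One pattern, with a bracket expression that accepts "k" or "K" before "B".
unit_lines=$(grep -e '[kK]B' "$tmpdir/sample.log")

# Two separate patterns; a line is shown if it matches either one.
multi_lines=$(grep -i -e 'average' -e 'swapfree' "$tmpdir/sample.log")

printf 'unit lines:\n%s\n\nmulti-pattern lines:\n%s\n' "$unit_lines" "$multi_lines"
rm -r "$tmpdir"
```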
Matching Lines Exactly
The -x (line regexp) option will only match lines where the entire line matches the search term. Let’s search for a date and time stamp that we know appears only once in the log file:
grep -x "20-Jan-06 15:24:35" geek-1.log
The single line that matches is found and displayed.
The opposite of that is only showing the lines that don’t match. This can be useful when you’re looking at configuration files. Comments are great, but sometimes it’s hard to spot the actual settings in amongst them all. Here’s the /etc/sudoers file:
We can effectively filter out the comment lines like this:
sudo grep -v "#" /etc/sudoers
That’s much easier to parse.
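Note that -v "#" discards any line containing a hash anywhere, including settings with a trailing comment. A slightly stricter variant, sketched here on an invented configuration file, anchors the pattern so that only comment lines and blank lines are removed. The file contents are hypothetical.

```shell
# Hypothetical config file: comments, a blank line, and two real settings.
tmpdir=$(mktemp -d)
printf '%s\n' '# This is a comment' '' 'Defaults env_reset' '   # indented comment' 'root ALL=(ALL:ALL) ALL' > "$tmpdir/sample.conf"

# Drop lines that are empty or whose first non-blank character is "#".
settings=$(grep -v -E '^[[:space:]]*(#|$)' "$tmpdir/sample.conf")

printf '%s\n' "$settings"
rm -r "$tmpdir"
```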
Only Displaying Matching Text
There may be an occasion when you don’t want to see the entire matching line, just the matching text. The -o (only matching) option does just that.
grep -o MemFree geek-1.log
The display is reduced to showing only the text that matches the search term, instead of the entire matching line.
Counting With grep
grep isn’t just about text; it can provide numerical information too. We can make grep count for us in different ways. If we want to know how many lines in a file contain a search term, we can use the -c (count) option.
grep -c average geek-1.log
grep reports that 240 lines in this file contain the search term.
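Because -c counts matching lines rather than individual matches, a line containing the term twice is counted once. One way to count every occurrence is to combine -o with wc -l, as in this sketch on invented sample data.

```shell
# Hypothetical sample: the first line contains the term twice.
tmpdir=$(mktemp -d)
printf '%s\n' 'average average' 'no match here' 'load average: 0.42' > "$tmpdir/sample.log"

# -c counts lines that contain at least one match.
matching_lines=$(grep -c average "$tmpdir/sample.log")

# -o prints each match on its own line, so wc -l counts occurrences.
occurrences=$(grep -o average "$tmpdir/sample.log" | wc -l | tr -d ' ')

echo "matching lines: $matching_lines, occurrences: $occurrences"
rm -r "$tmpdir"
```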
You can make grep display the line number for each matching line by using the -n (line number) option.
grep -n Jan geek-1.log
The line number for each matching line is displayed at the start of the line.
To reduce the number of results that are displayed, use the -m (max count) option. We’re going to limit the output to five matching lines:
grep -m5 -n Jan geek-1.log
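The combined effect of -n and -m is easy to check on a tiny file. This is a sketch with made-up log lines; the limit of two is chosen only to keep the output short.

```shell
# Hypothetical sample: three lines contain "Jan", one does not.
tmpdir=$(mktemp -d)
printf '%s\n' '20-Jan-06 start' 'unrelated line' '21-Jan-06 middle' '22-Jan-06 end' > "$tmpdir/sample.log"

# Stop after two matching lines and prefix each with its line number.
first_two=$(grep -m 2 -n Jan "$tmpdir/sample.log")

printf '%s\n' "$first_two"
rm -r "$tmpdir"
```

The line numbers refer to positions in the file, not to the order of the matches, so the second result is reported as line 3.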
Adding Context with grep
Being able to see some additional lines, possibly non-matching lines, around each matching line is often useful. It can help distinguish which of the matched lines are the ones you are interested in.
To show some lines after the matching line, use the -A (after context) option. We’re asking for three lines in this example:
grep -A 3 -x "20-Jan-06 15:24:35" geek-1.log
To see some lines from before the matching line, use the -B (before context) option.
grep -B 3 -x "20-Jan-06 15:24:35" geek-1.log
And to include lines from before and after the matching line use the -C (context) option.
grep -C 3 -x "20-Jan-06 15:24:35" geek-1.log
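The three context options can be compared side by side on a short file. The sample lines below are invented, and one line of context is used instead of three to keep the output small.

```shell
# Hypothetical five-line file with a single line to match exactly.
tmpdir=$(mktemp -d)
printf '%s\n' 'line 1' 'line 2' 'TARGET' 'line 4' 'line 5' > "$tmpdir/sample.log"

after=$(grep -A 1 -x TARGET "$tmpdir/sample.log")    # match plus one line after
before=$(grep -B 1 -x TARGET "$tmpdir/sample.log")   # one line before plus match
around=$(grep -C 1 -x TARGET "$tmpdir/sample.log")   # one line on each side

printf -- '-A 1:\n%s\n\n-B 1:\n%s\n\n-C 1:\n%s\n' "$after" "$before" "$around"
rm -r "$tmpdir"
```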
Showing Matching Files
To see the names of the files that contain the search term, use the -l (files with match) option. To find out which C source code files contain references to the sl.h header file, use this command:
grep -l "sl.h" *.c
The file names are listed, not the matching lines.
And of course, we can look for files that don’t contain the search term. The -L (files without match) option does just that.
grep -L "sl.h" *.c
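In a regular expression the dot in “sl.h” matches any character, so the pattern would also match a string such as “slxh”. The sketch below adds -F (fixed strings) to treat the search term literally. The two source files are hypothetical stand-ins.

```shell
# Two hypothetical C files: one includes sl.h, the other does not.
tmpdir=$(mktemp -d)
printf '%s\n' '#include "sl.h"' > "$tmpdir/uses.c"
printf '%s\n' '#include <stdio.h>' > "$tmpdir/plain.c"

# -l lists files with at least one match; -F disables regex interpretation.
with_header=$(cd "$tmpdir" && grep -l -F 'sl.h' *.c)

# -L lists files with no match at all.
without_header=$(cd "$tmpdir" && grep -L -F 'sl.h' *.c)

echo "with: $with_header, without: $without_header"
rm -r "$tmpdir"
```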
Start and End of Lines
We can force grep to only display matches that are either at the start or the end of a line. The “^” regular expression operator matches the start of a line. Practically all of the lines within the log file will contain spaces, but we’re going to search for lines that have a space as their first character:
grep "^ " geek-1.log
The lines that have a space as the first character — at the start of the line — are displayed.
To match the end of the line, use the “$” regular expression operator. We’re going to search for lines that end with “00.”
grep "00$" geek-1.log
The display shows the lines that have “00” as their final characters.
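Both anchors can be tried on a few invented lines. In this sketch, one line contains “00” in the middle and is correctly ignored by the end-of-line pattern.

```shell
# Hypothetical sample lines for testing the ^ and $ anchors.
tmpdir=$(mktemp -d)
printf '%s\n' ' indented line' 'total 1500' 'value 00 mid' 'count 200' > "$tmpdir/sample.log"

# Lines whose first character is a space.
leading_space=$(grep '^ ' "$tmpdir/sample.log")

# Lines whose last two characters are "00".
trailing_zeros=$(grep '00$' "$tmpdir/sample.log")

printf 'starts with space:\n%s\n\nends with 00:\n%s\n' "$leading_space" "$trailing_zeros"
rm -r "$tmpdir"
```

Single quotes around the patterns stop the shell from trying to expand “$” before grep sees it.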
Using Pipes with grep
Of course, you can pipe input to grep, pipe the output from grep into another program, and have grep nestled in the middle of a pipe chain.
Let’s say we want to see all occurrences of the string “ExtractParameters” in our C source code files. We know there’s going to be quite a few, so we pipe the output into less:
grep "ExtractParameters" *.c | less
The output is presented in less.
This lets you page through the results and use the search facility in less.
If we pipe the output from grep into wc and use the -l (lines) option, we can count the number of lines in the source code files that contain “ExtractParameters”. (We could achieve this using the grep -c (count) option, but this is a neat way to demonstrate piping out of grep.)
grep "ExtractParameters" *.c | wc -l
With the next command, we’re piping the output from ls into grep and piping the output from grep into sort. We’re listing the files in the current directory, selecting those with the string “Aug” in them, and sorting them by file size:
ls -l | grep "Aug" | sort -k 5n
Let’s break that down:
- ls -l: Perform a long format listing of the files using ls.
- grep "Aug": Select the lines from the ls listing that have “Aug” in them. Note that this would also find files that have “Aug” in their names.
- sort -k 5n: Sort the output from grep numerically on the fifth column (file size). The obsolete zero-based form, sort +4n, selects the same column but is not portable.
We get a sorted listing of all the files modified in August (regardless of year), in ascending order of file size.
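The same filter-then-sort chain can be tried without depending on the contents of a real directory. This sketch feeds grep a made-up listing in ls -l format; the file names, owners, and sizes are all hypothetical.

```shell
# Hypothetical ls -l style listing; the size is the fifth column.
listing='-rw-r--r-- 1 dave dave 2048 Aug  3  2019 notes.txt
-rw-r--r-- 1 dave dave  512 Sep  9  2019 todo.txt
-rw-r--r-- 1 dave dave  128 Aug 21  2020 draft.txt'

# Keep the August entries, then sort numerically on column five only.
sorted=$(printf '%s\n' "$listing" | grep 'Aug' | sort -k 5,5n)

printf '%s\n' "$sorted"
```

Writing the key as 5,5n limits the comparison to the size column, whereas -k 5n alone would let the key run to the end of the line. For a numeric sort the result is usually the same, but the explicit form states the intent more clearly.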
grep in Linux: Less a Command, More of an Ally
grep is a terrific tool to have at your disposal. It dates from 1974 and is still going strong because we need what it does, and nothing does it better.
Coupling grep with some regular expressions-fu really takes it to the next level.