
HACK#14: Get the Most Out of grep - Administration

In this first of a two-part article, you will learn how to get the most out of certain BSD commands, as well as some useful ways to handle your filesystem. It is excerpted from chapter two of the book BSD Hacks, written by Dru Lavigne (O'Reilly, 2005; ISBN: 0596006799). Copyright 2005 O'Reilly Media, Inc. All rights reserved. Used with permission from the publisher. Available from booksellers or direct from O'Reilly Media.

TABLE OF CONTENTS:
  1. Dealing with Files and Filesystems
  2. HACK#14: Get the Most Out of grep
  3. HACK#15: Manipulate Files with sed
  4. HACK#16: Format Text at the Command Line
  5. HACK#17: Delimiter Dilemma
By: O'Reilly Media
December 28, 2006

You may not know where its odd name originated, but you can't argue the usefulness of grep.

Have you ever needed to find a particular file and thought, "I don't recall the filename, but I remember some of its contents"? The oddly named grep command does just that, searching inside files and reporting on those that contain a given piece of text.

Finding Text

Suppose you wish to search your shell scripts for the text $USER. Try this:

  % grep -s '$USER' *
  add-user:if [ "$USER" != "root" ]; then
  bu-user: echo " [-u user] - override $USER as the user to backup"
  bu-user:if [ "$user" = "" ]; then user="$USER"; fi
  del-user:if [ "$USER" != "root" ]; then
  mount-host:mounted=$(df | grep "$ALM_AFP_MOUNT/$USER")
  .....
  mount-user: echo " [-u user] - override $USER as the user to backup"
  mount-user:if [ "$user" = "" ]; then user="$USER"; fi

In this example, grep has searched through all files in the current directory, displaying each line that contained the text $USER. Use single quotes around the text to prevent the shell from interpreting special characters. The -s option suppresses error messages when grep encounters a directory.
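To see why the quoting matters, here is a minimal sketch that runs the same search with single and with double quotes. The scratch directory, the file names, and the login name saruman are stand-ins chosen for illustration, not part of the original example.

```shell
# Scratch directory with hypothetical files; nothing real is touched.
demo=$(mktemp -d)
mkdir "$demo/subdir"
printf 'if [ "$USER" != "root" ]; then\n' > "$demo/add-user"
printf 'echo "no variables here"\n' > "$demo/plain"

# Pin the variable to a hypothetical login name so the result is predictable.
USER=saruman

# Single quotes: grep receives the literal text $USER as its pattern.
# (|| true: grep may exit non-zero after skipping a directory or finding nothing.)
single=$(cd "$demo" && grep -s '$USER' *) || true

# Double quotes: the shell substitutes the value first, so grep looks for "saruman".
double=$(cd "$demo" && grep -s "$USER" *) || true

printf 'single quotes: %s\n' "$single"
printf 'double quotes: %s\n' "${double:-(no match)}"

rm -rf "$demo"
```

With single quotes the add-user line is reported; with double quotes nothing is found, because none of the sample files mention the login name.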

Perhaps you only want to know the name of each file containing the text $USER. Use the -l option to create that list for you:

  % grep -ls '$USER' *
  add-user
  bu-user
  del-user
  mount-host
  mount-user

Searching by Relevance

What if you're more concerned about how many times a particular string occurs within a file? That's known as a relevance search. Use a command similar to:

  % grep -sc '$USER' * | grep -v ':0' | sort -k 2 -t : -r
  mount-host:6
  mount-user:2
  bu-user:2
  del-user:1
  add-user:1

How does this magic work? The -c flag lists each file with a count of matching lines, but it unfortunately includes files with zero matches. To counter this, I piped the output from grep into a second grep, this time searching for ':0' and using a second option, -v, to reverse the sense of the search by displaying lines that don't match. The second grep reads from the pipe instead of a file, searching the output of the first grep.

For a little extra flair, I sorted the subsequent output by the second field of each line with sort -k 2, assuming a field separator of colon (-t :) and using -r to reverse the sort into descending order.
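One thing worth checking on your own data: sort compares the counts as text unless told otherwise, so a count of 12 can land below a count of 2. The sketch below is a slightly hardened variant (my suggestion, not the original command) that adds -n for a numeric sort and anchors the zero filter with $. The file names busy, quiet, and none are hypothetical.

```shell
demo=$(mktemp -d)

# Hypothetical scripts with 12, 2, and 0 lines that mention $USER.
i=0
while [ "$i" -lt 12 ]; do
    echo 'echo $USER' >> "$demo/busy"
    i=$((i + 1))
done
printf 'echo $USER\necho $USER\n' > "$demo/quiet"
echo 'echo hello' > "$demo/none"

# Textual sort: the key "2" compares greater than "12".
lexical=$(cd "$demo" && grep -sc '$USER' * | grep -v ':0$' | sort -t : -k 2 -r)

# -n compares the counts as numbers; ':0$' drops only lines that end in a
# zero count, leaving alone a file whose name happens to contain ":0".
ranked=$(cd "$demo" && grep -sc '$USER' * | grep -v ':0$' | sort -t : -k 2 -n -r)

printf 'textual:\n%s\n' "$lexical"
printf 'numeric:\n%s\n' "$ranked"

rm -rf "$demo"
```

For small counts like those in the listing above, both forms give the same order; the difference only shows once a count reaches two digits.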

Document Extracts

Suppose you wish to search a set of documents and extract a few lines of text centered on each occurrence of a keyword. This time we are interested in the matching lines and their surrounding context, but not in the filenames. Use a command something like this:

  % grep -rhiw -A4 -B4 'preferences' *.txt > research.txt
  % more research.txt

This grep command searches all files with the .txt extension for the word preferences. It requests a recursive search (-r), hides (-h) the filename in the output, matches in a case-insensitive (-i) manner, and matches preferences as a complete word but not as part of another word (-w). The -A4 and -B4 options display the four lines immediately after and before the matched line, to give the desired context. Finally, I've redirected the output to the file research.txt. Note that the shell expands *.txt before grep runs, so -r descends only into directories whose own names end in .txt; to search every subdirectory, give grep a directory to start from.
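Here is a small sketch of the context options on made-up files, using one line of context instead of four to keep the output short. It also illustrates a caveat: because the shell expands *.txt in the current directory only, files further down the tree are not reached that way. The --include option used for comparison is, as far as I know, available in both GNU and FreeBSD grep, but check man grep on your system. All file names here are hypothetical.

```shell
demo=$(mktemp -d)
mkdir "$demo/sub"
printf 'line one\nline two\nUser Preferences live here\nline four\nline five\n' > "$demo/notes.txt"
printf 'unpreferences should not match\n' > "$demo/other.txt"
printf 'my preferences, one level down\n' > "$demo/sub/deep.txt"

# One line of context on either side (-A1 -B1) of the matching line.
extract=$(cd "$demo" && grep -rhiw -A1 -B1 'preferences' *.txt)
printf '%s\n' "$extract"

# Count matching lines: the glob form sees only the top-level .txt files...
top=$(cd "$demo" && grep -rhiw 'preferences' *.txt | wc -l)
# ...while --include lets grep walk the tree itself and filter by name.
all=$(cd "$demo" && grep -rhiw --include='*.txt' 'preferences' . | wc -l)
printf 'glob: %d line(s), --include: %d line(s)\n' "$top" "$all"

rm -rf "$demo"
```

The line in other.txt is skipped because -w refuses to match preferences inside a longer word.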

You could also send the output straight to the vim text editor with:

  % grep -rhiw -A4 -B4 'preferences' *.txt | vim -
  Vim: Reading from stdin...

vim can be installed from /usr/ports/editors/vim.

Specifying vim - tells vim to read stdin (in this case the piped output from grep) instead of a file. Type :q! to exit vim.

To search files for several alternatives, use the -e option to introduce extra search patterns:

  % grep -e 'text1' -e 'text2' *
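As a quick illustration on a made-up file, the sketch below shows that a line is printed if any one of the -e patterns matches. It also shows a second use of -e that is easy to overlook: it marks the next argument as a pattern even when that pattern starts with a dash. The sample file and its contents are hypothetical.

```shell
demo=$(mktemp -d)
printf 'alpha text1\nbeta\ngamma text2\nrm -rf build\n' > "$demo/sample"

# Either pattern is enough for a line to be printed.
either=$(grep -e 'text1' -e 'text2' "$demo/sample")
printf '%s\n' "$either"

# Without -e, grep would try to read -rf as a set of options.
dash=$(grep -e '-rf' "$demo/sample")
printf '%s\n' "$dash"

rm -rf "$demo"
```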

Q. How did grep get its odd name?

A. grep was written as a standalone program to simulate a commonly performed command available in the ancient Unix editor ed. The command in question searched an entire file for lines containing a regular expression and displayed those lines. The command was g/re/p: globally search for a regular expression and print the line.

Using Regular Expressions

To search for text that is more vaguely specified, use a regular expression. grep understands both basic and extended regular expressions, though it must be invoked as either egrep or grep -E when given an extended regular expression. The text or regular expression to be matched is usually called the pattern.

Suppose you need to search for lines that end in a space or tab character. Try this command (to insert a tab, press Ctrl-V and then Ctrl-I, shown as <tab> in the example):

  % grep -n '[ <tab>]$' test-file
  2:ends in space
  3:ends in tab

I used the [...] construct to form a regular expression listing the characters to match: space and tab. The expression matches exactly one space or one tab character. $ anchors the match to the end of a line. The -n flag tells grep to include the line number in its output.

Alternatively, use:

  % grep -n '[[:blank:]]$' test-file
  2:ends in space
  3:ends in tab

Regular expressions provide many preformed character groups of the form [[:description:]]. Example groups include all control characters, all digits, or all alphanumeric characters. See man re_format for details.
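If you would like to reproduce the test file without typing a literal tab, the following sketch builds one with printf, which turns \t into a real tab character. The file contents are my own guess at what test-file might hold; only the line numbers are printed, so that invisible trailing blanks do not clutter the output.

```shell
demo=$(mktemp -d)

# printf turns \t into a real tab, so no Ctrl-V Ctrl-I is needed in a script.
printf 'no trailing blank\nends in space \nends in tab\t\n' > "$demo/test-file"

# Keep only the line numbers of lines that end in a space or tab.
lines=$(grep -n '[[:blank:]]$' "$demo/test-file" | cut -d : -f 1)
printf '%s\n' "$lines"

# [[:digit:]] is another predefined class: count the lines that contain a digit.
digits=$(printf 'abc\nroom 101\n' | grep -c '[[:digit:]]')
printf 'lines with a digit: %s\n' "$digits"

rm -rf "$demo"
```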

We can modify a previous example to search for either preferences or preference as a complete word, using an extended regular expression such as this:

  % egrep -rhiw -A4 -B4 'preferences?' *.txt > research.txt

The ? symbol specifies zero or one of the preceding character, making the s of preferences optional. Note that I use egrep because ? is available only in extended regular expressions. If you wish to search for the ? character itself, escape it with a backslash, as in \?.

An alternative method uses an expression of the form (string1|string2), which matches either one string or the other:

  % egrep -rhiw -A4 -B4 'preference(s|)' *.txt > research.txt
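A quick way to convince yourself that the two forms behave alike is to feed both the same short word list, as in this sketch. It uses grep -E rather than egrep, since the two are equivalent and some newer grep releases warn that egrep is obsolescent. The word list is made up for the demonstration.

```shell
# Three candidate words, one per line; the last should be rejected by -w.
words='preference
preferences
preferenced'

# Optional trailing s, written with ?.
optional=$(printf '%s\n' "$words" | grep -E -w 'preferences?')

# The same idea written as an alternation with an empty second branch.
grouped=$(printf '%s\n' "$words" | grep -E -w 'preference(s|)')

printf 'with ?:\n%s\n' "$optional"
printf 'with (s|):\n%s\n' "$grouped"
```

Both searches print preference and preferences, and both skip preferenced because of -w.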

As a final example, use this to seek out all bash, tcsh, or sh shell scripts:

  % egrep '^#\!/bin/(ba|tc|)sh[[:blank:]]*$' *

The caret (^) character at the start of a regular expression anchors it to the start of the line (much as $ at the end anchors it to the end). (ba|tc|) matches ba, tc, or nothing. The * character specifies zero or more of [[:blank:]], allowing trailing whitespace but nothing else. Note that the ! character must be escaped as \! to avoid shell interpretation in tcsh (but not in bash).
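The sketch below tries the pattern against four made-up scripts. It is written for sh or bash, so the ! is left unescaped, and it adds -l (not part of the original command) to list filenames instead of matching lines. The file names are hypothetical.

```shell
demo=$(mktemp -d)
printf '#!/bin/sh\necho hi\n' > "$demo/a-sh"
printf '#!/bin/bash  \n' > "$demo/b-bash"          # trailing spaces are allowed
printf '#!/bin/tcsh -f\n' > "$demo/c-tcsh-flag"    # an argument after the shell is not
printf '#!/usr/bin/perl\n' > "$demo/d-perl"

# List only the files whose first-line style matches the pattern.
scripts=$(cd "$demo" && grep -E -l '^#!/bin/(ba|tc|)sh[[:blank:]]*$' *)
printf '%s\n' "$scripts"

rm -rf "$demo"
```

Only a-sh and b-bash are listed. The tcsh script is missed because of its -f argument, which is worth keeping in mind if your own scripts pass flags on the interpreter line.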

Here's a handy tip for debugging regular expressions: if you don't pass a filename to grep, it will read standard input, allowing you to enter lines of text to see which match. grep will echo back only matching lines.
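The same idea works non-interactively. In this sketch, try_pattern is a hypothetical helper of my own that takes a pattern as its argument and reads candidate lines from standard input; the -n flag is added so you can see which of the candidates survived.

```shell
# try_pattern (hypothetical helper): pattern in $1, candidate lines on stdin.
try_pattern() {
    grep -n -E -e "$1"
}

# Feed four candidate lines through a here-document.
matched=$(try_pattern '^#!/bin/(ba|tc|)sh[[:blank:]]*$' <<'EOF'
#!/bin/sh
#!/bin/zsh
#!/bin/bash
 #!/bin/sh
EOF
)
printf '%s\n' "$matched"
```

Lines 1 and 3 come back. Line 2 names a shell the pattern does not cover, and line 4 fails the ^ anchor because of its leading space.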

Combining grep with Other Commands

grep works well with other commands. For example, to display all tcsh processes:

  % ps axww | grep -w 'tcsh'
  saruman 10329 0.0 0.2 6416 1196 p1 Ss Sat01PM 0:00.68 -tcsh (tcsh)
  saruman 11351 0.0 0.2 6416 1300 std Ss Sat07PM 0:02.54 -tcsh (tcsh)
  saruman 13360 0.0 0.0 1116    4 std R+ 10:57PM 0:00.00 grep -w tcsh
  %

Notice that the grep command itself appears in the output. To prevent this, use:

  % ps axww | grep -w '[t]csh'
  saruman 10329 0.0 0.2 6416 1196 p1 Ss Sat01PM 0:00.68 -tcsh (tcsh)
  saruman 11351 0.0 0.2 6416 1300 std Ss Sat07PM 0:02.54 -tcsh (tcsh)
  %

I'll let you figure out how this works.
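If you would like to check your answer, the sketch below replays the situation without needing a running tcsh. The two lines are shortened, simulated ps output: one real tcsh process, and the grep process as it would appear with the bracketed pattern in its argument list.

```shell
# Two simulated ps lines: a tcsh process, and grep showing its own argument.
listing='saruman 10329 -tcsh (tcsh)
saruman 13360 grep -w [t]csh'

# The pattern [t]csh means "one character from the set {t}, then csh",
# so it matches the text tcsh but not the text [t]csh.
kept=$(printf '%s\n' "$listing" | grep -w '[t]csh')
printf '%s\n' "$kept"
```

Only the first line survives, because the grep process's own command line contains the brackets, and those never appear in the text the pattern matches.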

See Also

  • man grep
  • man re_format (regular expressions)


 
 