Site Administration Page 2 - Dealing with Files and Filesystems
You may not know where its odd name originated, but you can't argue with the usefulness of grep.

Have you ever needed to find a particular file and thought, "I don't recall the filename, but I remember some of its contents"? The oddly named grep command does just that, searching inside files and reporting on those that contain a given piece of text.

Finding Text

Suppose you wish to search your shell scripts for the text $USER. Try this:

    % grep -s '$USER' *

In this example, grep has searched through all files in the current directory, displaying each line that contained the text $USER. Use single quotes around the text to prevent the shell from interpreting special characters. The -s option suppresses error messages when grep encounters a directory.

Perhaps you only want to know the name of each file containing the text $USER. Use the -l option to create that list for you:

    % grep -ls '$USER' *
    add-user
    bu-user
    del-user
    mount-host
    mount-user

Searching by Relevance

What if you're more concerned about how many times a particular string occurs within a file? That's known as a relevance search. Use a command similar to:

    % grep -sc '$USER' * | grep -v ':0' | sort -k 2 -t : -r

How does this magic work? The -c flag lists each file with a count of matching lines, but it unfortunately includes files with zero matches. To counter this, I piped the output from grep into a second grep, this time searching for ':0' and using a second option, -v, to keep only the lines that do not match. For a little extra flair, I sorted the subsequent output by the second field of each line with sort -k 2, assuming a field separator of colon (-t :) and using -r to reverse the order so that the highest counts come first. If any count can reach two digits, add -n to sort, which otherwise compares the counts as text.

Document Extracts

Suppose you wish to search a set of documents and extract a few lines of text centered on each occurrence of a keyword. This time we are interested in the matching lines and their surrounding context, but not in the filenames.

Use a command something like this:

    % grep -rhiw -A4 -B4 'preferences' *.txt > research.txt

This grep command searches all files with the .txt extension for the word preferences. It performs a recursive search (-r), hides (-h) the filename in the output, matches in a case-insensitive (-i) manner, and matches preferences as a complete word but not as part of another word (-w). Note that the shell expands *.txt against the current directory only, so -r descends only into directories whose names match that pattern. The -A4 and -B4 options display the four lines immediately after and before the matched line, to give the desired context. Finally, I've redirected the output to the file research.txt.

You could also send the output straight to the vim text editor with:

    % grep -rhiw -A4 -B4 'preferences' *.txt | vim -

Specifying vim - tells vim to read stdin (in this case the piped output from grep) instead of a file. Type :q! to exit vim.

To search files for several alternatives, use the -e option to introduce extra search patterns:

    % grep -e 'text1' -e 'text2' *
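
The commands covered so far can be tried without touching real data. The following is a minimal sketch, assuming a POSIX shell and a grep that supports -A and -B (GNU and BSD grep both do); the scratch file names and their contents are made up for the demonstration. The relevance pipeline is a slight variation on the one shown earlier: it anchors the zero with ':0$' and adds -n so that sort compares the counts numerically.

```shell
#!/bin/sh
# Scratch directory with made-up files; nothing here comes from a real system.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf '%s\n' 'echo $USER' > add-user
printf '%s\n' 'echo $USER' 'id $USER' 'echo bye $USER' > del-user
printf '%s\n' 'echo nothing relevant' > mount-host
printf '%s\n' 'intro' 'Set your Preferences here' 'outro' > notes.txt

# Relevance search: count matching lines per file, drop the zero counts,
# then sort on the count (field 2, colon-separated), highest first.
ranked=$(grep -sc '$USER' * | grep -v ':0$' | sort -t : -k 2 -rn)

# Document extract: whole-word, case-insensitive match with one line of
# context on either side and no filename prefix.
context=$(grep -hiw -A1 -B1 'preferences' *.txt)

# Several alternatives: list the files containing either pattern.
either=$(grep -l -e 'Preferences' -e 'nothing' *)

printf '%s\n' "$ranked" "$context" "$either"

# Clean up the scratch directory.
cd / && rm -rf "$tmpdir"
```

Capturing each result in a variable keeps the three searches separate, which makes it easy to compare them against what you expected before pointing the same commands at real files.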

Using Regular Expressions

To search for text that is more vaguely specified, use a regular expression. grep understands both basic and extended regular expressions, though it must be invoked as either egrep or grep -E when given an extended regular expression. The text or regular expression to be matched is usually called the pattern.

Suppose you need to search for lines that end in a space or tab character. Try this command (to insert a tab, press Ctrl-V and then Ctrl-I, shown as <tab> in the example):

    % grep -n '[ <tab>]$' test-file

I used the [...] construct to form a regular expression listing the characters to match: space and tab. The expression matches exactly one space or one tab character. $ anchors the match to the end of a line. The -n flag tells grep to include the line number in its output.

Alternatively, use:

    % grep -n '[[:blank:]]$' test-file

Regular expressions provide many preformed character groups of the form [[:description:]]. Example groups include all control characters, all digits, or all alphanumeric characters. See man re_format for details.

We can modify a previous example to search for either preferences or preference as a complete word, using an extended regular expression such as this:

    % egrep -rhiw -A4 -B4 'preferences?' *.txt > research.txt

The ? symbol specifies zero or one of the preceding character, making the s of preferences optional. Note that I use egrep because ? is available only in extended regular expressions. If you wish to search for the ? character itself, escape it with a backslash, as in \?.
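
The trailing-whitespace search and the optional s are easy to check on a throwaway file. This is a minimal sketch, assuming a POSIX shell; the temporary file and its contents are made up, printf writes the tab so no Ctrl-V keystroke is needed, and grep -E stands in for egrep.

```shell
#!/bin/sh
# Scratch file with made-up contents for the demonstration.
tmpfile=$(mktemp)

# Lines 2 and 3 end in a space and a tab; printf turns \t into a real tab.
printf 'clean line\ntrailing space \ntrailing tab\t\n' > "$tmpfile"

# -n prefixes each match with its line number; cut keeps only that number.
trailing=$(grep -n '[[:blank:]]$' "$tmpfile" | cut -d : -f 1)

# Extended syntax via -E: the ? makes the final s optional, and -w rejects
# "preferenced" because the match would end in the middle of a word.
printf 'Preference\npreferences\npreferenced\n' > "$tmpfile"
words=$(grep -Eiw 'preferences?' "$tmpfile")

printf '%s\n' "$trailing" "$words"
rm -f "$tmpfile"
```

The first search should report lines 2 and 3, and the second should print Preference and preferences but not preferenced.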

An alternative method uses an expression of the form (string1|string2), which matches either one string or the other:

    % egrep -rhiw -A4 -B4 'preference(s|)' *.txt > research.txt

As a final example, use this to seek out all bash, tcsh, or sh shell scripts:

    % egrep '^#\!/bin/(ba|tc|)sh[[:blank:]]*$' *

The caret (^) character at the start of a regular expression anchors it to the start of the line (much as $ at the end anchors it to the end). (ba|tc|) matches ba, tc, or nothing. The * character specifies zero or more of [[:blank:]], allowing trailing whitespace but nothing else. Note that the ! character must be escaped as \! to avoid shell interpretation in tcsh (but not in bash).
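
To see which interpreter lines the final pattern accepts, it helps to run it over a few sample scripts. The sketch below assumes a POSIX shell and uses made-up file names. It also departs from the article in two labeled ways: the ! is not escaped, because a non-interactive sh script performs no history expansion, and the optional prefix is written as (ba|tc)?, since POSIX leaves an empty alternative such as (ba|tc|) undefined even though many grep implementations accept it.

```shell
#!/bin/sh
# Scratch directory with made-up script files for the demonstration.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf '#!/bin/sh\necho hi\n' > plain
printf '#!/bin/bash  \necho hi\n' > bashy
printf '#!/bin/tcsh -f\necho hi\n' > tcshy
printf '#!/usr/bin/perl\nprint 1;\n' > perly

# ^ and $ anchor the whole line; [[:blank:]]* tolerates trailing spaces
# or tabs but nothing else, so "tcsh -f" is rejected.
scripts=$(grep -lE '^#!/bin/(ba|tc)?sh[[:blank:]]*$' *)

printf '%s\n' "$scripts"
cd / && rm -rf "$tmpdir"
```

Only bashy and plain should be listed: tcshy carries an argument after the interpreter name, and perly names a different interpreter altogether.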

Combining grep with Other Commands

grep works well with other commands. For example, to display all tcsh processes:

    % ps axww | grep -w 'tcsh'

Notice that the grep command itself appears in the output. To prevent this, use:

    % ps axww | grep -w '[t]csh'

I'll let you figure out how this works.
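
For readers who would like to check their answer: the bracket expression [t] matches the single character t, so '[t]csh' still matches tcsh, but the grep process's own command line now contains the text [t]csh, which does not include the string tcsh. The sketch below assumes a POSIX shell and substitutes two made-up ps lines for real ps output so that the result is repeatable.

```shell
#!/bin/sh
# Made-up ps lines stand in for real output so the result is repeatable.
shell_line='  101  p0  Ss   0:00.10 -tcsh (tcsh)'

# With a plain pattern, grep's own command line contains "tcsh" and matches.
plain=$(printf '%s\n' "$shell_line" '  202  p0  R+   0:00.00 grep -w tcsh' |
  grep -w 'tcsh')

# With [t]csh, grep's command line reads "[t]csh", which the pattern skips.
bracket=$(printf '%s\n' "$shell_line" '  203  p0  R+   0:00.00 grep -w [t]csh' |
  grep -w '[t]csh')

printf '%s\n' "$plain" "$bracket"
```

The first search returns both simulated lines, while the second returns only the tcsh process.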