We are searching for the word "phrase" in the file rfc2119.txt.
tr turns every whitespace character into a newline, so each word ends up on its own line. grep then matches the word only where it is bounded by the start or end of the line or by a non-word character, so it is not counted when it appears inside a longer word.
File: http://www.faqs.org/rfcs/rfc2119.txt
tr '[:space:]' '\n' < rfc2119.txt | grep -E '(^|\W)phrase(\W|$)' | wc -l
Output: 5
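
To see which tokens this kind of pattern counts, here is a small sketch run on a made-up sentence instead of the RFC. The sample text and the variable names are assumptions for illustration only, and \W is a GNU grep extension, so this assumes GNU grep or a compatible one.

```shell
# Hypothetical sample text, not taken from rfc2119.txt.
sample='The phrase "MUST NOT" is a phrase; rephrase it, or use phrases.'

# Put each whitespace-separated token on its own line, then count the
# lines that contain "phrase" as a whole word (grep -c counts matching lines).
count=$(printf '%s\n' "$sample" \
  | tr -s '[:space:]' '\n' \
  | grep -cE '(^|\W)phrase(\W|$)')

echo "$count"
```

Here "phrase" and "phrase;" are counted, while "rephrase" and "phrases." are not, because a word character sits directly next to the match.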
Next, clean up the punctuation and count how many times each word occurs:
tr '[:punct:]' ' ' < rfc2119.txt | tr -s '[:space:]' '\n' | grep -v '^$' | sort | uniq -c | sort -n
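
The same word-frequency idea can be tried on a short made-up string. This is only a sketch: the sample text and variable names are assumptions, and the lowercase step is an addition that the command above does not perform (without it, "MUST" and "must" are counted separately).

```shell
# Hypothetical sample text, not taken from rfc2119.txt.
sample='MUST, must not; SHOULD. Must!'

# Replace punctuation with spaces, fold to lowercase (added step, not in
# the original command), put one word per line, drop empty lines, then
# count duplicates and sort by ascending count.
freq=$(printf '%s\n' "$sample" \
  | tr '[:punct:]' ' ' \
  | tr '[:upper:]' '[:lower:]' \
  | tr -s '[:space:]' '\n' \
  | grep -v '^$' \
  | sort | uniq -c | sort -n)

echo "$freq"
```

Because sort -n orders by the count in ascending order, the most frequent word ("must", seen 3 times in this sample) appears on the last line.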