----- Original Message -----
From: "Kevin Fleming" <kevintickle@gmail.com>
Newsgroups: comp.unix.sco.misc
To: <distro@jpr.com>
Sent: Thursday, January 26, 2006 5:08 PM
Subject: help with grep looking for cats and dogs
> Hey Everybody,
>
> I've been searching high and low for how to grep for two different
> strings at once, and I'm not sure that it can be done. Ideas?
> I've got a bunch of files and I'm searching for ones that have both the
> word "cat" and "dog" in them, not necessarily on the same line.
> I've tried using something like this:
>
> find /tmp -exec grep -q -E "cat" {} \; -print
>
> but I can't seem to do both cat and dog at the same time.
>
> find /tmp -exec grep -q -E "cat.*dog" {} \; -print
>
> this is closer, but it only works if cat and dog are on the same line.
>
> Any ideas? It's OSR507 (I don't think it matters).
> The people who read this newsgroup are so clever.
>
> Thanks,
> Kevin
find /tmp -type f |xargs fgrep -l cat |xargs fgrep -l dog
or
find /tmp -type f |xargs -n 1 awk \
'{if($0~"cat")C=1;if($0~"dog")D=1;if(C+D==2){print ARGV[1];exit}}'
explanation of the first way:
find produces a list of files (only files, thanks to -type f).
The first xargs runs fgrep as many times as necessary to process all the
files, putting as many filenames in each command as possible.
The output of the first xargs/fgrep is a list of files that contain cat.
This list goes into the second xargs, which runs fgrep as many times as
necessary, this time searching for dog.
So only the cat files get searched a second time, for dog.
The output of the second xargs/fgrep is the filenames that contain both cat
and dog.
fgrep is used instead of grep just because it's faster and works as long as
the search is for a simple string and not a regular expression.
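Here is a rough sketch of the first way on a throwaway directory, so each
stage can be seen on its own. The scratch directory and its three sample
files are made up for illustration, mktemp is assumed to be available, and
grep -F is used as the standard spelling of fgrep (some newer greps print a
warning for fgrep).

```shell
# Hypothetical scratch directory with three sample files.
demo=$(mktemp -d)
printf 'a cat\nsat here\n'         > "$demo/cat_only"
printf 'a dog\n'                   > "$demo/dog_only"
printf 'a cat\nand later\na dog\n' > "$demo/both"

# Stage 1: names of files that contain "cat" (sorted only for stable output).
stage1=$(find "$demo" -type f | xargs grep -F -l cat | sort)

# Stage 2: of those files, the ones that also contain "dog".
stage2=$(find "$demo" -type f | xargs grep -F -l cat | xargs grep -F -l dog)

echo "files with cat:"
echo "$stage1"
echo "files with cat and dog:"
echo "$stage2"
```

Note that a plain pipe into xargs like this splits names on whitespace, so
it assumes filenames without spaces or newlines.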
explanation of the second way:
A more efficient way that doesn't involve two passes through some of the
files is possible using awk instead of grep.
A more readable version of the same awk code, placed into a separate script
file:
cat myscript
#!/usr/bin/awk -f
# Runs once per input line; C and D persist across lines.
{
    if ($0~"cat") C=1
    if ($0~"dog") D=1
    if (C+D==2) {print ARGV[1] ; exit }
}
Make the script executable (chmod +x myscript) and feed it filenames one at
a time with xargs -n 1:
find /tmp -type f |xargs -n 1 ./myscript
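A minimal sketch of the script-file approach, under a few assumptions: the
scratch directory and sample files are made up, and the script is run with
awk -f rather than through its #! line, so the sketch does not depend on
where awk is installed or on the execute bit. The action is wrapped in
braces, which awk needs for statements that run on every record.

```shell
# Hypothetical scratch area: sample files in one place, the script in another.
demo=$(mktemp -d)
files="$demo/files"
mkdir "$files"
printf 'a cat\n'           > "$files/cat_only"
printf 'a dog\n'           > "$files/dog_only"
printf 'a dog\n\na cat\n'  > "$files/both"

# Write the awk program to a file; the quoted EOF stops the shell expanding $0.
script="$demo/myscript"
cat > "$script" <<'EOF'
{
    if ($0 ~ "cat") C = 1
    if ($0 ~ "dog") D = 1
    if (C + D == 2) { print ARGV[1]; exit }
}
EOF

# One awk process per file, so ARGV[1] is always the file being read
# and C and D start out unset for every file.
matches=$(find "$files" -type f | xargs -n 1 awk -f "$script")
echo "$matches"
```

Because xargs -n 1 starts a fresh awk for each file, nothing carries over
from one file to the next, which is what keeps the two flags honest.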
Awk works on records: each line of the input file causes the script to run
once.
Variables retain their values across records, so if line one has cat but no
dog, then C=1 but D is still unset, the test C+D==2 fails, and the next
record of the file is read.
If the next line has neither cat nor dog, nothing changes and the next line
is read. If the line after that has dog but no cat, then D=1, C is still 1
from before, so this time the test C+D==2 passes, the filename is printed,
and the script exits. There is no sense reading through the rest of the file.
xargs then runs the script again for the next file in the list that find
produced.
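To watch the record-by-record behaviour described above, here is a small
trace sketch. The sample file and the extra printf line are additions for
illustration only; the three if tests are the same as in the script, and the
word "match" is printed in place of the filename.

```shell
# Hypothetical four-line sample: cat on line 1, dog on line 3.
sample=$(mktemp)
printf 'one cat\nnothing here\none dog\nnever read\n' > "$sample"

trace=$(awk '{
    if ($0 ~ "cat") C = 1
    if ($0 ~ "dog") D = 1
    # Show the flags after each record; an unset variable prints as 0.
    printf "line %d: C=%d D=%d\n", NR, C, D
    if (C + D == 2) { print "match"; exit }
}' "$sample")

echo "$trace"
```

The trace should stop after line 3, since exit fires as soon as both flags
are set and the fourth line is never read.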
The first way is shorter to type and easier to look at and understand, but
the second way might be more efficient.
And by the time I hit submit, I bet one of the _real_ geniuses will have
posted some simple little egrep or perl syntax that puts this to shame.
Brian K. White --
brian@aljex.com --
http://www.aljex.com/bkw/
+++++[>+++[>+++++>+++++++<<-]<-]>>+.>.+++++.+++++++.-.[>+<---]>++.
filePro BBx Linux SCO FreeBSD #callahans Satriani Filk!