Unix Technical Forum

#1
02-15-2008, 05:44 PM
Kevin Fleming
help with grep looking for cats and dogs

Hey Everybody,

I've been searching high and low for how to grep for two different
strings at once, and I'm not sure that it can be done. Ideas?
I've got a bunch of files and I'm searching for ones that have both the
word "cat" and "dog" in them, not necessarily on the same line.
I've tried using something like this:

find /tmp -exec grep -q -E "cat" {} \; -print

but I can't seem to do both cat and dog at the same time.

find /tmp -exec grep -q -E "cat.*dog" {} \; -print

this is closer, but it only works if cat and dog are on the same line.

Any ideas? It's OSR507 (I don't think it matters).
The people who read this newsgroup are so clever.

Thanks,
Kevin

#2
02-15-2008, 05:44 PM
Bill Campbell
Re: help with grep looking for cats and dogs

On Thu, Jan 26, 2006, Kevin Fleming wrote:
>Hey Everybody,
>
>I've been searching high and low for how to grep for two different
>strings at once, and I'm not sure that it can be done. Ideas?
>I've got a bunch of files and I'm searching for ones that have both the
>word "cat" and "dog" in them, not necessarily on the same line.
>I've tried using something like this:


You want egrep, not grep:

egrep 'pattern1|pattern2' ...

If you want files with both these patterns, then things are more
complicated, and I'm not sure that there's a program in the grep family
that will do it in one pass. I would probably do it this way.

find ... | xargs grep -l 'pattern1' > /tmp/list1
xargs grep -l 'pattern2' < /tmp/list1 > /tmp/listfinal

Bill
--
INTERNET: bill@Celestial.COM Bill Campbell; Celestial Software LLC
URL: http://www.celestial.com/ PO Box 820; 6641 E. Mercer Way
FAX: (206) 232-9186 Mercer Island, WA 98040-0820; (206) 236-1676

``Virtually everything is under federal control nowadays except the
federal budget.''
-- Herman E. Talmadge, 1975
#3
02-15-2008, 05:44 PM
Bela Lubkin
Re: help with grep looking for cats and dogs

Kevin Fleming wrote:

> I've been searching high and low for how to grep for two different
> strings at once, and I'm not sure that it can be done. Ideas?
> I've got a bunch of files and I'm searching for ones that have both the
> word "cat" and "dog" in them, not necessarily on the same line.
> I've tried using something like this:
>
> find /tmp -exec grep -q -E "cat" {} \; -print
>
> but I can't seem to do both cat and dog at the same time.
>
> find /tmp -exec grep -q -E "cat.\dog" {} \; -print
>
> this is closer, but it only works if cat and dog are on the same line.


Start by learning how to search a single file for multiple strings; get
rid of the `find` part of this equation.

You are using `grep -E`, which is the newfangled name for `egrep`.
Either will work and I'm going to use `egrep` here because I think it
shows the differences more clearly. I'll search /etc/termcap and I'll
use "cat" and "man" because both of those strings appear in
/etc/termcap.

So. Regular `grep` (and `egrep` and `fgrep`) will search for multiple
expressions given as multiple lines in the search string:

$ grep 'cat
man' /etc/termcap

`egrep` adds alternation:

$ egrep 'cat|man' /etc/termcap

This should produce the same results, but (in the case of OSR5)
alternation is a lot slower. That's due to an ancient library bug which
was never fixed in OSR5. I don't know if OSR6 is better. Because of
this bug, I always use the separate-lines syntax for simple alternation.
For complex alternation: "(cat|man).*bites.*(rat|dog)", I use the '|'
syntax and live with the lame performance.
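
For anyone who wants to check that the three spellings select the same lines, here is a minimal sketch. The sample file and its contents are invented for the demo, `mktemp -d` is assumed to be available (it may not be on OSR5), and `grep -E` stands in for `egrep`. It says nothing about the speed difference, only about the matches.

```shell
# Demo only: the sample file and its contents are invented for illustration.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf '%s\n' 'the cat sat' 'a man walked' 'nothing here' 'cat and man' > "$sample"

# One pattern per line inside the quoted string (plain grep).
by_newline=$(grep 'cat
man' "$sample")

# The same pattern list given as separate -e options.
by_dash_e=$(grep -e cat -e man "$sample")

# Extended-regex alternation (grep -E is the newer name for egrep).
by_alternation=$(grep -E 'cat|man' "$sample")

# All three variables should hold the same lines.
printf '%s\n' "$by_newline"
rm -rf "$tmpdir"
```

Every form matches a line if any one of the patterns appears on it, so the "nothing here" line is the only one left out.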

Putting `find` back in the mix:

$ find /tmp -exec grep -l 'cat
dog' {} \;

`grep -l` means "print only the names of matching files". This should
have the same effect you were trying to get with `grep -q` followed by
"-print", but seems more direct to me.

>Bela<

#4
02-15-2008, 05:44 PM
Brian K. White
Re: help with grep looking for cats and dogs


----- Original Message -----
From: "Kevin Fleming" <kevintickle@gmail.com>
Newsgroups: comp.unix.sco.misc
To: <distro@jpr.com>
Sent: Thursday, January 26, 2006 5:08 PM
Subject: help with grep looking for cats and dogs


> Hey Everybody,
>
> I've been searching high and low for how to grep for two different
> strings at once, and I'm not sure that it can be done. Ideas?
> I've got a bunch of files and I'm searching for ones that have both the
> word "cat" and "dog" in them, not necessarily on the same line.
> I've tried using something like this:
>
> find /tmp -exec grep -q -E "cat" {} \; -print
>
> but I can't seem to do both cat and dog at the same time.
>
> find /tmp -exec grep -q -E "cat.\dog" {} \; -print
>
> this is closer, but it only works if cat and dog are on the same line.
>
> Any ideas? It's OSR507 (I don't think it matters).
> The people who read this newsgroup are so clever.
>
> Thanks,
> Kevin


find /tmp -type f |xargs fgrep -l cat |xargs fgrep -l dog

or

find /tmp -type f |xargs -n 1 awk \
'{if($0~"cat")C=1;if($0~"dog")D=1;if(C+D==2){print ARGV[1];exit}}'


explanation of the first way:
find produces a list of files (only regular files, thanks to -type f).
The first xargs runs fgrep as many times as necessary to process all the
files, putting as many filenames in each command as possible.
The output of the first xargs/fgrep is a list of files that contain cat.
This list goes into the second xargs, which runs fgrep as many times as
necessary, searching for dog,
so all the cat files get searched a second time for dog.
The output of the second xargs/fgrep is the filenames that contain both cat
and dog.
fgrep is used instead of grep just because it's faster, and it works as long
as the search is for a simple string and not a regular expression.
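
A small way to try the two-stage pipeline on throwaway data. This is only a sketch: the directory and file names are made up, `mktemp -d` is assumed to exist, and `grep -F` is used as the POSIX spelling of `fgrep`.

```shell
# Demo only: file names and contents are invented for illustration.
tmpdir=$(mktemp -d)
printf 'cat\nbird\ndog\n' > "$tmpdir/both.txt"
printf 'cat\n' > "$tmpdir/cat_only.txt"
printf 'dog\n' > "$tmpdir/dog_only.txt"

# Stage 1 lists the files containing "cat";
# stage 2 keeps only those that also contain "dog".
matches=$(find "$tmpdir" -type f | xargs grep -F -l cat | xargs grep -F -l dog)

# Only the path ending in both.txt should be printed.
printf '%s\n' "$matches"
rm -rf "$tmpdir"
```

One caveat: plain xargs splits its input on whitespace, so file names containing spaces or newlines will confuse this pipeline.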

explanation of the second way:
A more efficient way that doesn't involve two passes through some of the
files is possible using awk instead of grep.
A more readable version of the same awk code, placed into a separate script
file:

cat myscript
#!/usr/bin/awk -f
if ($0~"cat") C=1
if ($0~"dog") D=1
if (C+D==2) {print ARGV[1] ; exit }

And you feed that filenames one at a time with xargs -n 1
find /tmp -type f |xargs -n 1 myscript

Awk works on records: each line in the input file causes the script to run
once.
Variables retain their values across records, so if line one has cat but no
dog, then C=1 but D is still unset, so the C+D==2 test fails and the next
record of the file is read.
If the next line has neither cat nor dog, nothing changes and the next line is
read. If the line after that has dog but no cat, then D=1, C is still 1 from
before, and so this time the C+D==2 test passes, the filename is printed and
the script exits. No sense reading through the rest of the file.
xargs then runs the script again for the next file in the list that find
produced.
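
To watch the flags accumulate and the early exit happen, here is a sketch that prints the line number at which awk stops instead of the filename. The `scan` helper, the sample files and `mktemp -d` are assumptions made for the demo, and the awk body is wrapped in braces so it runs as an action on every record.

```shell
# Demo only: file names and contents are invented for illustration.
tmpdir=$(mktemp -d)
printf 'cat\nbird\ndog\nfish\n' > "$tmpdir/both.txt"
printf 'cat\nbird\n' > "$tmpdir/cat_only.txt"

# scan FILE: report the record number at which both words have been seen.
scan() {
    awk '{
        if ($0 ~ "cat") C = 1
        if ($0 ~ "dog") D = 1
        # Both flags set: report where we are and stop reading this file.
        if (C + D == 2) { print "both seen by line " NR; exit }
    }' "$1"
}

trace_both=$(scan "$tmpdir/both.txt")          # cat on line 1, dog on line 3
trace_cat_only=$(scan "$tmpdir/cat_only.txt")  # never sees dog, prints nothing

printf '%s\n' "$trace_both"
rm -rf "$tmpdir"
```

In the first file the fourth line is never read, which is the saving described above.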

The first way is shorter to type and easier to look at and understand, but
the second way might be more efficient.

And by now after I hit submit I bet one of the _real_ geniuses will have
posted some simple little egrep or perl syntax that puts this to shame

Brian K. White -- brian@aljex.com -- http://www.aljex.com/bkw/
+++++[>+++[>+++++>+++++++<<-]<-]>>+.>.+++++.+++++++.-.[>+<---]>++.
filePro BBx Linux SCO FreeBSD #callahans Satriani Filk!

#5
02-15-2008, 05:44 PM
Brian K. White
Re: help with grep looking for cats and dogs


----- Original Message -----
From: "Bela Lubkin" <filbo@armory.com>
Newsgroups: comp.unix.sco.misc
To: <distro@jpr.com>
Sent: Thursday, January 26, 2006 6:07 PM
Subject: Re: help with grep looking for cats and dogs


> Kevin Fleming wrote:
>
>> I've been searching high and low for how to grep for two different
>> strings at once, and I'm not sure that it can be done. Ideas?
>> I've got a bunch of files and I'm searching for ones that have both the
>> word "cat" and "dog" in them, not necessarily on the same line.
>> I've tried using something like this:
>>
>> find /tmp -exec grep -q -E "cat" {} \; -print
>>
>> but I can't seem to do both cat and dog at the same time.
>>
>> find /tmp -exec grep -q -E "cat.\dog" {} \; -print
>>
>> this is closer, but it only works if cat and dog are on the same line.

>
> Start by learning how to search a single file for multiple strings; get
> rid of the `find` part of this equation.
>
> You are using `grep -E`, which is the newfangled name for `egrep`.
> Either will work and I'm going to use `egrep` here because I think it
> shows the differences more clearly. I'll search /etc/termcap and I'll
> use "cat" and "man" because both of those strings appear in
> /etc/termcap.
>
> So. Regular `grep` (and `egrep` and `fgrep`) will search for multiple
> expressions given as multiple lines in the search string:
>
> $ grep 'cat
> man' /etc/termcap
>
> `egrep` adds alternation:
>
> $ egrep 'cat|man' /etc/termcap
>
> This should produce the same results, but (in the case of OSR5)
> alternation is a lot slower. That's due to an ancient library bug which
> was never fixed in OSR5. I don't know if OSR6 is better. Because of
> this bug, I always use the separate-lines syntax for simple alternation.
> For complex alternation: "(cat|man).*bites.*(rat|dog)", I use the '|'
> syntax and live with the lame performance.
>
> Putting `find` back in the mix:
>
> $ find /tmp -exec grep -l 'cat
> dog' {} \;
>
> `grep -l` means "print only the names of matching files". This should
> have the same effect you were trying to get with `grep -q` followed by
> "-print", but seems more direct to me.
>
>>Bela<


*sigh* I predicted this would happen.
(see my response to this post, in case you get this out of order and it
doesn't make sense)


#6
02-15-2008, 05:44 PM
Bela Lubkin
Re: help with grep looking for cats and dogs

Brian K. White wrote:

> From: "Kevin Fleming" <kevintickle@gmail.com>


> > I've been searching high and low for how to grep for two different
> > strings at once, and I'm not sure that it can be done. Ideas?
> > I've got a bunch of files and I'm searching for ones that have both the
> > word "cat" and "dog" in them, not necessarily on the same line.
> > I've tried using something like this:
> >
> > find /tmp -exec grep -q -E "cat" {} \; -print
> >
> > but I can't seem to do both cat and dog at the same time.
> >
> > find /tmp -exec grep -q -E "cat.\dog" {} \; -print
> >
> > this is closer, but it only works if cat and dog are on the same line.


> find /tmp -type f |xargs fgrep -l cat |xargs fgrep -l dog
>
> or
>
> find /tmp -type f |xargs -n 1 awk
> '{if($0~"cat")C=1;if($0~"dog")D=1;if(C+D==2){print ARGV[1];exit}}'


Whoops... my response would only find files with both words on the same
line. It is not my usual habit to misread things like that. Oh well.

Both of your ways should work. I think `awk` will process this
equivalent code slightly more efficiently:

awk '/cat/ { C = 1 }
/dog/ { D = 1 }
C + D == 1 { print ARGV[1]; exit }'

> A more readable version of the same awk code, placed into a seperate script
> file
>
> cat myscript
> #!/usr/bin/awk -f
> if ($0~"cat") C=1
> if ($0~"dog") D=1
> if (C+D==2) {print ARGV[1] ; exit }


Almost ... you need to put braces around all the code:

#!/usr/bin/awk -f
{
if ($0~"cat") C=1
if ($0~"dog") D=1
if (C+D==2) {print ARGV[1] ; exit }
}

>Bela<

#7
02-15-2008, 05:44 PM
Bob Bailin
Re: help with grep looking for cats and dogs


"Brian K. White" <brian@aljex.com> wrote in message
news:00df01c622d2$954fcf70$951fa8c0@venti...
>
> ----- Original Message -----
> From: "Bela Lubkin" <filbo@armory.com>
> Newsgroups: comp.unix.sco.misc
> To: <distro@jpr.com>
> Sent: Thursday, January 26, 2006 6:07 PM
> Subject: Re: help with grep looking for cats and dogs
>
>
> > Kevin Fleming wrote:
> >
> >> I've been searching high and low for how to grep for two different
> >> strings at once, and I'm not sure that it can be done. Ideas?
> >> I've got a bunch of files and I'm searching for ones that have both the
> >> word "cat" and "dog" in them, not necessarily on the same line.
> >> I've tried using something like this:
> >>
> >> find /tmp -exec grep -q -E "cat" {} \; -print
> >>
> >> but I can't seem to do both cat and dog at the same time.
> >>
> >> find /tmp -exec grep -q -E "cat.\dog" {} \; -print
> >>
> >> this is closer, but it only works if cat and dog are on the same line.

> >
> > Start by learning how to search a single file for multiple strings; get
> > rid of the `find` part of this equation.
> >
> > You are using `grep -E`, which is the newfangled name for `egrep`.
> > Either will work and I'm going to use `egrep` here because I think it
> > shows the differences more clearly. I'll search /etc/termcap and I'll
> > use "cat" and "man" because both of those strings appear in
> > /etc/termcap.
> >
> > So. Regular `grep` (and `egrep` and `fgrep`) will search for multiple
> > expressions given as multiple lines in the search string:
> >
> > $ grep 'cat
> > man' /etc/termcap
> >
> > `egrep` adds alternation:
> >
> > $ egrep 'cat|man' /etc/termcap
> >
> > This should produce the same results, but (in the case of OSR5)
> > alternation is a lot slower. That's due to an ancient library bug which
> > was never fixed in OSR5. I don't know if OSR6 is better. Because of
> > this bug, I always use the separate-lines syntax for simple alternation.
> > For complex alternation: "(cat|man).*bites.*(rat|dog)", I use the '|'
> > syntax and live with the lame performance.
> >
> > Putting `find` back in the mix:
> >
> > $ find /tmp -exec grep -l 'cat
> > dog' {} \;
> >
> > `grep -l` means "print only the names of matching files". This should
> > have the same effect you were trying to get with `grep -q` followed by
> > "-print", but seems more direct to me.
> >
> >>Bela<

>
> *sigh* I predicted this would happen.
> (see my response to this post, in case you get this out of order and it
> doesn't make sense)
>


Actually, Bela's solution gives a list of all files that contain "cat OR
dog".
The original poster was looking for a list of all files that contain "cat
AND dog",
(anywhere in the file) and your solution (given in the other post) is
correct.
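
The OR-versus-AND difference is easy to see on three small files. This is a sketch with invented file names; `mktemp -d` is assumed, and `-e cat -e dog` is used as an equivalent of the two-line pattern string.

```shell
# Demo only: file names and contents are invented for illustration.
tmpdir=$(mktemp -d)
printf 'cat\ndog\n' > "$tmpdir/both.txt"
printf 'cat\n' > "$tmpdir/cat_only.txt"
printf 'dog\n' > "$tmpdir/dog_only.txt"

# Pattern list: a file is named if ANY pattern matches some line (cat OR dog).
either=$(cd "$tmpdir" && grep -l -e cat -e dog *.txt | sort | paste -s -d ' ' -)

# Chained greps: a file is named only if it passes BOTH filters (cat AND dog).
both=$(cd "$tmpdir" && grep -l cat *.txt | xargs grep -l dog | sort | paste -s -d ' ' -)

echo "OR:  $either"
echo "AND: $both"
rm -rf "$tmpdir"
```

The pattern-list form names all three files, while the chained form names only both.txt.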

Bob


#8
02-15-2008, 05:44 PM
John DuBois
Re: help with grep looking for cats and dogs

In article <200601261711.aa12334@deepthought.armory.com>,
Bela Lubkin <filbo@armory.com> wrote:
>> From: "Kevin Fleming" <kevintickle@gmail.com>
>> > I've been searching high and low for how to grep for two different
>> > strings at once, and I'm not sure that it can be done. Ideas?
>> > I've got a bunch of files and I'm searching for ones that have both the
>> > word "cat" and "dog" in them, not necessarily on the same line.

>
>Both of your ways should work. I think `awk` will process this
>equivalent code slightly more efficiently:
>
> awk '/cat/ { C = 1 }
> /dog/ { D = 1 }
> C + D == 1 { print ARGV[1]; exit }'

^ the "1" in "C + D == 1" should be 2

I say:

find /tmp -type f |xargs awk '
FNR == 1 { c = d = p = 0 }
/cat/ { c = 1 }
/dog/ { d = 1 }
c && d && !p { print FILENAME; p = 1 }'

John
--
John DuBois spcecdt@armory.com KC6QKZ/AE http://www.armory.com/~spcecdt/
#9
02-15-2008, 05:44 PM
jdanskinner
Re: help with grep looking for cats and dogs



Simple and crude but easy:
grep -l dog `grep -l cat /tmp/*`
or
grep -l dog /tmp/* | xargs grep -l cat
or if you must find
find /tmp -name "*" -exec grep -l dog {} \; | xargs grep -l cat
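
One thing to watch with the backtick form: if no file contains cat, the inner grep prints nothing, the outer grep is left with no file operands, and it reads standard input instead. A guarded sketch follows; the `find_both` helper, the sample files and `mktemp -d` are assumptions for the demo, not something from the posts above.

```shell
# Demo only: file names and contents are invented for illustration.
tmpdir=$(mktemp -d)
printf 'dog\n' > "$tmpdir/dog_only.txt"
printf 'cat\ndog\n' > "$tmpdir/both.txt"

# find_both DIR: list files in DIR that contain both "cat" and "dog".
find_both() {
    cat_files=$(grep -l cat "$1"/* 2>/dev/null)
    # Guard: with an empty list, grep would have no file operands and read stdin.
    [ -n "$cat_files" ] || return 0
    # Unquoted on purpose so the list splits into separate arguments
    # (this breaks on file names that contain whitespace).
    grep -l dog $cat_files
}

with_match=$(find_both "$tmpdir")
rm "$tmpdir/both.txt"
without_match=$(find_both "$tmpdir")   # no cat files left: prints nothing

printf '%s\n' "$with_match"
rm -rf "$tmpdir"
```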

Regards...Dan.

#10
02-15-2008, 05:44 PM
Kevin Fleming
Re: help with grep looking for cats and dogs


Thanks to everyone for their help on this. I'm still learning how
scripts and the syntax work...
Brian's suggestion:
find /tmp -type f |xargs fgrep -l cat |xargs fgrep -l dog
was the simplest for me to understand, even if it wasn't the most
efficient. I promise to spend some time learning about shell scripts,
and I appreciate the explanations with the examples.

Thanks again,
Kevin
