Unix Technical Forum

LIKE query verses =

This is a discussion on LIKE query verses = within the Pgsql Performance forums, part of the PostgreSQL category; --> ************************************************** ************************************************** ********************* 1) EXPLAIN ANALYSE SELECT job_category.job_id,job.name,job.state,job.build_i d,cat.name as reporting_group FROM category,job_category,job,category as cat WHERE job.job_id=job_category.job_id AND job_category.category_id=category.category_id ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Performance

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-19-2008, 11:27 AM
Karthikeyan Mahadevan
 
Posts: n/a
Default LIKE query verses =

************************************************** ************************************************** *********************
1)

EXPLAIN ANALYSE SELECT
job_category.job_id,job.name,job.state,job.build_i d,cat.name as
reporting_group
FROM category,job_category,job,category as cat
WHERE job.job_id=job_category.job_id
AND job_category.category_id=category.category_id
AND cat.build_id=category.build_id
AND category.name = 'build_id.pap3260-20070828_01'
AND cat.name like ('reporting_group.Tier2%');
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..291.53 rows=8 width=103) (actual
time=98.999..385.590 rows=100 loops=1)
-> Nested Loop (cost=0.00..250.12 rows=9 width=34) (actual
time=98.854..381.106 rows=100 loops=1)
-> Nested Loop (cost=0.00..123.22 rows=1 width=34) (actual
time=98.717..380.185 rows=1 loops=1)
-> Index Scan using idx_cat_by_name on category cat
(cost=0.00..5.97 rows=1 width=34) (actual time=95.834..245.276 rows=977
loops=1)
Index Cond: (((name)::text >= 'reporting'::character
varying) AND ((name)::text < 'reportinh'::character varying))
Filter: ((name)::text ~~
'reporting_group.Tier2%'::text)
-> Index Scan using idx_cat_by_bld_id on category
(cost=0.00..117.24 rows=1 width=8) (actual time=0.134..0.134 rows=0
loops=977)
Index Cond: ("outer".build_id = category.build_id)
Filter: ((name)::text =
'build_id.pap3260-20070828_01'::text)
-> Index Scan using idx_jcat_by_cat_id on job_category
(cost=0.00..126.00 rows=71 width=8) (actual time=0.126..0.569 rows=100
loops=1)
Index Cond: (job_category.category_id =
"outer".category_id)
-> Index Scan using job_pkey on job (cost=0.00..4.59 rows=1 width=73)
(actual time=0.033..0.036 rows=1 loops=100)
Index Cond: (job.job_id = "outer".job_id)

Total runtime: 385.882 ms
------------------------------------------------------------------------------------------------------------------------------------------------------



but , if I use AND cat.name = 'reporting_group.Tier2' ;

************************************************** ************************************************** *********************
2)

EXPLAIN ANALYSE SELECT
job_category.job_id,job.name,job.state,job.build_i d,cat.name as
reporting_group
FROM category,job_category,job,category as cat
WHERE job.job_id=job_category.job_id
AND job_category.category_id=category.category_id
AND cat.build_id=category.build_id
AND category.name = 'build_id.pap3260-20070828_01'
AND cat.name = 'reporting_group.Tier2' ;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=8186.96..26124.40 rows=796 width=103) (actual
time=40.584..48.966 rows=100 loops=1)
-> Nested Loop (cost=8186.96..21776.35 rows=945 width=34) (actual
time=40.445..41.437 rows=100 loops=1)
-> Merge Join (cost=8186.96..8198.88 rows=107 width=34) (actual
time=40.290..40.303 rows=1 loops=1)
Merge Cond: ("outer".build_id = "inner".build_id)
-> Sort (cost=4093.48..4096.19 rows=1085 width=8) (actual
time=0.206..0.211 rows=3 loops=1)
Sort Key: category.build_id
-> Index Scan using idx_cat_by_name on category
(cost=0.00..4038.78 rows=1085 width=8) (actual time=0.130..0.183 rows=3
loops=1)
Index Cond: ((name)::text =
'build_id.pap3260-20070828_01'::text)
-> Sort (cost=4093.48..4096.19 rows=1085 width=34)
(actual time=37.424..38.591 rows=956 loops=1)
Sort Key: cat.build_id
-> Index Scan using idx_cat_by_name on category cat
(cost=0.00..4038.78 rows=1085 width=34) (actual time=0.076..34.328
rows=962 loops=1)
Index Cond: ((name)::text =
'reporting_group.Tier2'::text)
-> Index Scan using idx_jcat_by_cat_id on job_category
(cost=0.00..126.00 rows=71 width=8) (actual time=0.139..0.743 rows=100
loops=1)
Index Cond: (job_category.category_id =
"outer".category_id)
-> Index Scan using job_pkey on job (cost=0.00..4.59 rows=1 width=73)
(actual time=0.063..0.066 rows=1 loops=100)
Index Cond: (job.job_id = "outer".job_id)
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Total runtime: 49.453 ms

How to increase the performance of the first query ?



















Thank you !
Regards,
Karthi
-------------------------------------------------------------------
Karthikeyan Mahadevan
Java Technology Center
IBM Software Labs ,Bangalore, India.
Phone: +91 80 2504 4000 or 2509 4000 Ext: 2413
Direct : +91 80 25094413
Email : kamahade@in.ibm.com
"Doesn't expecting the unexpected make the unexpected become the expected?
"
----------------------------------------------------------------------------

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-19-2008, 11:27 AM
Mark Lewis
 
Posts: n/a
Default Re: LIKE query verses =

On Wed, 2007-08-29 at 18:01 +0530, Karthikeyan Mahadevan wrote:
>
> ************************************************** ************************************************** *********************
> 1)
>
> EXPLAIN ANALYSE SELECT
> job_category.job_id,job.name,job.state,job.build_i d,cat.name as
> reporting_group
> FROM category,job_category,job,category as cat
> WHERE job.job_id=job_category.job_id
> AND job_category.category_id=category.category_id
> AND cat.build_id=category.build_id
> AND category.name = 'build_id.pap3260-20070828_01'
> AND cat.name like ('reporting_group.Tier2%');
>
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------------------
> Nested Loop (cost=0.00..291.53 rows=8 width=103) (actual
> time=98.999..385.590 rows=100 loops=1)
> -> Nested Loop (cost=0.00..250.12 rows=9 width=34) (actual
> time=98.854..381.106 rows=100 loops=1)
> -> Nested Loop (cost=0.00..123.22 rows=1 width=34) (actual
> time=98.717..380.185 rows=1 loops=1)
> -> Index Scan using idx_cat_by_name on category cat
> (cost=0.00..5.97 rows=1 width=34) (actual time=95.834..245.276
> rows=977 loops=1)
> Index Cond: (((name)::text >=
> 'reporting'::character varying) AND ((name)::text <
> 'reportinh'::character varying))
> Filter: ((name)::text ~~
> 'reporting_group.Tier2%'::text)
> -> Index Scan using idx_cat_by_bld_id on category
> (cost=0.00..117.24 rows=1 width=8) (actual time=0.134..0.134 rows=0
> loops=977)
> Index Cond: ("outer".build_id =
> category.build_id)
> Filter: ((name)::text =
> 'build_id.pap3260-20070828_01'::text)
> -> Index Scan using idx_jcat_by_cat_id on job_category
> (cost=0.00..126.00 rows=71 width=8) (actual time=0.126..0.569
> rows=100 loops=1)
> Index Cond: (job_category.category_id =
> "outer".category_id)
> -> Index Scan using job_pkey on job (cost=0.00..4.59 rows=1
> width=73) (actual time=0.033..0.036 rows=1 loops=100)
> Index Cond: (job.job_id = "outer".job_id)
>
> Total runtime: 385.882 ms
> ------------------------------------------------------------------------------------------------------------------------------------------------------


Remember that using LIKE causes PG to interpret an underscore as 'any
character', which means that it can only scan the index for all records
that start with 'reporting', and then it needs to apply a filter to each
match. This is going to be slower than just going directly to the
matching index entry.

What you probably want to do is tell PG that you're looking for a
literal underscore and not for any matching character by escaping the
underscore, that will allow it to do a much quicker index scan.
Something like:

cat.name like 'reporting|_group.Tier2%' ESCAPE '|'

-- Mark Lewis

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-19-2008, 11:27 AM
Tom Lane
 
Posts: n/a
Default Re: LIKE query verses =

Mark Lewis <mark.lewis@mir3.com> writes:
> What you probably want to do is tell PG that you're looking for a
> literal underscore and not for any matching character by escaping the
> underscore, that will allow it to do a much quicker index scan.


The other half of the problem is that the planner is drastically
misestimating the number of matching rows --- it thinks only one
when there are really about a thousand, and this leads it to use
a nestloop that will be very inefficient with so many rows.
Try increasing the statistics target for that column. Also, if
this is a pre-8.2 PG release, consider upgrading; I believe we
improved the LIKE estimator in 8.2.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 08:32 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com