This is a discussion on Query/database optimization within the Pgsql Performance forums, part of the PostgreSQL category; --> Hello, I have a query: explain analyze select tu.url_id, tu.url, coalesce(sd.recurse, 100), case when COALESCE(get_option('use_banner')::integer,0) = 0 then 0 ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hello, I have a query: explain analyze select tu.url_id, tu.url, coalesce(sd.recurse, 100), case when COALESCE(get_option('use_banner')::integer,0) = 0 then 0 else ts.use_banner end as use_banner, ts.use_cookies, ts.use_robots, ts.includes, ts.excludes, ts.track_domain, ts.task_id,get_available_pages(ts.task_id,ts.custo mer_id), ts.redirects from task_url tu inner join task_scheduler ts on tu.task_id=ts.task_id inner join (subscription s inner join subscription_dic sd on sd.id=s.dict_id ) on s.customer_id=ts.customer_id inner join customer c on c.customer_id=ts.customer_id AND c.active WHERE get_available_pages(ts.task_id,ts.customer_id) > 0 AND ((get_option('expired_users')::integer = 0) OR (isfinite(last_login) AND extract('day' from current_timestamp - last_login)::integer <= coalesce(get_option('expired_users')::integer,100) )) AND ((s.status is null AND ts.customer_id is null) OR s.status > 0) AND (get_check_period(ts.task_id,ts.next_check) is null OR (unix_timestamp(get_check_period(ts.task_id,ts.nex t_check)) - unix_timestamp(timenow()) < 3600)) AND ts.status <> 1 AND ((ts.start_time < current_time AND ts.stop_time > current_time) OR (ts.start_time is null AND ts.stop_time is null)) AND tu.url_id = 1 AND ts.customer_id not in (select distinct customer_id from task_scheduler where status = 1) order by ts.next_check is not null, unix_timestamp(ts.next_check) - unix_timestamp(timenow()) limit 10; which produces this query plan: Limit (cost=2874.98..2874.99 rows=2 width=88) (actual time=11800.535..11800.546 rows=3 loops=1) -> Sort (cost=2874.98..2874.99 rows=2 width=88) (actual time=11800.529..11800.532 rows=3 loops=1) Sort Key: (ts.next_check IS NOT NULL), (date_part('epoch'::text, ts.next_check) - date_part('epoch'::text, (timenow())::timestamp without time zone)) -> Nested Loop (cost=4.37..2874.97 rows=2 width=88) (actual time=10249.115..11800.486 rows=3 loops=1) -> Nested Loop (cost=4.37..2868.87 rows=2 width=55) (actual time=10247.721..11796.303 rows=3 loops=1) Join Filter: ("inner".id = "outer".dict_id) -> Nested Loop (cost=2.03..2865.13 rows=2 width=55) (actual time=10247.649..11796.142 rows=3 loops=1) Join Filter: ((("inner".status IS NULL) AND ("outer".customer_id IS NULL)) OR ("inner".status > 0)) -> Nested Loop (cost=2.03..2858.34 rows=2 width=55) (actual time=10247.583..11795.936 rows=3 loops=1) -> Seq Scan on customer c (cost=0.00..195.71 rows=231 width=4) (actual time=0.082..154.344 rows=4161 loops=1) Filter: (active AND isfinite(last_login) AND ((date_part('day'::text, (('now'::text)::timestamp(6) with time zone - (last_login)::timestamp with time zone)))::integer <= 150)) -> Index Scan using task_scheduler_icustomer_id on task_scheduler ts (cost=2.03..11.51 rows=1 width=51) (actual time=2.785..2.785 rows=0 loops=4161) Index Cond: ("outer".customer_id = ts.customer_id) Filter: ((get_available_pages(task_id, customer_id) > 0) AND ((get_check_period(task_id, next_check) IS NULL) OR ((date_part('epoch'::text, get_check_period(task_id, next_check)) - date_part('epoch'::text, (timenow())::timestamp without time zone)) < 3600::double precision)) AND (status <> 1) AND ((((start_time)::time with time zone < ('now'::text)::time(6) with time zone) AND ((stop_time)::time with time zone > ('now'::text)::time(6) with time zone)) OR ((start_time IS NULL) AND (stop_time IS NULL))) AND (NOT (hashed subplan))) SubPlan -> Unique (cost=2.02..2.03 rows=1 width=4) (actual time=0.617..0.631 rows=3 loops=1) -> Sort (cost=2.02..2.03 rows=1 width=4) (actual time=0.613..0.617 rows=3 loops=1) Sort Key: customer_id -> Index Scan using task_scheduler_istatus on task_scheduler (cost=0.00..2.01 rows=1 width=4) (actual time=0.044..0.580 rows=3 loops=1) Index Cond: (status = 1) -> Index Scan using subscription_icustomer_id on subscription s (cost=0.00..3.38 rows=1 width=12) (actual time=0.035..0.041 rows=1 loops=3) Index Cond: ("outer".customer_id = s.customer_id) -> Materialize (cost=2.34..2.65 rows=31 width=8) (actual time=0.008..0.027 rows=6 loops=3) -> Seq Scan on subscription_dic sd (cost=0.00..2.31 rows=31 width=8) (actual time=0.013..0.034 rows=6 loops=1) -> Index Scan using task_url_storage_task_id on task_url tu (cost=0.00..3.03 rows=1 width=37) (actual time=0.028..0.045 rows=1 loops=3) Index Cond: (tu.task_id = "outer".task_id) Filter: (url_id = 1) Total runtime: 11801.082 ms (28 rows) Do I need to optimize a query somehow, or it is related to database configuration? I'm running postgresql 8.0.0 on CentOS release 3.7 -- Eugene N Dzhurinsky ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| Eugeny N Dzhurinsky <bofh@redwerk.com> writes: > [slow query] The bulk of your time seems to be going into this indexscan: > -> Index Scan using task_scheduler_icustomer_id on task_scheduler ts (cost=2.03..11.51 rows=1 width=51) (actual time=2.785..2.785 rows=0 loops=4161) > Index Cond: ("outer".customer_id = ts.customer_id) > Filter: ((get_available_pages(task_id, customer_id) > 0) AND ((get_check_period(task_id, next_check) IS NULL) OR ((date_part('epoch'::text, get_check_period(task_id, next_check)) - date_part('epoch'::text, (timenow())::timestamp without time zone)) < 3600::double precision)) AND (status <> 1) AND ((((start_time)::time with time zone < ('now'::text)::time(6) with time zone) AND ((stop_time)::time with time zone > ('now'::text)::time(6) with time zone)) OR ((start_time IS NULL) AND (stop_time IS NULL))) AND (NOT (hashed subplan))) > SubPlan > -> Unique (cost=2.02..2.03 rows=1 width=4) (actual time=0.617..0.631 rows=3 loops=1) > ... I kinda doubt that the index search itself is that slow --- doubtless the problem comes from having to evaluate that filter condition on a lot of rows. How fast are those functions you're calling? regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| ||||
| On Tue, Aug 01, 2006 at 11:15:11PM -0400, Tom Lane wrote: > Eugeny N Dzhurinsky <bofh@redwerk.com> writes: > > [slow query] > The bulk of your time seems to be going into this indexscan: > > -> Index Scan using task_scheduler_icustomer_id on task_scheduler ts (cost=2.03..11.51 rows=1 width=51) (actual time=2.785..2.785 rows=0 loops=4161) > > Index Cond: ("outer".customer_id = ts.customer_id) > > Filter: ((get_available_pages(task_id, customer_id) > 0) AND ((get_check_period(task_id, next_check) IS NULL) OR ((date_part('epoch'::text, get_check_period(task_id, next_check)) - date_part('epoch'::text, (timenow())::timestamp without time zone)) < 3600::double precision)) AND (status <> 1) AND ((((start_time)::time with time zone < ('now'::text)::time(6) with time zone) AND ((stop_time)::time with time zone > ('now'::text)::time(6) with time zone)) OR ((start_time IS NULL) AND (stop_time IS NULL))) AND (NOT (hashed subplan))) > > SubPlan > > -> Unique (cost=2.02..2.03 rows=1 width=4) (actual time=0.617..0.631 rows=3 loops=1) > > ... > > I kinda doubt that the index search itself is that slow --- doubtless > the problem comes from having to evaluate that filter condition on a lot > of rows. How fast are those functions you're calling? Well, not really fast, especially get_available_pages there is special table with history of changes, and there is a view for latest changes per task, and this function selects all records from a view for given ID, then calculates sum of pages of tasks and then calculates number of available pages as number of allowed pages deduct number of processed pages. probably there is bottleneck in this view selection? -- Eugene N Dzhurinsky ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |