vBulletin Search Engine Optimization
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi List; I've just started working with a new client and they have amoung other issues with their databases a particular update that basically locks out users. The below query was running for over 6 hours this morning and the CPU load had climbed to a point where new connections simply hung waiting to connect. We had to kill the query to allow business users to connect to the applications that connect to the database, thus I could not post an explain analyze. In any case the query looks like this: update dat_customer_mailbox_counts set total_contacts = contacts.ct, total_contact_users = contacts.dct from ( select customer_id, count(*) as ct, count( distinct con.user_id ) as dct from dat_user_contacts con group by customer_id ) contacts where contacts.customer_id = dat_customer_mailbox_counts.customer_id Here's the latest counts from the system catalogs: dat_customer_mailbox_counts: 423 dat_user_contacts 59,469,476 And here's an explain plan: QUERY PLAN ---------------------------------------------------------------------------------------------------- Merge Join (cost=17118858.51..17727442.30 rows=155 width=90) Merge Cond: ("outer".customer_id = "inner".customer_id) -> GroupAggregate (cost=17118772.93..17727347.34 rows=155 width=8) -> Sort (cost=17118772.93..17270915.95 rows=60857208 width=8) Sort Key: con.customer_id -> Seq Scan on dat_user_contacts con (cost=0.00..7332483.08 rows=60857208 width=8) -> Sort (cost=85.57..88.14 rows=1026 width=74) Sort Key: dat_customer_mailbox_counts.customer_id -> Seq Scan on dat_customer_mailbox_counts (cost=0.00..34.26 rows=1026 width=74) (9 rows) Any thoughts, comments, Ideas for debugging, etc would be way helpful... Thanks in advance. /Kevin ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate |
| ||||
| Kevin Kempter <kevin@kevinkempterllc.com> writes: > Merge Join (cost=17118858.51..17727442.30 rows=155 width=90) > Merge Cond: ("outer".customer_id = "inner".customer_id) > -> GroupAggregate (cost=17118772.93..17727347.34 rows=155 width=8) > -> Sort (cost=17118772.93..17270915.95 rows=60857208 width=8) > Sort Key: con.customer_id > -> Seq Scan on dat_user_contacts con (cost=0.00..7332483.08 > rows=60857208 width=8) > -> Sort (cost=85.57..88.14 rows=1026 width=74) > Sort Key: dat_customer_mailbox_counts.customer_id > -> Seq Scan on dat_customer_mailbox_counts (cost=0.00..34.26 > rows=1026 width=74) The planner, at least, thinks that all the time will go into the sort step. Sorting 60M rows is gonna take awhile :-(. What PG version is this? (8.2 has noticeably faster sort code than prior releases...) What have you got work_mem set to? Bad as the sort is, I suspect that the real problem is the count(distinct) operator, which is going to require *another* sort-and-uniq step for each customer_id group --- and judging by the rowcount estimates, at least some of those groups must be pretty large. (AFAIR this time is not counted in the planner estimates.) Again, work_mem would have an effect on how fast that goes. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |