This is a discussion on OUTER JOIN performance regression remains in 8.3beta4 within the pgsql Hackers forums, part of the PostgreSQL category; --> There was a serious performance regression in OUTER JOIN planning going from 8.2.4 to 8.2.5. I know Tom came ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| There was a serious performance regression in OUTER JOIN planning going from 8.2.4 to 8.2.5. I know Tom came up with some patches to mitigate the issues in 8.2.5, but my testing shows that problems remain in 8.3beta4. The query: SELECT "CH"."caseNo", "CH"."countyNo", "CH"."chargeNo", "CH"."statuteCite", "CH"."sevClsCode", "CH"."modSevClsCode", "S"."descr" AS "sevClsCodeDescr", "M"."descr" AS "modSevClsCodeDescr", "CH"."descr", "CH"."offenseDate", "CH"."pleaCode", "PC"."descr" AS "pleaCodeDescr", "CH"."pleaDate", "CH"."chargeSeqNo", "CHST"."eventDate" AS "reopEventDate", "CTHE"."descr" AS "reopEventDescr" FROM "Charge" "CH" LEFT OUTER JOIN "SevClsCode" "S" ON ("S"."sevClsCode" = "CH"."sevClsCode") LEFT OUTER JOIN "SevClsCode" "M" ON ("M"."sevClsCode" = "CH"."modSevClsCode") LEFT OUTER JOIN "PleaCode" "PC" ON ("PC"."pleaCode" = "CH"."pleaCode") LEFT OUTER JOIN "CaseHist" "CHST" ON ( "CHST"."countyNo" = "CH"."countyNo" AND "CHST"."caseNo" = "CH"."caseNo" AND "CHST"."histSeqNo" = "CH"."reopHistSeqNo" ) LEFT OUTER JOIN "CaseTypeHistEvent" "CTHE" ON ( "CHST"."eventType" = "CTHE"."eventType" AND "CHST"."caseType" = "CTHE"."caseType" AND "CHST"."countyNo" = "CTHE"."countyNo" ) WHERE "CH"."caseNo" = '2007CM003476' AND "CH"."countyNo" = 53 ORDER BY "chargeNo", "chargeSeqNo" ; The attached EXPLAIN ANALYZE results show: (1) A run of the above under 8.3beta4. (2) A run of the above under 8.2.4. (3) A run of the above with all OUTER JOINs changed to INNER under 8.3beta4. -Kevin ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| >>> On Fri, Jan 4, 2008 at 12:16 PM, in message <477E238D.EE98.0025.0@wicourts.gov>, "Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote: > problems remain in 8.3beta4. As I poked around at this, it started to seem familiar. I had previously posted about this query's performance under 8.2.4. http://archives.postgresql.org/pgsql...0/msg00087.php To summarize, under 8.2.4 it runs in less than half a millisecond if I set enable_hashjoin = off and thousand times that long with enable_hashjoin = on. Under 8.3beta4 this has jumped to 113114 ms. -Kevin ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate |
| |||
| "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > There was a serious performance regression in OUTER JOIN planning > going from 8.2.4 to 8.2.5. I know Tom came up with some patches to > mitigate the issues in 8.2.5, but my testing shows that problems > remain in 8.3beta4. Can't do much with this without seeing the table and view definitions involved. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| |||
| "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > As I poked around at this, it started to seem familiar. I had > previously posted about this query's performance under 8.2.4. > http://archives.postgresql.org/pgsql...0/msg00087.php Well, that thread gave some of the missing details, such as Table "public.ControlRecord" Column | Type | Modifiers --------------------+------------------------+----------- countyNo | "CountyNoT" | not null but what the heck is "CountyNoT"? It looks like 8.3 is failing to realize that it can propagate the countyNo = 53 condition down to this table's scan, as 8.2 did; but there's not enough details here to guess why not. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| >>> On Fri, Jan 4, 2008 at 4:29 PM, in message <23789.1199485772@sss.pgh.pa.us>, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Can't do much with this without seeing the table and view definitions > involved. Understood. It was while I was putting that together that it struck me as familiar. They are the same as in this thread. http://archives.postgresql.org/pgsql...0/msg00087.php I didn't figure there was much point re-posting them versus referencing that thread, but I can do so if it's helpful. -Kevin ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| >>> On Fri, Jan 4, 2008 at 4:40 PM, in message <23936.1199486422@sss.pgh.pa.us>, Tom Lane <tgl@sss.pgh.pa.us> wrote: > what the heck is "CountyNoT"? bigbird=# \dD "CountyNoT" List of domains Schema | Name | Type | Modifier | Check --------+-----------+----------+----------+------- public | CountyNoT | smallint | | Should I post a list of all the domains in these tables, or is that one enough? -Kevin ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > On Fri, Jan 4, 2008 at 4:29 PM, in message <23789.1199485772@sss.pgh.pa.us>, > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Can't do much with this without seeing the table and view definitions >> involved. > Understood. It was while I was putting that together that it > struck me as familiar. They are the same as in this thread. > http://archives.postgresql.org/pgsql...0/msg00087.php I don't see anything in that thread that shows the view definition nor mentions the data types involved. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| >>> On Fri, Jan 4, 2008 at 4:46 PM, in message <24033.1199486819@sss.pgh.pa.us>, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I don't see anything in that thread that shows the view definition It was in the first post on the other thread, but for convenience: bigbird=# \d "CaseTypeHistEvent" View "public.CaseTypeHistEvent" Column | Type | Modifiers ----------------+---------------+----------- caseType | "CaseTypeT" | eventType | "EventTypeT" | descr | "EventDescrT" | isActive | boolean | isKeyEvent | boolean | isMoneyEnabled | boolean | keyEventSeqNo | integer | countyNo | "CountyNoT" | View definition: SELECT b."caseType", b."eventType", b.descr, b."isActive", CASE WHEN d."eventType" IS NOT NULL THEN d."isKeyEvent" ELSE b."isKeyEvent" END AS "isKeyEvent", CASE WHEN d."eventType" IS NOT NULL THEN d."isMoneyEnabled" ELSE b."isMoneyEnabled" END AS "isMoneyEnabled", COALESCE( CASE WHEN d."eventType" IS NOT NULL THEN d."keyEventSeqNo"::smallint ELSE b."keyEventSeqNo"::smallint END::integer, 0) AS "keyEventSeqNo", c."countyNo" FROM ONLY "CaseTypeHistEventB" b JOIN ONLY "ControlRecord" c ON 1 = 1 LEFT JOIN ONLY "CaseTypeHistEventD" d ON d."caseType"::bpchar = b."caseType"::bpchar AND d."eventType"::bpchar = b."eventType"::bpchar AND d."countyNo"::smallint = c."countyNo"::smallint; > nor mentions the data types involved. I'll pull those together and post shortly. All are domains without modifiers or checks. -Kevin ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| >>> On Fri, Jan 4, 2008 at 4:46 PM, in message <24033.1199486819@sss.pgh.pa.us>, Tom Lane <tgl@sss.pgh.pa.us> wrote: > nor mentions the data types involved. Schema | Name | Type | Modifier | Check --------+-------------------+-------------------------+----------+------- public | ArrestCaseNoT | character(12) | | public | ArrestTrackingNoT | character(14) | | public | BookCaseNoT | character(12) | | public | CalDurationT | numeric(6,2) | | public | CaseNoT | character(14) | | public | CaseTypeT | character(2) | | public | ChargeIdT | character varying(15) | | public | ChargeNoT | smallint | | public | ChargeSeqNoT | smallint | | public | ChargeStatusCodeT | character varying(2) | | public | CountyNoT | smallint | | public | CtofcNoT | character(4) | | public | DateT | date | | public | DispoCodeT | character(5) | | public | EventDescrT | character(50) | | public | EventTypeT | character(5) | | public | HistSeqNoT | smallint | | public | IssAgencyNoT | smallint | | public | JdgmtSeqNoT | smallint | | public | KeyEventSeqT | smallint | | public | MoneyT | numeric(13,2) | | public | OffenseDateRangeT | character varying(100) | | public | PdCodeT | character(2) | | public | PleaCodeT | character(4) | | public | PlntfAgencyNoT | smallint | | public | SccaCaseNoT | character varying(14) | | public | SevClsCodeT | character(2) | | public | StatuteCiteT | character(21) | | public | StatuteDescrT | character varying(100) | | public | StatuteSevSeqNoT | smallint | | public | TagTypeT | character(2) | | public | TapeCounterNoT | character(16) | | public | TapeLocT | character(18) | | public | TextT | character varying(2000) | | public | UserIdT | character(8) | | public | WcisClsCodeT | character(5) | | We're in the process of converting all of the char to varchar, but both the previous report and this one still have char as shown here. -Kevin ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate |
| ||||
| I wrote: > [ Pokes at older branches... ] Oh, that's interesting, 8.1 seems to do > the right thing already! Seems that 8.1 does the right thing for the wrong reason :-(. Just like 8.2, it falsely concludes that the f3 = f1 clause can be deleted, but it fails to get rid of every copy of it. The reason is that initsplan.c puts the clause onto the joinlist of every rel mentioned in its required_relids, but the eventual remove_join_clause_from_rels() call only takes it off of the joinlists of rels specifically mentioned in the clause. In this test case, 8.1 thinks that t2 is part of the required_relids for the upper join's clause, so f3 = f1 survives on that rel's joinlist and eventually gets applied when its required_relids are satisfied. But 8.2 has outer-join rearrangement ability, so it correctly figures that the required_relids for f3 = f1 shouldn't include t2, and that means that the remove_join_clause_from_rels() call manages to knock off every copy of the clause. The net effect of this seems to be that 8.1 will preserve a copy of a "redundant" outer-join clause if that clause appeared above any rels that it didn't explicitly mention. This is certainly not the design intention, and it probably results in clauses being sometimes uselessly kept. But it prevents the bug at hand, so I'm not inclined to touch the logic in 8.1 unless we realize there's another bug there. Not sure yet what to do to fix 8.2. The whole business of adding and then deleting join clauses was pretty ugly, and I'm happy it's gone in 8.3; but it doesn't seem very prudent to try to change that basic approach in 8.2. Somehow generate_outer_join_implications() needs to be smarter about which clauses are really safe to remove. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate |