This is a discussion on Statistics visibility in SERIALIZABLE transactions within the pgsql Hackers forums, part of the PostgreSQL category; --> Updates to planner statistics appear to be visible in SERIALIZABLE transactions even though updated data is not. Is this ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Updates to planner statistics appear to be visible in SERIALIZABLE transactions even though updated data is not. Is this intentional? Could that adversely affect query plans? CREATE TABLE test (x integer); INSERT INTO test (x) SELECT i % 2 FROM generate_series(1, 100) AS g(i); ANALYZE test; Transaction 1: BEGIN ISOLATION LEVEL SERIALIZABLE; EXPLAIN ANALYZE SELECT * FROM test WHERE x = 1; QUERY PLAN ------------------------------------------------------------------------------------------------ Seq Scan on test (cost=0.00..2.25 rows=50 width=4) (actual time=0.201..0.787 rows=50 loops=1) Filter: (x = 1) Total runtime: 1.169 ms (3 rows) Transaction 2: BEGIN; DELETE FROM test WHERE x = 1; ANALYZE test; COMMIT; Transaction 1: EXPLAIN ANALYZE SELECT * FROM test WHERE x = 1; QUERY PLAN ----------------------------------------------------------------------------------------------- Seq Scan on test (cost=0.00..1.62 rows=1 width=4) (actual time=0.499..1.090 rows=50 loops=1) Filter: (x = 1) Total runtime: 1.476 ms (3 rows) In Transaction 1's second query the planner uses an updated row count estimate even though the old rows are still visible. I think I understand why statistics like the total relation size and total number of tuples would help the planner, but is there a reason for distribution statistics to be visible for data that itself isn't visible? Thanks. -- Michael Fuhr ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| Michael Fuhr <mike@fuhr.org> writes: > Updates to planner statistics appear to be visible in SERIALIZABLE > transactions even though updated data is not. Is this intentional? This is because the planner uses the catcache to read pg_statistic, and that runs on SnapshotNow rules. I am not excited about it: as a general rule you want to be using the latest stats available. It is much more likely that the stats existing at the start of your transaction are out-of-date than that they are a better representation of the data in your snapshot than stats committed a bit later. It's an approximation either way... (Actually, in general catalog changes apply immediately even in SERIALIZABLE mode. Try looking at a table that was ALTERed since your transaction began.) regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| ||||
| Josh Berkus wrote: > Tom, > > >> Try looking at a table that was ALTERed since >> your transaction began.) >> > > Shouldn't the transaction fail, then? > > Neil Conway and I opened this VERY BIG can of worms a couple of years ago on hackers. I can't find the references to it in the archives. I think I need to go back more than the start of 2005. And I can't remember how to do that. It was in relation to CLUSTER on a table and how previous transactions would not use the correct view of the table. I recall something to do with issues relating to reading the catalog before transaction startup and problems with needing a transaction to use the catalog. My memory of it is fuzzy. If Neil or Tom remember the discussion they may be able to post a link. Sorry if this wasn't much help. Russell Smith ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq |
| Thread Tools | |
| Display Modes | |
|
|