vBulletin Search Engine Optimization
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| I'm now done with this patch and testing it. I fixed a little off-by-one in "backward scan, not inited" branch, but I was unable to test it. It seems that code is actually never used because that case is optimized to a rewind in the executor. I marked those seemingly unreachable places in the code with a comment. I didn't touch the large scan threshold of NBuffers / 4 Tom that committed as part of the buffer ring patch. IOW I removed the GUC variable from the patch. I think the jury is still out there on this one. I included a basic regression test as well. It creates a ~10MB table, which with the default 32MB shared_buffers setting is large enough that synchronized scans are used. It then runs a query with a LIMIT so that it scans ~1/2 of a table. and then runs a new seqscan and checks that it returns rows from the second half of the table. This is a bit flakey, as the needed table size depends on the large scan threshold, and we can't test the actual concurrent behavior, but it's better than nothing. 10 MB works for "make check", but isn't enough if one runs "installcheck" against an existing installation with a larger shared_buffers. I therefore only added the test case to the parallel_schedule, though it still breaks "installcheck-parallel". If we later add the GUC variable, that should be used in the test case. For the record, this patch has a small negative impact on scans like "SELECT * FROM foo LIMIT 1000". If such a scan is run repeatedly, in CVS HEAD the first 1000 rows will stay in buffer cache, but with the patch each scan will start from roughly where previous one stopped, requiring more pages to be read from disk each time. I don't think it's something to worry about in practice, but I thought I'd mention it. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| Heikki Linnakangas <heikki@enterprisedb.com> writes: > For the record, this patch has a small negative impact on scans like > "SELECT * FROM foo LIMIT 1000". If such a scan is run repeatedly, in CVS > HEAD the first 1000 rows will stay in buffer cache, but with the patch > each scan will start from roughly where previous one stopped, requiring > more pages to be read from disk each time. I don't think it's something > to worry about in practice, but I thought I'd mention it. Urgh. The answers change depending on (more or less) the phase of the moon? I've got a serious problem with that. You might look back to 1997 when GEQO very nearly got tossed out entirely because it destroyed reproducibility of query results. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| Tom Lane wrote: > Heikki Linnakangas <heikki@enterprisedb.com> writes: >> For the record, this patch has a small negative impact on scans like >> "SELECT * FROM foo LIMIT 1000". If such a scan is run repeatedly, in CVS >> HEAD the first 1000 rows will stay in buffer cache, but with the patch >> each scan will start from roughly where previous one stopped, requiring >> more pages to be read from disk each time. I don't think it's something >> to worry about in practice, but I thought I'd mention it. > > Urgh. The answers change depending on (more or less) the phase of the > moon? I've got a serious problem with that. You might look back to > 1997 when GEQO very nearly got tossed out entirely because it destroyed > reproducibility of query results. That's a very fundamental result of this patch, unfortunately. It only happens on scans on tables larger than the threshold. And because we only report the current scan location every 128KB, if you repeat the same SELECT .. LIMIT X query with no other scanners on that table, you'll get the same results as long as X is smaller than 128KB. I thought we've been through this issue already... -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| Heikki Linnakangas wrote: > Tom Lane wrote: > > Heikki Linnakangas <heikki@enterprisedb.com> writes: > >> For the record, this patch has a small negative impact on scans like > >> "SELECT * FROM foo LIMIT 1000". If such a scan is run repeatedly, in CVS > >> HEAD the first 1000 rows will stay in buffer cache, but with the patch > >> each scan will start from roughly where previous one stopped, requiring > >> more pages to be read from disk each time. I don't think it's something > >> to worry about in practice, but I thought I'd mention it. > > > > Urgh. The answers change depending on (more or less) the phase of the > > moon? I've got a serious problem with that. You might look back to > > 1997 when GEQO very nearly got tossed out entirely because it destroyed > > reproducibility of query results. > > That's a very fundamental result of this patch, unfortunately. It only > happens on scans on tables larger than the threshold. And because we > only report the current scan location every 128KB, if you repeat the > same SELECT .. LIMIT X query with no other scanners on that table, > you'll get the same results as long as X is smaller than 128KB. > > I thought we've been through this issue already... Agreed. I thought we always said that a LIMIT without an ORDER BY was meaningless, particuarly because an intervening UPDATE could have moved rows to another place in the table. In fact, at one time we considered prevening LIMIT without ORDER BY because it was meaningless, but decided if people want unstable results, they should be able to get them. An argument could be made that a LIMIT without ORDER BY on a table locked read-only should be stable. As I understand it, the problem is that while currently LIMIT without ORDER BY always starts at the beginning of the table, it will not with this patch. I consider that acceptable. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| Bruce Momjian <bruce@momjian.us> writes: > As I understand it, the problem is that while currently LIMIT without > ORDER BY always starts at the beginning of the table, it will not with > this patch. I consider that acceptable. It's definitely going to require stronger warnings than we have now about using LIMIT without ORDER BY. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: >> As I understand it, the problem is that while currently LIMIT without >> ORDER BY always starts at the beginning of the table, it will not with >> this patch. I consider that acceptable. > > It's definitely going to require stronger warnings than we have now > about using LIMIT without ORDER BY. Along the lines of NOTICE: LIMIT without ORDER BY returns an arbitrary set of matching rows perhaps? I wonder how easy it is to detect that in the planner. Or just a remark in the manual? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| On Mon, 2007-06-04 at 10:53 +0100, Heikki Linnakangas wrote: > I'm now done with this patch and testing it. > Great! > For the record, this patch has a small negative impact on scans like > "SELECT * FROM foo LIMIT 1000". If such a scan is run repeatedly, in CVS > HEAD the first 1000 rows will stay in buffer cache, but with the patch > each scan will start from roughly where previous one stopped, requiring > more pages to be read from disk each time. I don't think it's something > to worry about in practice, but I thought I'd mention it. > No surprise here, as you and Bruce have already pointed out. If we wanted to reduce the occurrence of this phenomena, we could perhaps "time out" the hints so that it's impossible to pick up a hint from a scan that finished 5 minutes ago. It doesn't seem helpful to further obscure the non-determinism of the results, however. Regards, Jeff Davis ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| Tom Lane wrote: > Heikki Linnakangas <heikki@enterprisedb.com> writes: > > For the record, this patch has a small negative impact on scans like > > "SELECT * FROM foo LIMIT 1000". If such a scan is run repeatedly, in CVS > > HEAD the first 1000 rows will stay in buffer cache, but with the patch > > each scan will start from roughly where previous one stopped, requiring > > more pages to be read from disk each time. I don't think it's something > > to worry about in practice, but I thought I'd mention it. > > Urgh. The answers change depending on (more or less) the phase of the > moon? I've got a serious problem with that. You might look back to > 1997 when GEQO very nearly got tossed out entirely because it destroyed > reproducibility of query results. What about the simple idea of just disabling the use of a sync scan when the query has LIMIT and no ORDER BY, and start always at block 0 in that case? -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq |
| |||
| Alvaro Herrera wrote: > Tom Lane wrote: >> Heikki Linnakangas <heikki@enterprisedb.com> writes: >>> For the record, this patch has a small negative impact on scans like >>> "SELECT * FROM foo LIMIT 1000". If such a scan is run repeatedly, in CVS >>> HEAD the first 1000 rows will stay in buffer cache, but with the patch >>> each scan will start from roughly where previous one stopped, requiring >>> more pages to be read from disk each time. I don't think it's something >>> to worry about in practice, but I thought I'd mention it. >> Urgh. The answers change depending on (more or less) the phase of the >> moon? I've got a serious problem with that. You might look back to >> 1997 when GEQO very nearly got tossed out entirely because it destroyed >> reproducibility of query results. > > What about the simple idea of just disabling the use of a sync scan when > the query has LIMIT and no ORDER BY, and start always at block 0 in that > case? That handles the LIMIT case, but you would still observe the different ordering. And some people do LIMIT-like behavior in client side, by opening a cursor and only fetching first n rows. I don't think anyone can reasonably expect to get the same ordering when the same query issued twice in general, but within the same transaction it wouldn't be that unreasonable. If we care about that, we could keep track of starting locations per transaction, only do the synchronization on the first scan in a transaction, and start subsequent scans from the same page as the first one. That way if you issue the same query twice in a transaction, or do something like: BEGIN; SELECT * FROM queue FOR UPDATE LIMIT 10 do stuff.. DELETE FROM queue LIMIT 10 COMMIT; you'd get the expected result. I think the warning on LIMIT without ORDER BY is a good idea, regardless of the synchronized scans patch. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| ||||
| Jeff Davis wrote: > No surprise here, as you and Bruce have already pointed out. > > If we wanted to reduce the occurrence of this phenomena, we could > perhaps "time out" the hints so that it's impossible to pick up a hint > from a scan that finished 5 minutes ago. > > It doesn't seem helpful to further obscure the non-determinism of the > results, however. I agree it's probably not a good idea to try masking this. It'll just make it harder to hit the issue in testing, so that you run into it in production. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate |