Unix Technical Forum

Synchronized scans

This is a discussion on Synchronized scans within the Pgsql Patches forums, part of the PostgreSQL category; --> Heikki Linnakangas <heikki@enterprisedb.com> writes: > I don't think anyone can reasonably expect to get the same ordering when > ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Patches

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #11 (permalink)  
Old 04-18-2008, 10:11 AM
Tom Lane
 
Posts: n/a
Default Re: Synchronized scans

Heikki Linnakangas <heikki@enterprisedb.com> writes:
> I don't think anyone can reasonably expect to get the same ordering when
> the same query issued twice in general, but within the same transaction
> it wouldn't be that unreasonable. If we care about that, we could keep
> track of starting locations per transaction, only do the synchronization
> on the first scan in a transaction, and start subsequent scans from the
> same page as the first one.


I think the real problem here is that the first scan is leaving state
behind that changes the behavior of the next scan. Which can have no
positive benefit, since obviously the first scan is not still
proceeding; the best you can hope for is that it's a no-op and the worst
case is that it actively pessimizes things. Why doesn't the patch
remove the shmem entry at scan termination?

> I think the warning on LIMIT without ORDER BY is a good idea, regardless
> of the synchronized scans patch.


I seriously doubt that can be done in any way that doesn't both warn
about perfectly-safe cases and fail to warn about other unsafe ones.
Furthermore, it's not uncommon for people to do "SELECT * ... LIMIT 1"
just to remind themselves of column names or whatever. Do we really
want the thing to be so nannyish? I was envisioning simply a stronger
warning in the SELECT reference page ...

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #12 (permalink)  
Old 04-18-2008, 10:11 AM
Heikki Linnakangas
 
Posts: n/a
Default Re: Synchronized scans

Tom Lane wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
>> I don't think anyone can reasonably expect to get the same ordering when
>> the same query issued twice in general, but within the same transaction
>> it wouldn't be that unreasonable. If we care about that, we could keep
>> track of starting locations per transaction, only do the synchronization
>> on the first scan in a transaction, and start subsequent scans from the
>> same page as the first one.

>
> I think the real problem here is that the first scan is leaving state
> behind that changes the behavior of the next scan. Which can have no
> positive benefit, since obviously the first scan is not still
> proceeding; the best you can hope for is that it's a no-op and the worst
> case is that it actively pessimizes things. Why doesn't the patch
> remove the shmem entry at scan termination?


Because there's no reason why it should, and it would require a lot more
bookkeeping. There can be many scanners on the same table, so we'd need
to implement some kind of reference counting, which means having to
reliably decrement the counter when a scan terminates.

In any case if there actually is a concurrent scan, you'd still see
different ordering. Removing the entry when a scan is over would just
make it harder to trigger.

>> I think the warning on LIMIT without ORDER BY is a good idea, regardless
>> of the synchronized scans patch.

>
> I seriously doubt that can be done in any way that doesn't both warn
> about perfectly-safe cases and fail to warn about other unsafe ones.
> Furthermore, it's not uncommon for people to do "SELECT * ... LIMIT 1"
> just to remind themselves of column names or whatever. Do we really
> want the thing to be so nannyish?


It really depends on how many false negatives and positives it gives. If
too many, it's just annoying, but if it's reasonably accurate I think it
would be OK to remind people running queries like that.

> I was envisioning simply a stronger warning in the SELECT reference page ...


I doubt the people that would be bitten by this read the SELECT
reference page.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #13 (permalink)  
Old 04-18-2008, 10:11 AM
Michael Glaesemann
 
Posts: n/a
Default Re: Synchronized scans


On Jun 4, 2007, at 15:24 , Heikki Linnakangas wrote:

> I don't think anyone can reasonably expect to get the same ordering
> when the same query issued twice in general, but within the same
> transaction it wouldn't be that unreasonable.


The order rows are returned without an ORDER BY clause *is*
implementation dependent, and is not guaranteed, at least by the
spec. Granted, LIMIT without ORDER BY (and DISTINCT for that matter)
brings this into sharp relief.

> I think the warning on LIMIT without ORDER BY is a good idea,
> regardless of the synchronized scans patch.


I'm not saying this isn't a good idea, but are there other places
where there might be gotchas for the unwary, such as DISTINCT without
ORDER BY or (for an unrelated example) UNION versus UNION ALL? How
many of these types of messages would be useful?

Michael Glaesemann
grzm seespotcode net



---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #14 (permalink)  
Old 04-18-2008, 10:12 AM
Jeff Davis
 
Posts: n/a
Default Re: Synchronized scans

On Mon, 2007-06-04 at 16:42 -0400, Tom Lane wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
> > I don't think anyone can reasonably expect to get the same ordering when
> > the same query issued twice in general, but within the same transaction
> > it wouldn't be that unreasonable. If we care about that, we could keep
> > track of starting locations per transaction, only do the synchronization
> > on the first scan in a transaction, and start subsequent scans from the
> > same page as the first one.

>
> I think the real problem here is that the first scan is leaving state
> behind that changes the behavior of the next scan. Which can have no
> positive benefit, since obviously the first scan is not still
> proceeding; the best you can hope for is that it's a no-op and the worst
> case is that it actively pessimizes things. Why doesn't the patch
> remove the shmem entry at scan termination?
>


Sounds like a reasonable idea to me. We could add the PID to the data
structure so that it would only remove the hint if it's the one that set
the hint. I think we'd just need to call a function to do that from
heap_endscan(), correct?

However, we couldn't 100% guarantee that the state would be cleared. A
backend could be killed in the middle of a scan.

The case you're worried about seems very narrow to me, and I think
"actively pessimizes" is too strong. Unless I misunderstand, "the best
you can hope for" no-op happens in all cases except a most bizarre one:
that in which you're executing repeated identical LIMIT queries with no
ORDER BY; and the tuples returned occupy more than 128K (16 pages is the
reporting period) but fewer would be effective to cache; and the table
in question is larger than the large table threshold. I'm just trying to
add some perspective about what we're fixing, here.

But it's fair to say that a scan should clear any state when it's done.

Regards,
Jeff Davis




---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #15 (permalink)  
Old 04-18-2008, 10:12 AM
Heikki Linnakangas
 
Posts: n/a
Default Re: Synchronized scans

Michael Glaesemann wrote:
>> I think the warning on LIMIT without ORDER BY is a good idea,
>> regardless of the synchronized scans patch.

>
> I'm not saying this isn't a good idea, but are there other places where
> there might be gotchas for the unwary, such as DISTINCT without ORDER BY
> or (for an unrelated example) UNION versus UNION ALL? How many of these
> types of messages would be useful?


LIMIT without ORDER BY is worse because it not only returns tuples in
different order, but it can return different tuples altogether when you
run it multiple times.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #16 (permalink)  
Old 04-18-2008, 10:12 AM
Jeff Davis
 
Posts: n/a
Default Re: Synchronized scans

On Mon, 2007-06-04 at 22:09 +0100, Heikki Linnakangas wrote:
> > I think the real problem here is that the first scan is leaving state
> > behind that changes the behavior of the next scan. Which can have no
> > positive benefit, since obviously the first scan is not still
> > proceeding; the best you can hope for is that it's a no-op and the worst
> > case is that it actively pessimizes things. Why doesn't the patch
> > remove the shmem entry at scan termination?

>
> Because there's no reason why it should, and it would require a lot more
> bookkeeping. There can be many scanners on the same table, so we'd need
> to implement some kind of reference counting, which means having to
> reliably decrement the counter when a scan terminates.
>


That's what I thought at first, and why I didn't do it. Right now I'm
thinking we could just add the PID to the hint, so that it would only
remove its own hint. Would that work?

It's still vulnerable to a backend being killed and the hint hanging
around. However, the next scan would clear get it back to normal.
Reference counting would cause weirdness if a backend died or something,
because it would never decrement to 0.

> In any case if there actually is a concurrent scan, you'd still see
> different ordering. Removing the entry when a scan is over would just
> make it harder to trigger.
>


Agreed. I don't know for sure whether that's good or bad, but it would
make the nondeterminism less immediately visible.

Regards,
Jeff Davis


---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #17 (permalink)  
Old 04-18-2008, 10:12 AM
Heikki Linnakangas
 
Posts: n/a
Default Re: Synchronized scans

Jeff Davis wrote:
> On Mon, 2007-06-04 at 22:09 +0100, Heikki Linnakangas wrote:
>>> I think the real problem here is that the first scan is leaving state
>>> behind that changes the behavior of the next scan. Which can have no
>>> positive benefit, since obviously the first scan is not still
>>> proceeding; the best you can hope for is that it's a no-op and the worst
>>> case is that it actively pessimizes things. Why doesn't the patch
>>> remove the shmem entry at scan termination?

>> Because there's no reason why it should, and it would require a lot more
>> bookkeeping. There can be many scanners on the same table, so we'd need
>> to implement some kind of reference counting, which means having to
>> reliably decrement the counter when a scan terminates.
>>

>
> That's what I thought at first, and why I didn't do it. Right now I'm
> thinking we could just add the PID to the hint, so that it would only
> remove its own hint. Would that work?


Were you thinking of storing the PID of the backend that originally
created the hint, or updating the PID every time the hint is updated? In
any case, we still wouldn't know if there's other scanners still running.

We could just always remove the hint when a scan ends, and rely on the
fact that if there's other scans still running they will put the hint
back very quickly. There would then be a small window where there's no
hint but a scan is active, and a new scan starting during that window
would fail to synchronize with the other scanners.

> It's still vulnerable to a backend being killed and the hint hanging
> around. However, the next scan would clear get it back to normal.


Oh, did you mean that the PID would be updated whenever a new scan
starts? So that the PID stored would always be the PID of the latest
scanner. That might work pretty well, though a small scan with a LIMIT,
or any other situation where some scans run faster than others, might
clear the hint prematurely while other scans are still running.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #18 (permalink)  
Old 04-18-2008, 10:12 AM
Jeff Davis
 
Posts: n/a
Default Re: Synchronized scans

On Mon, 2007-06-04 at 22:57 +0100, Heikki Linnakangas wrote:
> > That's what I thought at first, and why I didn't do it. Right now I'm
> > thinking we could just add the PID to the hint, so that it would only
> > remove its own hint. Would that work?

>
> Were you thinking of storing the PID of the backend that originally
> created the hint, or updating the PID every time the hint is updated? In
> any case, we still wouldn't know if there's other scanners still running.
>


My thought was that every time the location was reported by a backend,
it would store 3 pieces of information, not 2:
* relfilenode
* the PID of the backend that created or updated this particular hint
last
* the location

Then, on heap_endscan() (if that's the right place), we find the hint,
and if the PID matches, we remove it. If not, it does nothing.

This would only matter when there weren't other scans. When concurrent
scans were happening, chances are the PID wouldn't match anyway, and
thus not be removed.

Regards,
Jeff Davis




---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #19 (permalink)  
Old 04-18-2008, 10:12 AM
Michael Glaesemann
 
Posts: n/a
Default Re: Synchronized scans


On Jun 4, 2007, at 16:34 , Heikki Linnakangas wrote:

> LIMIT without ORDER BY is worse because it not only returns tuples
> in different order, but it can return different tuples altogether
> when you run it multiple times.


Wouldn't DISTINCT ON suffer from the same issue without ORDER BY?

Michael Glaesemann
grzm seespotcode net



---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #20 (permalink)  
Old 04-18-2008, 10:12 AM
Tom Lane
 
Posts: n/a
Default Re: Synchronized scans

Jeff Davis <pgsql@j-davis.com> writes:
> My thought was that every time the location was reported by a backend,
> it would store 3 pieces of information, not 2:
> * relfilenode
> * the PID of the backend that created or updated this particular hint
> last
> * the location


> Then, on heap_endscan() (if that's the right place), we find the hint,
> and if the PID matches, we remove it. If not, it does nothing.


> This would only matter when there weren't other scans. When concurrent
> scans were happening, chances are the PID wouldn't match anyway, and
> thus not be removed.


But note that barring backend crash, once all the scans are done it is
guaranteed that the hint will be removed --- somebody will be last to
update the hint, and therefore will remove it when they do heap_endscan,
even if others are not quite done. This is good in the sense that
later-starting backends won't be fooled into starting at what is
guaranteed to be the most pessimal spot, but it's got a downside too,
which is that there will be windows where seqscans are in process but
a newly started scan won't see them. Maybe that's a killer objection.

When exactly is the hint updated? I gathered from something Heikki said
that it's set after processing X amount of data, but I think it might be
better to set it *before* processing X amount of data. That is, the
hint means "I'm going to be scanning at least <threshold> blocks
starting here", not "I have scanned <threshold> blocks ending here",
which seems like the interpretation that's being used at the moment.
What that would mean is that successive "LIMIT 1000" calls would in fact
all start at the same place, barring interference from other backends.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 04:29 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com