Unix Technical Forum

Improvement of procArray.xmin for VACUUM

#1
04-18-2008, 09:45 AM
Bruce Momjian

VACUUM treats a dead tuple whose xmax is less than every procArray.xmin as
invisible to all transactions, so its space is ready to be reused.
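
As a rough illustration of the rule above, here is a toy model (not PostgreSQL source). The names `oldest_xmin` and `is_reclaimable` are hypothetical, plain integers stand in for transaction IDs, and wraparound is ignored.

```python
def oldest_xmin(published_xmins):
    """Return the smallest xmin published by any backend, or None if there is none."""
    live = [x for x in published_xmins if x is not None]
    return min(live) if live else None


def is_reclaimable(tuple_xmax, published_xmins):
    """A dead tuple (its deleter has committed) is reclaimable once its xmax
    precedes every published xmin."""
    horizon = oldest_xmin(published_xmins)
    # Assumption of this toy model: with no published xmin, nothing holds the tuple back.
    return horizon is None or tuple_xmax < horizon


# Two backends publish xmin 105 and 110.
print(is_reclaimable(100, [105, 110]))  # True: deleted by xid 100, older than both
print(is_reclaimable(107, [105, 110]))  # False: the backend at xmin 105 may still need it
```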

Right now, we set procArray.xmin at transaction start to the required
SERIALIZABLE value. We currently don't update procArray.xmin when we
know we are in a READ COMMITTED transaction, even though we already call
GetSnapshotData().

This means that if a transaction that was active when our transaction
started later completes, we don't update our procArray.xmin during the
next command of our multi-statement transaction to indicate that we no
longer care about the completed transaction.

I have been thinking we could improve how quickly VACUUM can expire rows
if we update procArray.xmin more frequently for non-SERIALIZABLE
transactions.

The attached patch updates procArray.xmin in this manner. Here is an
example of how the patch improves dead row reuse:

Session #:
1                             2                               3

CREATE TABLE test(x int);
INSERT INTO test VALUES (1);
BEGIN;
DELETE FROM test;
                              BEGIN;
                              SELECT 1;
COMMIT;
                                                              VACUUM VERBOSE test;
                                                              (row not reused)
                              SELECT 1;

                              (At this point #2 doesn't see
                              the test row anymore. Patch
                              updates procArray.xmin.)

                                                              VACUUM VERBOSE test;
                                                              (row reused with patch)
                              COMMIT;
                                                              VACUUM VERBOSE test;
                                                              (normal row reuse)

What the patch does is to recompute procArray.xmin during the second
SELECT, which is then used by the next VACUUM.

The patch has debug code that prints the procArray.xmin assignments. I
am not sure I pass the right boolean in every call to
GetTransactionSnapshot(). The set_procArray_xmin parameter should be
true only when we are sure we aren't going to be resetting the session
back to an earlier snapshot.

The major effect of the patch is to set the minimum procArray.xmin to
the earliest transaction still in progress, rather than the earliest
transaction that was in progress when our transaction started.
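
The three-session example can be replayed in a small toy model. This is only a sketch under stated assumptions: the xids 100 and 101, the `Backend` class, and the helpers `vacuum_can_reuse` and `run_example` are all invented for illustration, session 3 (the VACUUM itself) is left out of the horizon, and none of this is the patch's actual code.

```python
class Backend:
    """Toy backend: its own xid plus the xmin it publishes in the proc array."""

    def __init__(self, xid, update_xmin_per_statement):
        self.xid = xid
        self.published_xmin = None
        self.update_xmin_per_statement = update_xmin_per_statement

    def take_snapshot(self, running_xids):
        """One READ COMMITTED statement: take a fresh snapshot."""
        snapshot_xmin = min(running_xids | {self.xid})
        # Unpatched behavior: publish only the first snapshot's xmin.
        # Patched behavior: republish on every statement.
        if self.published_xmin is None or self.update_xmin_per_statement:
            self.published_xmin = snapshot_xmin
        return snapshot_xmin


def vacuum_can_reuse(tuple_xmax, backends):
    """The dead row is reusable once its xmax precedes every published xmin."""
    return tuple_xmax < min(b.published_xmin for b in backends)


def run_example(patched):
    running = {100}                      # session 1: BEGIN; DELETE FROM test;  (xid 100)
    s2 = Backend(101, patched)           # session 2: BEGIN;
    running.add(101)
    s2.take_snapshot(running)            # session 2: SELECT 1;
    running.discard(100)                 # session 1: COMMIT;
    first = vacuum_can_reuse(100, [s2])  # session 3: VACUUM VERBOSE test;
    s2.take_snapshot(running)            # session 2: SELECT 1;
    second = vacuum_can_reuse(100, [s2]) # session 3: VACUUM VERBOSE test;
    return first, second


print(run_example(patched=False))  # (False, False)
print(run_example(patched=True))   # (False, True)
```

In this model the only difference between the two runs is whether the second SELECT republishes the xmin, which matches the "row reused with patch" step in the table.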

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

#2
04-18-2008, 09:45 AM
Alvaro Herrera

Bruce Momjian wrote:

> The attached patch updates procArray.xmin in this manner. Here is an
> example of how the patch improves dead row reuse:


I don't think this really works. Consider what happens if I change
session 2 this way:

> Session #:
> 1                             2                               3
>
> CREATE TABLE test(x int);
> INSERT INTO test VALUES (1);
> BEGIN;
> DELETE FROM test;
>                               BEGIN;

                                DECLARE foo CURSOR FOR
                                SELECT * FROM test;

>                               SELECT 1;
> COMMIT;
>                                                               VACUUM VERBOSE test;
>                                                               (row not reused)
>                               SELECT 1;

                                FETCH ALL FROM foo;

>
>                               (At this point #2 doesn't see
>                               the test row anymore. Patch
>                               updates procArray.xmin.)
>
>                                                               VACUUM VERBOSE test;
>                                                               (row reused with patch)
>                               COMMIT;
>                                                               VACUUM VERBOSE test;
>                                                               (normal row reuse)
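
The problem with the cursor can be shown in a toy model. This is a sketch, not PostgreSQL code: the xids 100 and 101 and the helpers `row_still_visible` and `vacuum_may_remove` are hypothetical, and visibility is reduced to a single comparison against the snapshot's xmin.

```python
def row_still_visible(tuple_xmax, snapshot_xmin):
    """Toy rule: a snapshot ignores a deletion whose deleter was still running
    when the snapshot was taken (xmax >= snapshot xmin), so it still sees the row."""
    return tuple_xmax >= snapshot_xmin


def vacuum_may_remove(tuple_xmax, published_xmin):
    """VACUUM may remove the row once its xmax precedes the published xmin."""
    return tuple_xmax < published_xmin


tuple_xmax = 100            # the row was deleted by xid 100 (session 1)
cursor_snapshot_xmin = 100  # DECLARE foo CURSOR ran while xid 100 was still in progress
new_statement_xmin = 101    # second SELECT 1, taken after xid 100 committed

# Publishing the newest snapshot's xmin lets VACUUM remove a row that the
# open cursor's older snapshot still needs for a later FETCH.
conflict = (vacuum_may_remove(tuple_xmax, new_statement_xmin)
            and row_still_visible(tuple_xmax, cursor_snapshot_xmin))
print(conflict)  # True

# Publishing the minimum over all live snapshots avoids the conflict.
safe_xmin = min(cursor_snapshot_xmin, new_statement_xmin)
print(vacuum_may_remove(tuple_xmax, safe_xmin))  # False
```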



--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#3
04-18-2008, 09:45 AM
Tom Lane

Bruce Momjian <bruce@momjian.us> writes:
> I have been thinking we could improve how quickly VACUUM can expire rows
> if we update procArray.xmin more frequently for non-SERIALIZABLE
> transactions.
> The attached patch updates procArray.xmin in this manner.


This patch is incredibly broken. Didn't you understand what I said
about how we don't track which snapshots are still alive? You can't
advance procArray.xmin past the xmin of the oldest live snapshot in the
backend, and you can't assume that there are no live snapshots at the
places where this patch assumes that. (Open cursors are one obvious
counterexample, but I think there are more.)

To make intra-transaction advancing of xmin possible, we'd need to
explicitly track all of the backend's live snapshots, probably by
introducing a "snapshot cache" manager that gives out tracked refcounts
as we do for some other structures like catcache entries. This might
have some other advantages (I think most of the current CopySnapshot
operations could be replaced by refcount increments) but it's a whole
lot more complicated and invasive than what you've got here.
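
A minimal sketch of what such refcounted tracking might look like, written as a toy Python model rather than backend C. The `Snapshot` and `SnapshotTracker` classes and their methods are hypothetical names, and `published_xmin` merely stands in for the backend's entry in the proc array (None meaning "no xmin set").

```python
class Snapshot:
    def __init__(self, xmin):
        self.xmin = xmin
        self.refcount = 0


class SnapshotTracker:
    """Hypothetical per-backend tracker of live snapshots."""

    def __init__(self):
        self.live = []
        self.published_xmin = None

    def acquire(self, snapshot):
        """Take a reference instead of copying the snapshot."""
        if snapshot.refcount == 0:
            self.live.append(snapshot)
        snapshot.refcount += 1
        self._republish()
        return snapshot

    def release(self, snapshot):
        """Drop a reference; the snapshot stops being live at refcount zero."""
        if snapshot.refcount <= 0:
            raise ValueError("snapshot released more often than acquired")
        snapshot.refcount -= 1
        if snapshot.refcount == 0:
            self.live.remove(snapshot)
        self._republish()

    def _republish(self):
        # Never advance past the xmin of the oldest live snapshot.
        self.published_xmin = min((s.xmin for s in self.live), default=None)


tracker = SnapshotTracker()
cursor_snap = tracker.acquire(Snapshot(100))     # an open cursor holds this one
statement_snap = tracker.acquire(Snapshot(101))  # a later statement's snapshot
tracker.release(statement_snap)
print(tracker.published_xmin)  # 100: the cursor's snapshot is still live
tracker.release(cursor_snap)
print(tracker.published_xmin)  # None: no live snapshots remain
```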

regards, tom lane

#4
04-18-2008, 09:45 AM
Bruce Momjian

Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > I have been thinking we could improve how quickly VACUUM can expire rows
> > if we update procArray.xmin more frequently for non-SERIALIZABLE
> > transactions.
> > The attached patch updates procArray.xmin in this manner.

>
> This patch is incredibly broken. Didn't you understand what I said
> about how we don't track which snapshots are still alive? You can't
> advance procArray.xmin past the xmin of the oldest live snapshot in the
> backend, and you can't assume that there are no live snapshots at the
> places where this patch assumes that. (Open cursors are one obvious
> counterexample, but I think there are more.)
>
> To make intra-transaction advancing of xmin possible, we'd need to
> explicitly track all of the backend's live snapshots, probably by
> introducing a "snapshot cache" manager that gives out tracked refcounts
> as we do for some other structures like catcache entries. This might
> have some other advantages (I think most of the current CopySnapshot
> operations could be replaced by refcount increments) but it's a whole
> lot more complicated and invasive than what you've got here.


I updated the patch to save the MyProc->xid at the time the first cursor
is created, and not allow the MyProc->xid to be set lower than that
saved value in the current transaction. It added only a few more lines
to the patch.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


#5
04-18-2008, 09:45 AM
Heikki Linnakangas

Bruce Momjian wrote:
> Tom Lane wrote:
>> Bruce Momjian <bruce@momjian.us> writes:
>>> I have been thinking we could improve how quickly VACUUM can expire rows
>>> if we update procArray.xmin more frequently for non-SERIALIZABLE
>>> transactions.
>>> The attached patch updates procArray.xmin in this manner.

>> This patch is incredibly broken. Didn't you understand what I said
>> about how we don't track which snapshots are still alive? You can't
>> advance procArray.xmin past the xmin of the oldest live snapshot in the
>> backend, and you can't assume that there are no live snapshots at the
>> places where this patch assumes that. (Open cursors are one obvious
>> counterexample, but I think there are more.)
>>
>> To make intra-transaction advancing of xmin possible, we'd need to
>> explicitly track all of the backend's live snapshots, probably by
>> introducing a "snapshot cache" manager that gives out tracked refcounts
>> as we do for some other structures like catcache entries. This might
>> have some other advantages (I think most of the current CopySnapshot
>> operations could be replaced by refcount increments) but it's a whole
>> lot more complicated and invasive than what you've got here.

>
> I updated the patch to save the MyProc->xid at the time the first cursor
> is created, and not allow the MyProc->xid to be set lower than that
> saved value in the current transaction. It added only a few more lines
> to the patch.


It seems to me a lot cleaner to do the reference counting like Tom
suggested. Increase the refcount on CopySnapshot, and decrease it on
FreeSnapshot. This assumes that all callers of CopySnapshot free the
snapshot with FreeSnapshot when they're done with it.

BTW: I really like the idea of doing this. It's a relatively simple
thing to do and gives some immediate benefit. And it opens the door for
more tricks to vacuum more aggressively in the presence of long-running
transactions. And it allows us to vacuum tuples that were inserted and
deleted in the same transaction even while that transaction is still
running, which helps some pathological cases where a transaction updates
a counter column many times. We could also use it
to free entries in the new combocids hash table earlier (not that it's a
problem as it is, though).

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#6
04-18-2008, 09:45 AM
Tom Lane

Heikki Linnakangas <heikki@enterprisedb.com> writes:
> It seems to me a lot cleaner to do the reference counting like Tom
> suggested. Increase the refcount on CopySnapshot, and decrease it on
> FreeSnapshot. Assuming that all callers of CopySnapshot free the
> snapshot with FreeSnapshot when they're done with it.


I don't believe we bother at the moment; which is one of the reasons
it'd be a nontrivial patch. I do think it might be worth doing though.
In the simple case where you're just issuing successive non-cursor
commands within a READ COMMITTED transaction, a refcounted
implementation would be able to recognize that there are *no* live
snapshots between commands and therefore reset MyProc->xmin to 0
whenever the backend is idle.

OTOH, do we have any evidence that this is worth bothering with at all?
I fear that the cases of long-running transactions that are problems
in the real world wouldn't be helped much --- for instance, pg_dump
wouldn't change behavior because it uses a serializable transaction.
Also, at some point a long-running transaction becomes a bottleneck
simply because its XID is itself the oldest thing visible in the
ProcArray and is determining everyone's xmin. How much daylight is
there really between "your xmin is old" and "your xid is old"?
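
The "xmin is old" versus "xid is old" question can be illustrated with a toy model. It assumes, as the message does, that the long-running transaction holds an xid; the xids 100, 205 and 206 and the helper `snapshot_xmin` are made up for the example.

```python
def snapshot_xmin(own_xid, running_xids):
    """Toy snapshot xmin: the oldest xid still running, our own included."""
    return min(running_xids | {own_xid})


long_xid = 100                    # the long-running transaction's own xid
running = {long_xid, 205, 206}

# Suppose the long-running backend keeps its own published xmin fresh
# (or even clears it). Every other backend's new snapshot still bottoms
# out at the long-running xid.
others = {xid: snapshot_xmin(xid, running) for xid in (205, 206)}
print(others)   # {205: 100, 206: 100}

# So the overall horizon is still pinned by the old xid.
horizon = min(min(others.values()), long_xid)
print(horizon)  # 100
```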

regards, tom lane

#7
04-18-2008, 09:45 AM
Bruce Momjian

Tom Lane wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
> > It seems to me a lot cleaner to do the reference counting like Tom
> > suggested. Increase the refcount on CopySnapshot, and decrease it on
> > FreeSnapshot. Assuming that all callers of CopySnapshot free the
> > snapshot with FreeSnapshot when they're done with it.

>
> I don't believe we bother at the moment; which is one of the reasons
> it'd be a nontrivial patch. I do think it might be worth doing though.
> In the simple case where you're just issuing successive non-cursor
> commands within a READ COMMITTED transaction, a refcounted
> implementation would be able to recognize that there are *no* live
> snapshots between commands and therefore reset MyProc->xmin to 0
> whenever the backend is idle.


Attached is my current version of the patch. It no longer works now that
I have tried to add reference counting for snapshots, but I will stop
here now that Tom is considering redesigning the snapshot mechanism.

> OTOH, do we have any evidence that this is worth bothering with at all?
> I fear that the cases of long-running transactions that are problems
> in the real world wouldn't be helped much --- for instance, pg_dump
> wouldn't change behavior because it uses a serializable transaction.
> Also, at some point a long-running transaction becomes a bottleneck
> simply because its XID is itself the oldest thing visible in the
> ProcArray and is determining everyone's xmin. How much daylight is
> there really between "your xmin is old" and "your xid is old"?


Well, interesting you mention that, because I have a second idea on how
to improve things. We start with MyProc->xmin equal to our own xid, and
then look for earlier transactions. It should be possible to skip
considering our own xid for MyProc->xmin. This would obviously help
VACUUM during long-running transactions. While our transaction is
running, our xid isn't committed, so VACUUM isn't going to touch any of
our rows, and if other transactions complete before a _statement_ in our
multi-statement transaction starts, we can't see deleted rows from
those transactions, so why keep the deleted rows around? Consider this
case:

Session #:
1                             2                               3
BEGIN;
SELECT 1;
                              CREATE TABLE test(x int);
                              INSERT INTO test VALUES (1);
                              DELETE FROM test;
SELECT 1;
                                                              VACUUM VERBOSE test;
                                                              (row can be reused)
COMMIT;
                                                              VACUUM VERBOSE test;
                                                              (normal row reuse)

As I understand it, in READ COMMITTED mode we have to ignore
transactions that are in progress when our _statement_ starts, but we
see everything committed before that, and we don't see dead rows created
by those committed transactions.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


#8
04-18-2008, 09:45 AM
Gregory Stark

> OTOH, do we have any evidence that this is worth bothering with at all?
> I fear that the cases of long-running transactions that are problems
> in the real world wouldn't be helped much --- for instance, pg_dump
> wouldn't change behavior because it uses a serializable transaction.

Well I think this would be the same infrastructure we would need to do the
other discussed improvement to address pg_dump's impact. That would require us
to publish the youngest xmax of the live snapshots. Vacuum could deduce that
that xid cannot possibly see any transactions between the youngest extant xmax
and the oldest in-progress transaction.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com


#9
04-18-2008, 09:45 AM
Tom Lane

Gregory Stark <stark@enterprisedb.com> writes:
> Well I think this would be the same infrastructure we would need to do
> the other discussed improvement to address pg_dump's impact. That
> would require us to publish the youngest xmax of the live
> snapshots. Vacuum could deduce that that xid cannot possibly see any
> transactions between the youngest extant xmax and the oldest
> in-progress transaction.


... and do what with the knowledge? Not remove tuples, because any such
tuples would be part of RECENTLY_DEAD update chains that that xact might
be following now or in the future.

regards, tom lane

#10
04-18-2008, 09:45 AM
Gregory Stark

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Gregory Stark <stark@enterprisedb.com> writes:
>> Well I think this would be the same infrastructure we would need to do
>> the other discussed improvement to address pg_dump's impact. That
>> would require us to publish the youngest xmax of the live
>> snapshots. Vacuum could deduce that that xid cannot possibly see any
>> transactions between the youngest extant xmax and the oldest
>> in-progress transaction.

>
> ... and do what with the knowledge? Not remove tuples, because any such
> tuples would be part of RECENTLY_DEAD update chains that that xact might
> be following now or in the future.


Well that just means it might require extra work, not that it would be
impossible.

Firstly, some tuples would not be part of a chain and could be cleaned up
anyways. Others would be part of a HOT chain which might make it easier to
clean them up.

But even for tuples that are part of a chain there may be solutions. We could
truncate the tuple to just the MVCC information so subsequent transactions can
find the head. Or we might be able to go back and edit the forward link to
skip the dead intermediate tuples (and somehow deal with the race
conditions...)
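
To make the "youngest xmax" idea concrete, here is a toy sketch. It models a snapshot as just an (xmin, xmax) pair, ignores the in-progress list between them, assumes the deleting transaction has committed, and deliberately ignores the update-chain problem raised above. The names `Snapshot`, `Row`, `dead_to` and `dead_to_all` are hypothetical.

```python
from collections import namedtuple

Snapshot = namedtuple("Snapshot", "xmin xmax")
Row = namedtuple("Row", "xmin xmax")  # creating xid and (committed) deleting xid


def dead_to(snapshot, row):
    """The row is invisible to this snapshot if the deletion is older than the
    snapshot's xmin, or the row was created at or after the snapshot's xmax."""
    deleter_seen = row.xmax < snapshot.xmin
    creator_unseen = row.xmin >= snapshot.xmax
    return deleter_seen or creator_unseen


def dead_to_all(snapshots, row):
    return all(dead_to(s, row) for s in snapshots)


pg_dump = Snapshot(xmin=100, xmax=105)  # old, long-lived snapshot
recent = Snapshot(xmin=500, xmax=510)   # everyone else
row = Row(xmin=200, xmax=300)           # created and deleted while pg_dump ran

# Classic rule: compare xmax against the oldest xmin only.
print(row.xmax < min(s.xmin for s in (pg_dump, recent)))  # False
# With each snapshot's xmax published, the row is provably invisible to both.
print(dead_to_all([pg_dump, recent], row))                # True
```

Whether such a row could actually be removed still depends on the chain-following concerns discussed in this exchange; the sketch only shows the visibility deduction.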

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

