Unix Technical Forum

Autovacuum and OldestXmin

This is a discussion on Autovacuum and OldestXmin within the pgsql Hackers forums, part of the PostgreSQL category; --> On Tue, 2007-11-27 at 09:25 +0100, mac_man2005@hotmail.it wrote: > Others optimizations, for example, can be done with the "virtual ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Hackers

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #31 (permalink)  
Old 04-15-2008, 10:34 PM
Simon Riggs
 
Posts: n/a
Default Re: Replacement Selection

On Tue, 2007-11-27 at 09:25 +0100, mac_man2005@hotmail.it wrote:

> Others optimizations, for example, can be done with the "virtual
> concatenation" technique:
> storing a cache of couples (first_element,last_element) for each created
> run. This
> could be useful in case we can find 2 couples (first_element_1,
> last_element_1) and
> (first_element_2, last_element_2) with last_element_1 <= first_element_2.
> In this case, those runs too can be seen as belonging to the same "logical
> run"
> (actually they are 2 RS different physical runs, or even 4 in 2WRS
> but can be seen as just one by mergesort). Of course, once those 2 (or 4)
> runs are
> logically merged into that only one, this last one in turn could be merged
> to other runs.
>
> What does all that imply? Mergesort would actually consider a smaller number
> of runs
> (since it should just work on logical runs). This means less jumps between
> runs on disk.


That's actually a refinement of an idea I've been working on for
optimizing sort. I'll post those separately.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #32 (permalink)  
Old 04-15-2008, 10:34 PM
mac_man2005@hotmail.it
 
Posts: n/a
Default Re: Replacement Selection

Any comment about Two Ways Replacement Selection (two heaps instead of just
one) ?


--------------------------------------------------
From: "Simon Riggs" <simon@2ndquadrant.com>
Sent: Tuesday, November 27, 2007 1:03 PM
To: <mac_man2005@hotmail.it>
Cc: <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Replacement Selection

> On Tue, 2007-11-27 at 09:25 +0100, mac_man2005@hotmail.it wrote:
>
>> Others optimizations, for example, can be done with the "virtual
>> concatenation" technique:
>> storing a cache of couples (first_element,last_element) for each created
>> run. This
>> could be useful in case we can find 2 couples (first_element_1,
>> last_element_1) and
>> (first_element_2, last_element_2) with last_element_1 <=
>> first_element_2.
>> In this case, those runs too can be seen as belonging to the same
>> "logical
>> run"
>> (actually they are 2 RS different physical runs, or even 4 in 2WRS
>> but can be seen as just one by mergesort). Of course, once those 2 (or 4)
>> runs are
>> logically merged into that only one, this last one in turn could be
>> merged
>> to other runs.
>>
>> What does all that imply? Mergesort would actually consider a smaller
>> number
>> of runs
>> (since it should just work on logical runs). This means less jumps
>> between
>> runs on disk.

>
> That's actually a refinement of an idea I've been working on for
> optimizing sort. I'll post those separately.
>
> --
> Simon Riggs
> 2ndQuadrant http://www.2ndQuadrant.com
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
>
> http://www.postgresql.org/about/donate
>


---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #33 (permalink)  
Old 04-15-2008, 10:34 PM
Simon Riggs
 
Posts: n/a
Default Re: Replacement Selection

On Tue, 2007-11-27 at 17:49 +0100, mac_man2005@hotmail.it wrote:
> Any comment about Two Ways Replacement Selection (two heaps instead of just
> one) ?


It might allow dynamic heap size management more easily than with a
single heap.

If you really think it will be better, try it. You'll learn loads, right
or wrong. It's difficult to forecast ahead of time what's a good idea
and what's a bad idea. The real truth of these things is that you need
to pop the hood and start tinkering and its's quite hard to make a plan
for that. If you have a bad idea, just move on to the next one; they're
just ideas.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #34 (permalink)  
Old 04-15-2008, 10:35 PM
mac_man2005@hotmail.it
 
Posts: n/a
Default Re: Replacement Selection


> in puttuple_common(), the transition from an internal to external sort is
> performed at the bottom of the TSS_INITIAL case in the main switch
> statement.


The transition? Do we internal sort somewhere else and then external sort
here in tuplesort.c?

> The function dumptuples() heapifies the in-core tuples (divides the
> in-core tuples into initial runs and then advances the state to
> TSS_BUILDRUNS).


Cannot see where dumptuples() "advances the state to TSS_BUILDRUNS".
I expected something like
state->status = TSS_BUILDRUNS;
executed through dumptuples()



>
> I recommend you run the code in the debugger on a external-sorting query:
> watch two or three tuples go into the heap and you'll get the idea.
>
> The top of the heap is at state->memtuples[0] the heap goes down from
> there. New tuples are added there and the heap is adjusted (Using the
> tuplesort_heap_siftup() function).
>
> -Tim
>


---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #35 (permalink)  
Old 04-15-2008, 10:35 PM
Gregory Stark
 
Posts: n/a
Default Re: Replacement Selection

<mac_man2005@hotmail.it> writes:

>> The function dumptuples() heapifies the in-core tuples (divides the in-core
>> tuples into initial runs and then advances the state to TSS_BUILDRUNS).

>
> Cannot see where dumptuples() "advances the state to TSS_BUILDRUNS".
> I expected something like
> state->status = TSS_BUILDRUNS;
> executed through dumptuples()


There's only one "state->status = TSS_BUILDRUNS" in the whole file. It's
called by inittapes which is called in one place, just before dumptuples.
Seriously, please try a bit harder before giving up.

The code in this file is quite interdependent which means you'll have to read
through the whole file (except perhaps the last section which just contains
the interface functions to feed different types of datums or tuples) to
understand any of it.

But it's quite self-contained which makes it one of the easier modules in the
system to get a functional grasp of. The hard part is understanding the
algorithm itself and working out the details of the array management.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #36 (permalink)  
Old 04-15-2008, 10:36 PM
Manolo _
 
Posts: n/a
Default Compiling PG on linux


I'm trying to compile PG on Ubuntu in order to hack tuplesort.c code
I just downloaded and unpacked the source code and red README and INSTALL files.

I'm going to

../configure --enable-debug --enable-cassert --enable-depend

then I would

make
make install

Can I improve something adding some missing option/command to the above steps?
Where and how to apply the TRACE_SORT option?
Any other useful options?

Sorry, I'm not so expert on Linux/PostgreSQL/gcc/make etc etc.

Thanks for your time.

----------------------------------------
> Date: Mon, 26 Nov 2007 11:09:54 -0800
> From: tkordas@greenplum.com
> To: mac_man2005@hotmail.it
> CC: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Replacement Selection
>
> mac_man2005@hotmail.it wrote:
>> I also implemented
>> Replacement Selection (RS) so if I'm able to integrate my RS I hope I
>> would be able to integrate the others too.

>
> The existing code implements RS. Tom asked you to describe what improvements
> you hope to make; I'm confident that he already understands how to implement
> RS. :-)
>
> **
>
> Why don't you compile with TRACE_SORT enabled and watch the log output.
>
> The function in tuplesort.c that you should start with is puttuple_common().
>
> in puttuple_common(), the transition from an internal to external sort is
> performed at the bottom of the TSS_INITIAL case in the main switch
> statement. The function dumptuples() heapifies the in-core tuples (divides
> the in-core tuples into initial runs and then advances the state to
> TSS_BUILDRUNS). All subsequent tuples will hit the TSS_BUILDRUNS case and
> will insert tuples into the heap; emitting tuples for the current run as it
> goes.
>
> I recommend you run the code in the debugger on a external-sorting query:
> watch two or three tuples go into the heap and you'll get the idea.
>
> The top of the heap is at state->memtuples[0] the heap goes down from there.
> New tuples are added there and the heap is adjusted (Using the
> tuplesort_heap_siftup() function).
>
> -Tim


__________________________________________________ _______________
Express yourself instantly with MSN Messenger! Download today it's FREE!
http://messenger.msn.click-url.com/g...ave/direct/01/
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #37 (permalink)  
Old 04-15-2008, 10:36 PM
Joshua D. Drake
 
Posts: n/a
Default Re: Compiling PG on linux

Manolo _ wrote:
> I'm trying to compile PG on Ubuntu in order to hack tuplesort.c code
> I just downloaded and unpacked the source code and red README and INSTALL files.
>
> I'm going to
>
> ./configure --enable-debug --enable-cassert --enable-depend
>
> then I would
>
> make
> make install
>
> Can I improve something adding some missing option/command to the above steps?
> Where and how to apply the TRACE_SORT option?
> Any other useful options?


You don't want --enable-cassert on a production machine it is a
performance hit.

Joshua D. Drake




---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #38 (permalink)  
Old 04-15-2008, 10:36 PM
Alvaro Herrera
 
Posts: n/a
Default Re: Compiling PG on linux

Manolo _ wrote:
>
> I'm trying to compile PG on Ubuntu in order to hack tuplesort.c code
> I just downloaded and unpacked the source code and red README and INSTALL files.
>
> I'm going to
>
> ./configure --enable-debug --enable-cassert --enable-depend
>
> then I would
>
> make
> make install


> Can I improve something adding some missing option/command to the above steps?


Maybe you would want to change -O2 to -O0 in CFLAGS so that debugging is
easier (you will eventually need it).

> Where and how to apply the TRACE_SORT option?


Use pg_config_manual.h.

--
Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J
"Si quieres ser creativo, aprende el arte de perder el tiempo"

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #39 (permalink)  
Old 04-15-2008, 10:36 PM
Andrew Dunstan
 
Posts: n/a
Default Re: Compiling PG on linux



Joshua D. Drake wrote:
> Manolo _ wrote:
>>
>> ./configure --enable-debug --enable-cassert --enable-depend
>>

>
> You don't want --enable-cassert on a production machine it is a
> performance hit.
>
>


He's clearly not setting up for production, but for development, where
cassert is quite appropriate.

cheers

andrew

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #40 (permalink)  
Old 04-15-2008, 10:46 PM
Bruce Momjian
 
Posts: n/a
Default Re: Autovacuum and OldestXmin


Added to TODO:

o Prevent autovacuum from running if an old transaction is still
running from the last vacuum

http://archives.postgresql.org/pgsql...1/msg00899.php


---------------------------------------------------------------------------

Christopher Browne wrote:
> The world rejoiced as alvherre@alvh.no-ip.org (Alvaro Herrera) wrote:
> > Simon Riggs wrote:
> >> I notice that slony records the oldestxmin that was running when it last
> >> ran a VACUUM on its tables. This allows slony to avoid running a VACUUM
> >> when it would be clearly pointless to do so.
> >>
> >> AFAICS autovacuum does not do this, or did I miss that?

> >
> > Hmm, I think it's just because nobody suggested it and I didn't came up
> > with the idea.
> >
> > Whether it's a useful thing to do is a different matter. Why store it
> > per table and not more widely? Perhaps per database would be just as
> > useful; and maybe it would allow us to skip running autovac workers
> > when there is no point in doing so.

>
> I think I need to take blame for that feature in Slony-I ;-).
>
> I imagine it might be useful to add it to autovac, too. I thought it
> was pretty neat that this could be successfully handled by comparison
> with a single value (e.g. - eldest xmin), and I expect that using a
> single quasi-global value should be good enough for autovac.
>
> If there is some elderly, long-running transaction that isn't a
> VACUUM, that will indeed inhibit VACUUM from doing any good, globally,
> across the cluster, until such time as that transaction ends.
>
> To, at that point, "inhibit" autovac from bothering to run VACUUM,
> would seem like a good move. There is still value to running ANALYZE
> on tables, so it doesn't warrant stopping autovac altogether, but this
> scenario suggests a case for suppressing futile vacuuming, at least...
> --
> If this was helpful, <http://svcs.affero.net/rm.php?r=cbbrowne> rate me
> http://linuxfinances.info/info/slony.html
> It's hard to tell if someone is inconspicuous.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq


--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 12:20 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com