This is a discussion on Autovacuum and OldestXmin within the pgsql Hackers forums, part of the PostgreSQL category; --> On Tue, 2007-11-27 at 09:25 +0100, mac_man2005@hotmail.it wrote: > Others optimizations, for example, can be done with the "virtual ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| On Tue, 2007-11-27 at 09:25 +0100, mac_man2005@hotmail.it wrote: > Others optimizations, for example, can be done with the "virtual > concatenation" technique: > storing a cache of couples (first_element,last_element) for each created > run. This > could be useful in case we can find 2 couples (first_element_1, > last_element_1) and > (first_element_2, last_element_2) with last_element_1 <= first_element_2. > In this case, those runs too can be seen as belonging to the same "logical > run" > (actually they are 2 RS different physical runs, or even 4 in 2WRS > but can be seen as just one by mergesort). Of course, once those 2 (or 4) > runs are > logically merged into that only one, this last one in turn could be merged > to other runs. > > What does all that imply? Mergesort would actually consider a smaller number > of runs > (since it should just work on logical runs). This means less jumps between > runs on disk. That's actually a refinement of an idea I've been working on for optimizing sort. I'll post those separately. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate |
| |||
| Any comment about Two Ways Replacement Selection (two heaps instead of just one) ? -------------------------------------------------- From: "Simon Riggs" <simon@2ndquadrant.com> Sent: Tuesday, November 27, 2007 1:03 PM To: <mac_man2005@hotmail.it> Cc: <pgsql-hackers@postgresql.org> Subject: Re: [HACKERS] Replacement Selection > On Tue, 2007-11-27 at 09:25 +0100, mac_man2005@hotmail.it wrote: > >> Others optimizations, for example, can be done with the "virtual >> concatenation" technique: >> storing a cache of couples (first_element,last_element) for each created >> run. This >> could be useful in case we can find 2 couples (first_element_1, >> last_element_1) and >> (first_element_2, last_element_2) with last_element_1 <= >> first_element_2. >> In this case, those runs too can be seen as belonging to the same >> "logical >> run" >> (actually they are 2 RS different physical runs, or even 4 in 2WRS >> but can be seen as just one by mergesort). Of course, once those 2 (or 4) >> runs are >> logically merged into that only one, this last one in turn could be >> merged >> to other runs. >> >> What does all that imply? Mergesort would actually consider a smaller >> number >> of runs >> (since it should just work on logical runs). This means less jumps >> between >> runs on disk. > > That's actually a refinement of an idea I've been working on for > optimizing sort. I'll post those separately. > > -- > Simon Riggs > 2ndQuadrant http://www.2ndQuadrant.com > > > ---------------------------(end of broadcast)--------------------------- > TIP 7: You can help support the PostgreSQL project by donating at > > http://www.postgresql.org/about/donate > ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| On Tue, 2007-11-27 at 17:49 +0100, mac_man2005@hotmail.it wrote: > Any comment about Two Ways Replacement Selection (two heaps instead of just > one) ? It might allow dynamic heap size management more easily than with a single heap. If you really think it will be better, try it. You'll learn loads, right or wrong. It's difficult to forecast ahead of time what's a good idea and what's a bad idea. The real truth of these things is that you need to pop the hood and start tinkering and its's quite hard to make a plan for that. If you have a bad idea, just move on to the next one; they're just ideas. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| > in puttuple_common(), the transition from an internal to external sort is > performed at the bottom of the TSS_INITIAL case in the main switch > statement. The transition? Do we internal sort somewhere else and then external sort here in tuplesort.c? > The function dumptuples() heapifies the in-core tuples (divides the > in-core tuples into initial runs and then advances the state to > TSS_BUILDRUNS). Cannot see where dumptuples() "advances the state to TSS_BUILDRUNS". I expected something like state->status = TSS_BUILDRUNS; executed through dumptuples() > > I recommend you run the code in the debugger on a external-sorting query: > watch two or three tuples go into the heap and you'll get the idea. > > The top of the heap is at state->memtuples[0] the heap goes down from > there. New tuples are added there and the heap is adjusted (Using the > tuplesort_heap_siftup() function). > > -Tim > ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate |
| |||
| <mac_man2005@hotmail.it> writes: >> The function dumptuples() heapifies the in-core tuples (divides the in-core >> tuples into initial runs and then advances the state to TSS_BUILDRUNS). > > Cannot see where dumptuples() "advances the state to TSS_BUILDRUNS". > I expected something like > state->status = TSS_BUILDRUNS; > executed through dumptuples() There's only one "state->status = TSS_BUILDRUNS" in the whole file. It's called by inittapes which is called in one place, just before dumptuples. Seriously, please try a bit harder before giving up. The code in this file is quite interdependent which means you'll have to read through the whole file (except perhaps the last section which just contains the interface functions to feed different types of datums or tuples) to understand any of it. But it's quite self-contained which makes it one of the easier modules in the system to get a functional grasp of. The hard part is understanding the algorithm itself and working out the details of the array management. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training! ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| I'm trying to compile PG on Ubuntu in order to hack tuplesort.c code I just downloaded and unpacked the source code and red README and INSTALL files. I'm going to ../configure --enable-debug --enable-cassert --enable-depend then I would make make install Can I improve something adding some missing option/command to the above steps? Where and how to apply the TRACE_SORT option? Any other useful options? Sorry, I'm not so expert on Linux/PostgreSQL/gcc/make etc etc. Thanks for your time. ---------------------------------------- > Date: Mon, 26 Nov 2007 11:09:54 -0800 > From: tkordas@greenplum.com > To: mac_man2005@hotmail.it > CC: pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Replacement Selection > > mac_man2005@hotmail.it wrote: >> I also implemented >> Replacement Selection (RS) so if I'm able to integrate my RS I hope I >> would be able to integrate the others too. > > The existing code implements RS. Tom asked you to describe what improvements > you hope to make; I'm confident that he already understands how to implement > RS. :-) > > ** > > Why don't you compile with TRACE_SORT enabled and watch the log output. > > The function in tuplesort.c that you should start with is puttuple_common(). > > in puttuple_common(), the transition from an internal to external sort is > performed at the bottom of the TSS_INITIAL case in the main switch > statement. The function dumptuples() heapifies the in-core tuples (divides > the in-core tuples into initial runs and then advances the state to > TSS_BUILDRUNS). All subsequent tuples will hit the TSS_BUILDRUNS case and > will insert tuples into the heap; emitting tuples for the current run as it > goes. > > I recommend you run the code in the debugger on a external-sorting query: > watch two or three tuples go into the heap and you'll get the idea. > > The top of the heap is at state->memtuples[0] the heap goes down from there. > New tuples are added there and the heap is adjusted (Using the > tuplesort_heap_siftup() function). > > -Tim __________________________________________________ _______________ Express yourself instantly with MSN Messenger! Download today it's FREE! http://messenger.msn.click-url.com/g...ave/direct/01/ ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| Manolo _ wrote: > I'm trying to compile PG on Ubuntu in order to hack tuplesort.c code > I just downloaded and unpacked the source code and red README and INSTALL files. > > I'm going to > > ./configure --enable-debug --enable-cassert --enable-depend > > then I would > > make > make install > > Can I improve something adding some missing option/command to the above steps? > Where and how to apply the TRACE_SORT option? > Any other useful options? You don't want --enable-cassert on a production machine it is a performance hit. Joshua D. Drake ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq |
| |||
| Manolo _ wrote: > > I'm trying to compile PG on Ubuntu in order to hack tuplesort.c code > I just downloaded and unpacked the source code and red README and INSTALL files. > > I'm going to > > ./configure --enable-debug --enable-cassert --enable-depend > > then I would > > make > make install > Can I improve something adding some missing option/command to the above steps? Maybe you would want to change -O2 to -O0 in CFLAGS so that debugging is easier (you will eventually need it). > Where and how to apply the TRACE_SORT option? Use pg_config_manual.h. -- Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J "Si quieres ser creativo, aprende el arte de perder el tiempo" ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate |
| |||
| Joshua D. Drake wrote: > Manolo _ wrote: >> >> ./configure --enable-debug --enable-cassert --enable-depend >> > > You don't want --enable-cassert on a production machine it is a > performance hit. > > He's clearly not setting up for production, but for development, where cassert is quite appropriate. cheers andrew ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| ||||
| Added to TODO: o Prevent autovacuum from running if an old transaction is still running from the last vacuum http://archives.postgresql.org/pgsql...1/msg00899.php --------------------------------------------------------------------------- Christopher Browne wrote: > The world rejoiced as alvherre@alvh.no-ip.org (Alvaro Herrera) wrote: > > Simon Riggs wrote: > >> I notice that slony records the oldestxmin that was running when it last > >> ran a VACUUM on its tables. This allows slony to avoid running a VACUUM > >> when it would be clearly pointless to do so. > >> > >> AFAICS autovacuum does not do this, or did I miss that? > > > > Hmm, I think it's just because nobody suggested it and I didn't came up > > with the idea. > > > > Whether it's a useful thing to do is a different matter. Why store it > > per table and not more widely? Perhaps per database would be just as > > useful; and maybe it would allow us to skip running autovac workers > > when there is no point in doing so. > > I think I need to take blame for that feature in Slony-I ;-). > > I imagine it might be useful to add it to autovac, too. I thought it > was pretty neat that this could be successfully handled by comparison > with a single value (e.g. - eldest xmin), and I expect that using a > single quasi-global value should be good enough for autovac. > > If there is some elderly, long-running transaction that isn't a > VACUUM, that will indeed inhibit VACUUM from doing any good, globally, > across the cluster, until such time as that transaction ends. > > To, at that point, "inhibit" autovac from bothering to run VACUUM, > would seem like a good move. There is still value to running ANALYZE > on tables, so it doesn't warrant stopping autovac altogether, but this > scenario suggests a case for suppressing futile vacuuming, at least... > -- > If this was helpful, <http://svcs.affero.net/rm.php?r=cbbrowne> rate me > http://linuxfinances.info/info/slony.html > It's hard to tell if someone is inconspicuous. > > ---------------------------(end of broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers |