This is a discussion on Re: Which qsort is used within the pgsql Hackers forums, part of the PostgreSQL category; --> On Mon, 12 Dec 2005, Luke Lonergan wrote: > > Might you have time to implement these within the ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| On Mon, 12 Dec 2005, Luke Lonergan wrote: > > Might you have time to implement these within the testing framework I > published previously? It has both the NetBSD and qsortG included along with > a timing routine, etc. > Here we go: http://www.cs.toronto.edu/~zhouqq/po...sort/sort.html The source tar ball and linux 2.4G gcc 2.96 test results is on the page. There is a clear loser glibc, not sure qsortB or qsortG which is better. Regards, Qingqing ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster |
| |||
| Qingqing, On 12/13/05 10:28 AM, "Qingqing Zhou" <zhouqq@cs.toronto.edu> wrote: > http://www.cs.toronto.edu/~zhouqq/po...sort/sort.html > > The source tar ball and linux 2.4G gcc 2.96 test results is on the page. > There is a clear loser glibc, not sure qsortB or qsortG which is better. Great stuff - thanks for doing this. From the results, it's clear that the scale test makes a huge difference in the relative performance. I'm wondering if it's an L2 cache effect, as it seems to occur in that range. Overall - I'd say that the BSD routine is showing the best overall results when the scale test is included. The qsortG routine has some significantly better performance in certain cases at smaller sort set sizes - it could probably be improved for better L2 use, but BSD is already there. Based on this it seems like we should expose the option to choose the BSD qsort routine at configure time. - Luke ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| On Wed, 14 Dec 2005, Luke Lonergan wrote: > > Overall - I'd say that the BSD routine is showing the best overall results > when the scale test is included. The qsortG routine has some significantly > better performance in certain cases at smaller sort set sizes - it could > probably be improved for better L2 use, but BSD is already there. > > Based on this it seems like we should expose the option to choose the BSD > qsort routine at configure time. > Before we pin down this, I hope more extensive tests on various platforms could be done. So we could give some suggestions when we should enable the "--enable-bsdqsort" option. I can post a result on a SunOS machine (but the problem is that many ppl share this machine) and a windows machine. Regards, Qingqing ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster |
| |||
| On Thu, Dec 15, 2005 at 12:10:37AM -0500, Qingqing Zhou wrote: > > On Wed, 14 Dec 2005, Luke Lonergan wrote: > > > > Overall - I'd say that the BSD routine is showing the best overall results > > when the scale test is included. The qsortG routine has some significantly > > better performance in certain cases at smaller sort set sizes - it could > > probably be improved for better L2 use, but BSD is already there. > > > > Based on this it seems like we should expose the option to choose the BSD > > qsort routine at configure time. > > > > Before we pin down this, I hope more extensive tests on various platforms > could be done. So we could give some suggestions when we should enable the > "--enable-bsdqsort" option. I can post a result on a SunOS machine (but > the problem is that many ppl share this machine) and a windows machine. I have access to both some (SLOW) ultra5's and a machine running opensolaris on AMD if testing there would help. I'll need a pointer to a patch and test-case though... Oh, I also have access to an old SGI... -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| On Thu, 15 Dec 2005, Jim C. Nasby wrote: > > I have access to both some (SLOW) ultra5's and a machine running > opensolaris on AMD if testing there would help. I'll need a pointer to a > patch and test-case though... > Thanks! I've patched the program with the following changes: (1) add gcc-mingw support; (2) move the check_sort() out of do_sort() - the previous program is actually measuring the time of qsort plus result verification. Though that one "fairly" add equal cost to every competitor, which will not affect the confidence of the result, it is a defeat, sorry about that; The new results with SunOS and Windows tests are published at the same place: http://www.cs.toronto.edu/~zhouqq/po...sort/sort.html As Luke suggested, BSD seems a good choice for scalable and stable consideration. But I also sent an email to the author of qsortG, and he might take a look at the small-range performance problem during the holiday. So if he can help that, then we will have another candidate. By the way, I do spend some time on fighting the win32 gettimeofday() emulation. I would suggest adding a comment like "don't use this method in windows to get high precision time, use elapsed_time() instead" ... Regards, Qingqing ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster |
| |||
| > > > Based on this it seems like we should expose the option to choose the BSD > > > qsort routine at configure time. I have a related question. qsort is only used in the postgres source in a few places. If postgres used an internal implementation instead of the library source it could have implementations that don't use function pointers. This might perform faster for a few reasons. The comparator is much more likely to be eligible for inlining for one. It also opens the door to using different sort algorithms for different applications. There may be some cases where the input is never sorted and the sample size is small so qsort is a good choice, and others where the inputs can be large and using a better algorithm with worse overhead like mergesort might make more sense. Unfortunately there isn't just a single qsort call though. I count 6 comparators in the source tree I have. So perhaps this is a non-starter. Having 6 qsort implementations around sounds pretty sketchy. -- greg ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq |
| |||
| Greg Stark <gsstark@mit.edu> writes: > I have a related question. qsort is only used in the postgres source in a few > places. If postgres used an internal implementation instead of the library > source it could have implementations that don't use function pointers. There are calls to qsort in upwards of 40 different source files; many of those files contain multiple comparator functions. Doesn't look like "a few" to me. But I'll grant that the vast majority are not performance critical. One could imagine putting a custom implementation into only tuplesort.c, say, where you could certainly get rid of one level of function call (qsort_comparetup) by providing a saner API that passes a state pointer as well as a function pointer to the qsort code. Whether that would be worth the trouble is a question for experiment ... regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster |
| |||
| On Thu, 15 Dec 2005, Greg Stark wrote: > > I have a related question. qsort is only used in the postgres source in a few > places. If postgres used an internal implementation instead of the library > source it could have implementations that don't use function pointers. This > might perform faster for a few reasons. The comparator is much more likely to > be eligible for inlining for one. > > It also opens the door to using different sort algorithms for different > applications. There may be some cases where the input is never sorted and the > sample size is small so qsort is a good choice, and others where the inputs > can be large and using a better algorithm with worse overhead like mergesort > might make more sense. > > Unfortunately there isn't just a single qsort call though. I count 6 > comparators in the source tree I have. So perhaps this is a non-starter. > Having 6 qsort implementations around sounds pretty sketchy. > [also with reply to Tom] Both ideas look like doable (or at least testable) for me. I agree that the only interesting pot is in tuplesort.c. For the 2nd idea, for smaller range, we may consider radix sort, which is definitely faster - but this may need some work that enable query optimizer know the *exact* data range. Regards, Qingqing ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster |
| |||
| [Face flushed begin] Thanks for Greg "let" me take a second look at qsortB.c - there is paste-and-copy error there, so when it perform recursive sort, it calls glibc's qsort ... Really sorry, and feel a little bit gun-shy now ... [Face flushed continue] After I re-tested it, now BSD qsort is the obvious winner in most situations. [Face flushed end] Regards, Qingqing |
| ||||
| Qingqing, On 12/15/05 6:33 PM, "Qingqing Zhou" <zhouqq@cs.toronto.edu> wrote: > Thanks for Greg "let" me take a second look at qsortB.c - there is > paste-and-copy error there, so when it perform recursive sort, it calls > glibc's qsort ... Really sorry, and feel a little bit gun-shy now ... > > After I re-tested it, now BSD qsort is the obvious winner in most > situations. :-D Can you post the new results like the last post? - Luke ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |