Unix Technical Forum

Re: Which qsort is used

This is a discussion on Re: Which qsort is used within the pgsql Hackers forums, part of the PostgreSQL category; --> On Mon, 12 Dec 2005, Luke Lonergan wrote: > > Might you have time to implement these within the ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Hackers

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-11-2008, 08:20 AM
Qingqing Zhou
 
Posts: n/a
Default Re: Which qsort is used



On Mon, 12 Dec 2005, Luke Lonergan wrote:
>
> Might you have time to implement these within the testing framework I
> published previously? It has both the NetBSD and qsortG included along with
> a timing routine, etc.
>


Here we go:

http://www.cs.toronto.edu/~zhouqq/po...sort/sort.html

The source tar ball and linux 2.4G gcc 2.96 test results is on the page.
There is a clear loser glibc, not sure qsortB or qsortG which is better.

Regards,
Qingqing

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-11-2008, 08:21 AM
Luke Lonergan
 
Posts: n/a
Default Re: Which qsort is used

Qingqing,

On 12/13/05 10:28 AM, "Qingqing Zhou" <zhouqq@cs.toronto.edu> wrote:

> http://www.cs.toronto.edu/~zhouqq/po...sort/sort.html
>
> The source tar ball and linux 2.4G gcc 2.96 test results is on the page.
> There is a clear loser glibc, not sure qsortB or qsortG which is better.


Great stuff - thanks for doing this.

From the results, it's clear that the scale test makes a huge difference in
the relative performance. I'm wondering if it's an L2 cache effect, as it
seems to occur in that range.

Overall - I'd say that the BSD routine is showing the best overall results
when the scale test is included. The qsortG routine has some significantly
better performance in certain cases at smaller sort set sizes - it could
probably be improved for better L2 use, but BSD is already there.

Based on this it seems like we should expose the option to choose the BSD
qsort routine at configure time.

- Luke



---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-11-2008, 08:21 AM
Qingqing Zhou
 
Posts: n/a
Default Re: Which qsort is used


On Wed, 14 Dec 2005, Luke Lonergan wrote:
>
> Overall - I'd say that the BSD routine is showing the best overall results
> when the scale test is included. The qsortG routine has some significantly
> better performance in certain cases at smaller sort set sizes - it could
> probably be improved for better L2 use, but BSD is already there.
>
> Based on this it seems like we should expose the option to choose the BSD
> qsort routine at configure time.
>


Before we pin down this, I hope more extensive tests on various platforms
could be done. So we could give some suggestions when we should enable the
"--enable-bsdqsort" option. I can post a result on a SunOS machine (but
the problem is that many ppl share this machine) and a windows machine.

Regards,
Qingqing

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-11-2008, 08:22 AM
Jim C. Nasby
 
Posts: n/a
Default Re: Which qsort is used

On Thu, Dec 15, 2005 at 12:10:37AM -0500, Qingqing Zhou wrote:
>
> On Wed, 14 Dec 2005, Luke Lonergan wrote:
> >
> > Overall - I'd say that the BSD routine is showing the best overall results
> > when the scale test is included. The qsortG routine has some significantly
> > better performance in certain cases at smaller sort set sizes - it could
> > probably be improved for better L2 use, but BSD is already there.
> >
> > Based on this it seems like we should expose the option to choose the BSD
> > qsort routine at configure time.
> >

>
> Before we pin down this, I hope more extensive tests on various platforms
> could be done. So we could give some suggestions when we should enable the
> "--enable-bsdqsort" option. I can post a result on a SunOS machine (but
> the problem is that many ppl share this machine) and a windows machine.


I have access to both some (SLOW) ultra5's and a machine running
opensolaris on AMD if testing there would help. I'll need a pointer to a
patch and test-case though...

Oh, I also have access to an old SGI...
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-11-2008, 08:22 AM
Qingqing Zhou
 
Posts: n/a
Default Re: Which qsort is used




On Thu, 15 Dec 2005, Jim C. Nasby wrote:
>
> I have access to both some (SLOW) ultra5's and a machine running
> opensolaris on AMD if testing there would help. I'll need a pointer to a
> patch and test-case though...
>


Thanks! I've patched the program with the following changes:
(1) add gcc-mingw support;
(2) move the check_sort() out of do_sort() - the previous program is
actually measuring the time of qsort plus result verification. Though that
one "fairly" add equal cost to every competitor, which will not affect the
confidence of the result, it is a defeat, sorry about that;

The new results with SunOS and Windows tests are published at the same
place:
http://www.cs.toronto.edu/~zhouqq/po...sort/sort.html

As Luke suggested, BSD seems a good choice for scalable and stable
consideration. But I also sent an email to the author of qsortG, and he
might take a look at the small-range performance problem during the
holiday. So if he can help that, then we will have another candidate.

By the way, I do spend some time on fighting the win32 gettimeofday()
emulation. I would suggest adding a comment like "don't use this method in
windows to get high precision time, use elapsed_time() instead" ...

Regards,
Qingqing


---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-11-2008, 08:22 AM
Greg Stark
 
Posts: n/a
Default Re: Which qsort is used


> > > Based on this it seems like we should expose the option to choose the BSD
> > > qsort routine at configure time.


I have a related question. qsort is only used in the postgres source in a few
places. If postgres used an internal implementation instead of the library
source it could have implementations that don't use function pointers. This
might perform faster for a few reasons. The comparator is much more likely to
be eligible for inlining for one.

It also opens the door to using different sort algorithms for different
applications. There may be some cases where the input is never sorted and the
sample size is small so qsort is a good choice, and others where the inputs
can be large and using a better algorithm with worse overhead like mergesort
might make more sense.

Unfortunately there isn't just a single qsort call though. I count 6
comparators in the source tree I have. So perhaps this is a non-starter.
Having 6 qsort implementations around sounds pretty sketchy.

--
greg


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-11-2008, 08:22 AM
Tom Lane
 
Posts: n/a
Default Re: Which qsort is used

Greg Stark <gsstark@mit.edu> writes:
> I have a related question. qsort is only used in the postgres source in a few
> places. If postgres used an internal implementation instead of the library
> source it could have implementations that don't use function pointers.


There are calls to qsort in upwards of 40 different source files; many
of those files contain multiple comparator functions. Doesn't look like
"a few" to me. But I'll grant that the vast majority are not
performance critical. One could imagine putting a custom implementation
into only tuplesort.c, say, where you could certainly get rid of one
level of function call (qsort_comparetup) by providing a saner API that
passes a state pointer as well as a function pointer to the qsort code.
Whether that would be worth the trouble is a question for experiment ...

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 04-11-2008, 08:22 AM
Qingqing Zhou
 
Posts: n/a
Default Re: Which qsort is used



On Thu, 15 Dec 2005, Greg Stark wrote:

>
> I have a related question. qsort is only used in the postgres source in a few
> places. If postgres used an internal implementation instead of the library
> source it could have implementations that don't use function pointers. This
> might perform faster for a few reasons. The comparator is much more likely to
> be eligible for inlining for one.
>
> It also opens the door to using different sort algorithms for different
> applications. There may be some cases where the input is never sorted and the
> sample size is small so qsort is a good choice, and others where the inputs
> can be large and using a better algorithm with worse overhead like mergesort
> might make more sense.
>
> Unfortunately there isn't just a single qsort call though. I count 6
> comparators in the source tree I have. So perhaps this is a non-starter.
> Having 6 qsort implementations around sounds pretty sketchy.
>


[also with reply to Tom] Both ideas look like doable (or at least
testable) for me. I agree that the only interesting pot is in tuplesort.c.
For the 2nd idea, for smaller range, we may consider radix sort, which is
definitely faster - but this may need some work that enable query
optimizer know the *exact* data range.

Regards,
Qingqing





---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 04-11-2008, 08:22 AM
Qingqing Zhou
 
Posts: n/a
Default Re: Which qsort is used


[Face flushed begin]

Thanks for Greg "let" me take a second look at qsortB.c - there is
paste-and-copy error there, so when it perform recursive sort, it calls
glibc's qsort ... Really sorry, and feel a little bit gun-shy now ...

[Face flushed continue]

After I re-tested it, now BSD qsort is the obvious winner in most
situations.

[Face flushed end]

Regards,
Qingqing


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 04-11-2008, 08:22 AM
Luke Lonergan
 
Posts: n/a
Default Re: Which qsort is used

Qingqing,


On 12/15/05 6:33 PM, "Qingqing Zhou" <zhouqq@cs.toronto.edu> wrote:
> Thanks for Greg "let" me take a second look at qsortB.c - there is
> paste-and-copy error there, so when it perform recursive sort, it calls
> glibc's qsort ... Really sorry, and feel a little bit gun-shy now ...
>
> After I re-tested it, now BSD qsort is the obvious winner in most
> situations.


:-D

Can you post the new results like the last post?

- Luke



---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 11:52 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com