Unix Technical Forum

Re: choose_bitmap_and again (was Re: [PERFORM] Strangely Variable Query Performance)

This is a discussion on Re: choose_bitmap_and again (was Re: [PERFORM] Strangely Variable Query Performance) within the pgsql Hackers forums, part of the PostgreSQL category; --> > -----Original Message----- > From: pgsql-hackers-owner@postgresql.org [mailto gsql-hackers- > owner@postgresql.org ] On Behalf Of Alvaro Herrera > Sent: Friday, ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Hackers

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-12-2008, 09:18 AM
Dann Corbit
 
Posts: n/a
Default Re: choose_bitmap_and again (was Re: [PERFORM] Strangely Variable Query Performance)

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org [mailtogsql-hackers-
> owner@postgresql.org] On Behalf Of Alvaro Herrera
> Sent: Friday, April 13, 2007 4:24 PM
> To: Tom Lane
> Cc: pgsql-hackers@postgreSQL.org; PostgreSQL Performance; Steve
> Subject: Re: [HACKERS] choose_bitmap_and again (was Re: [PERFORM]
> Strangely Variable Query Performance)
>
> Tom Lane wrote:
>
> > One idea I thought about was to sort by index scan cost, using
> > selectivity only as a tiebreaker for cost, rather than the other way
> > around as is currently done. This seems fairly plausible because
> > indexscans that are cheaper than other indexscans likely return

fewer
> > rows too, and so selectivity is already accounted for to some extent

---
> > at least you can't have an enormously worse selectivity at lower

cost,
> > whereas Steve's example proves it doesn't work the other way. But

I'm
> > worried about breaking the reasoning about redundant indexes that's
> > mentioned in the comments.
> >
> > Another alternative that would respond to the immediate problem is

to
> > maintain the current sort order, but as we come to each index,

consider
> > using that one alone, and throw away whatever AND we might have

built up
> > if that one alone beats the AND-so-far. This seems more

conservative,
> > as it's unlikely to break any cases that work well now, but on the

other
> > hand it feels like plastering another wart atop a structure that's
> > already rather rickety.
> >
> > Has anyone got any thoughts about the best way to do this?

>
> How about doing both: sort the index by index scan cost; then pick the
> first index on the list and start adding indexes when they lower the
> cost. When adding each index, consider it by itself against the
> already stacked indexes. If the cost is lower, put this index at the
> top of the list, and restart the algorithm (after the sorting step of
> course).
>
> I think the concern about condition redundancy should be attacked
> separately. How about just comparing whether they have common

prefixes
> of conditions? I admit I don't understand what would happen with
> indexes defined like (lower(A), B, C) versus (A, B) for example.


Instead of sorting, I suggest the quickselect() algorithm, which is
O(n).
Probably, if the list is small, it won't matter much, but it might offer
some tangible benefit.

Here is an example of the algorithm:

#include <stdlib.h>
typedef double Etype; /* Season to taste. */

extern Etype RandomSelect(Etype * A, size_t p, size_t r, size_t i);
extern size_t RandRange(size_t a, size_t b);
extern size_t RandomPartition(Etype * A, size_t p, size_t r);
extern size_t Partition(Etype * A, size_t p, size_t r);

/*
**
** In the following code, every reference to CLR means:
**
** "Introduction to Algorithms"
** By Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest
** ISBN 0-07-013143-0
*/


/*
** CLR, page 187
*/
Etype RandomSelect(Etype A[], size_t p, size_t r, size_t i)
{
size_t q,
k;
if (p == r)
return A[p];
q = RandomPartition(A, p, r);
k = q - p + 1;

if (i <= k)
return RandomSelect(A, p, q, i);
else
return RandomSelect(A, q + 1, r, i - k);
}

size_t RandRange(size_t a, size_t b)
{
size_t c = (size_t) ((double) rand() / ((double) RAND_MAX + 1) * (b
- a));
return c + a;
}

/*
** CLR, page 162
*/
size_t RandomPartition(Etype A[], size_t p, size_t r)
{
size_t i = RandRange(p, r);
Etype Temp;
Temp = A[p];
A[p] = A[i];
A[i] = Temp;
return Partition(A, p, r);
}

/*
** CLR, page 154
*/
size_t Partition(Etype A[], size_t p, size_t r)
{
Etype x,
temp;
size_t i,
j;

x = A[p];
i = p - 1;
j = r + 1;

for (; {
do {
j--;
} while (!(A[j] <= x));
do {
i++;
} while (!(A[i] >= x));
if (i < j) {
temp = A[i];
A[i] = A[j];
A[j] = temp;
} else
return j;
}
}

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 05:30 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com