Unix Technical Forum

Calculate optimal value for DS_POOLSIZE

This is a discussion on Calculate optimal value for DS_POOLSIZE within the Informix forums, part of the Database Server Software category; --> Version: IDS 10 I am trying to determine the optimal value for DS_POOLSIZE and DS_HASHSIZE. I looked at the ...


Go Back   Unix Technical Forum > Database Server Software > Informix

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-20-2008, 05:28 PM
mohitanchlia@gmail.com
 
Posts: n/a
Default Calculate optimal value for DS_POOLSIZE

Version: IDS 10

I am trying to determine the optimal value for DS_POOLSIZE and
DS_HASHSIZE. I looked at the onstat -g dsc. I have following question:

1. How do I determine the correct size of DS_POOLSIZE. Should I look
at our update stats and see number of unique columns that we run
update statistics on and that would become the value of DS_POOLSIZE.
2. How do I determine correct value of number of Hash Buckets ? How
many entries per bucket is good to have ?
3. When I look at onstat -g dsc I see number of distribution entries,
should this be always lower than DS_POOLSIZE

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-20-2008, 05:28 PM
david@smooth1.co.uk
 
Posts: n/a
Default Re: Calculate optimal value for DS_POOLSIZE

On 13 Oct, 20:58, mohitanch...@gmail.com wrote:
> Version: IDS 10
>
> I am trying to determine the optimal value for DS_POOLSIZE and
> DS_HASHSIZE. I looked at the onstat -g dsc. I have following question:
>
> 1. How do I determine the correct size of DS_POOLSIZE. Should I look
> at our update stats and see number of unique columns that we run
> update statistics on and that would become the value of DS_POOLSIZE.
> 2. How do I determine correct value of number of Hash Buckets ? How
> many entries per bucket is good to have ?
> 3. When I look at onstat -g dsc I see number of distribution entries,
> should this be always lower than DS_POOLSIZE


DS_POOLSIZE is the number of entrires in the data distribution cache.
Therefore select count(*) from sysdistrib for each database and that
will be the
maximum possible value you can use. Run onstat -g dsc to see how many
you are using at the moment and choose a sensible value.

Be aware that with IDS 10.00.FC4 on Linux I found having this too
large can have a large impact on performance. Be aware the number of
data distibution
entries can increase and people add tables and indexes to the database
schema.

DS_HASHSZIE should be set so you have plenty of hash buckets to avoid
contention for each has bucket. Use 31,63,127 and see which is best
for your
situation, perhaps try some other values in between e.g. 95.

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-20-2008, 05:28 PM
Art S. Kagel
 
Posts: n/a
Default Re: Calculate optimal value for DS_POOLSIZE

On Oct 13, 3:58 pm, mohitanch...@gmail.com wrote:
> Version: IDS 10
>
> I am trying to determine the optimal value for DS_POOLSIZE and
> DS_HASHSIZE. I looked at the onstat -g dsc. I have following question:
>
> 1. How do I determine the correct size of DS_POOLSIZE. Should I look
> at our update stats and see number of unique columns that we run
> update statistics on and that would become the value of DS_POOLSIZE.
> 2. How do I determine correct value of number of Hash Buckets ? How
> many entries per bucket is good to have ?
> 3. When I look at onstat -g dsc I see number of distribution entries,
> should this be always lower than DS_POOLSIZE


IB that Mark and David are misinterpreting the use and meaning of
DS_HASHSIZE and DS_POOLSIZE, or at least not explaining it clearly.

DS_HASHSIZE is the number of hash buckets for distributions and
DS_POOLSIZE is the maximum number of slots in each bucket for when
multiple distributions hash to the same bucket. DS_HASHSIZE must be a
prime number, DS_POOLSIZE should be relatively small (I've also been
told that it's better if it is non-prime).

Columns are hashed on table and column name so if you have lots of
databases with the same tables and columns in them then you will need
a larger value for DS_POOLSIZE to match the level of hash clashes that
you have. Ideally - for optimum performance of the hash - each
distribution should hash to it's own bucket and the number of slots
per bucket should be very low. Since we cannot adjust the hash
algorithm itself, the only control we have is DS_HASHSIZE. Keeping
David's experience with Linux and large values in mind (David what is
a 'large' value in your testing?) I'd set DS_HASHSIZE to at a prime
that's larger than the number of unique table names to minimize
clashes and set DS_POOLSIZE large enough that the product of the two
is about 1.5 times the number of distributions in SYSDISTRIB for all
databases - allowing a bit more if you know the number of databases/
tables/columns will be growing over time to put off retuning for a
while.

Art S. Kagel


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-20-2008, 05:28 PM
david@smooth1.co.uk
 
Posts: n/a
Default Re: Calculate optimal value for DS_POOLSIZE

On 15 Oct, 14:49, "Art S. Kagel" <art.ka...@gmail.com> wrote:
> On Oct 13, 3:58 pm, mohitanch...@gmail.com wrote:
>
> > Version: IDS 10

>
> > I am trying to determine the optimal value for DS_POOLSIZE and
> > DS_HASHSIZE. I looked at the onstat -g dsc. I have following question:

>
> > 1. How do I determine the correct size of DS_POOLSIZE. Should I look
> > at our update stats and see number of unique columns that we run
> > update statistics on and that would become the value of DS_POOLSIZE.
> > 2. How do I determine correct value of number of Hash Buckets ? How
> > many entries per bucket is good to have ?
> > 3. When I look at onstat -g dsc I see number of distribution entries,
> > should this be always lower than DS_POOLSIZE

>
> IB that Mark and David are misinterpreting the use and meaning of
> DS_HASHSIZE and DS_POOLSIZE, or at least not explaining it clearly.
>
> DS_HASHSIZE is the number of hash buckets for distributions and
> DS_POOLSIZE is the maximum number of slots in each bucket for when
> multiple distributions hash to the same bucket. DS_HASHSIZE must be a
> prime number, DS_POOLSIZE should be relatively small (I've also been
> told that it's better if it is non-prime).
>
> Columns are hashed on table and column name so if you have lots of
> databases with the same tables and columns in them then you will need
> a larger value for DS_POOLSIZE to match the level of hash clashes that
> you have. Ideally - for optimum performance of the hash - each
> distribution should hash to it's own bucket and the number of slots
> per bucket should be very low. Since we cannot adjust the hash
> algorithm itself, the only control we have is DS_HASHSIZE. Keeping
> David's experience with Linux and large values in mind (David what is
> a 'large' value in your testing?) I'd set DS_HASHSIZE to at a prime
> that's larger than the number of unique table names to minimize
> clashes and set DS_POOLSIZE large enough that the product of the two
> is about 1.5 times the number of distributions in SYSDISTRIB for all
> databases - allowing a bit more if you know the number of databases/
> tables/columns will be growing over time to put off retuning for a
> while.
>
> Art S. Kagel


I was deliberately vague about what "large" is since I can't remember,
maybe DS_POOLSIZE*DS_HASHSIZE=40000.

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 08:47 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com