This is a discussion on Calculate optimal value for DS_POOLSIZE within the Informix forums, part of the Database Server Software category; --> Version: IDS 10 I am trying to determine the optimal value for DS_POOLSIZE and DS_HASHSIZE. I looked at the ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Version: IDS 10 I am trying to determine the optimal value for DS_POOLSIZE and DS_HASHSIZE. I looked at the onstat -g dsc. I have following question: 1. How do I determine the correct size of DS_POOLSIZE. Should I look at our update stats and see number of unique columns that we run update statistics on and that would become the value of DS_POOLSIZE. 2. How do I determine correct value of number of Hash Buckets ? How many entries per bucket is good to have ? 3. When I look at onstat -g dsc I see number of distribution entries, should this be always lower than DS_POOLSIZE |
| |||
| On 13 Oct, 20:58, mohitanch...@gmail.com wrote: > Version: IDS 10 > > I am trying to determine the optimal value for DS_POOLSIZE and > DS_HASHSIZE. I looked at the onstat -g dsc. I have following question: > > 1. How do I determine the correct size of DS_POOLSIZE. Should I look > at our update stats and see number of unique columns that we run > update statistics on and that would become the value of DS_POOLSIZE. > 2. How do I determine correct value of number of Hash Buckets ? How > many entries per bucket is good to have ? > 3. When I look at onstat -g dsc I see number of distribution entries, > should this be always lower than DS_POOLSIZE DS_POOLSIZE is the number of entrires in the data distribution cache. Therefore select count(*) from sysdistrib for each database and that will be the maximum possible value you can use. Run onstat -g dsc to see how many you are using at the moment and choose a sensible value. Be aware that with IDS 10.00.FC4 on Linux I found having this too large can have a large impact on performance. Be aware the number of data distibution entries can increase and people add tables and indexes to the database schema. DS_HASHSZIE should be set so you have plenty of hash buckets to avoid contention for each has bucket. Use 31,63,127 and see which is best for your situation, perhaps try some other values in between e.g. 95. |
| |||
| On Oct 13, 3:58 pm, mohitanch...@gmail.com wrote: > Version: IDS 10 > > I am trying to determine the optimal value for DS_POOLSIZE and > DS_HASHSIZE. I looked at the onstat -g dsc. I have following question: > > 1. How do I determine the correct size of DS_POOLSIZE. Should I look > at our update stats and see number of unique columns that we run > update statistics on and that would become the value of DS_POOLSIZE. > 2. How do I determine correct value of number of Hash Buckets ? How > many entries per bucket is good to have ? > 3. When I look at onstat -g dsc I see number of distribution entries, > should this be always lower than DS_POOLSIZE IB that Mark and David are misinterpreting the use and meaning of DS_HASHSIZE and DS_POOLSIZE, or at least not explaining it clearly. DS_HASHSIZE is the number of hash buckets for distributions and DS_POOLSIZE is the maximum number of slots in each bucket for when multiple distributions hash to the same bucket. DS_HASHSIZE must be a prime number, DS_POOLSIZE should be relatively small (I've also been told that it's better if it is non-prime). Columns are hashed on table and column name so if you have lots of databases with the same tables and columns in them then you will need a larger value for DS_POOLSIZE to match the level of hash clashes that you have. Ideally - for optimum performance of the hash - each distribution should hash to it's own bucket and the number of slots per bucket should be very low. Since we cannot adjust the hash algorithm itself, the only control we have is DS_HASHSIZE. Keeping David's experience with Linux and large values in mind (David what is a 'large' value in your testing?) I'd set DS_HASHSIZE to at a prime that's larger than the number of unique table names to minimize clashes and set DS_POOLSIZE large enough that the product of the two is about 1.5 times the number of distributions in SYSDISTRIB for all databases - allowing a bit more if you know the number of databases/ tables/columns will be growing over time to put off retuning for a while. Art S. Kagel |
| ||||
| On 15 Oct, 14:49, "Art S. Kagel" <art.ka...@gmail.com> wrote: > On Oct 13, 3:58 pm, mohitanch...@gmail.com wrote: > > > Version: IDS 10 > > > I am trying to determine the optimal value for DS_POOLSIZE and > > DS_HASHSIZE. I looked at the onstat -g dsc. I have following question: > > > 1. How do I determine the correct size of DS_POOLSIZE. Should I look > > at our update stats and see number of unique columns that we run > > update statistics on and that would become the value of DS_POOLSIZE. > > 2. How do I determine correct value of number of Hash Buckets ? How > > many entries per bucket is good to have ? > > 3. When I look at onstat -g dsc I see number of distribution entries, > > should this be always lower than DS_POOLSIZE > > IB that Mark and David are misinterpreting the use and meaning of > DS_HASHSIZE and DS_POOLSIZE, or at least not explaining it clearly. > > DS_HASHSIZE is the number of hash buckets for distributions and > DS_POOLSIZE is the maximum number of slots in each bucket for when > multiple distributions hash to the same bucket. DS_HASHSIZE must be a > prime number, DS_POOLSIZE should be relatively small (I've also been > told that it's better if it is non-prime). > > Columns are hashed on table and column name so if you have lots of > databases with the same tables and columns in them then you will need > a larger value for DS_POOLSIZE to match the level of hash clashes that > you have. Ideally - for optimum performance of the hash - each > distribution should hash to it's own bucket and the number of slots > per bucket should be very low. Since we cannot adjust the hash > algorithm itself, the only control we have is DS_HASHSIZE. Keeping > David's experience with Linux and large values in mind (David what is > a 'large' value in your testing?) I'd set DS_HASHSIZE to at a prime > that's larger than the number of unique table names to minimize > clashes and set DS_POOLSIZE large enough that the product of the two > is about 1.5 times the number of distributions in SYSDISTRIB for all > databases - allowing a bit more if you know the number of databases/ > tables/columns will be growing over time to put off retuning for a > while. > > Art S. Kagel I was deliberately vague about what "large" is since I can't remember, maybe DS_POOLSIZE*DS_HASHSIZE=40000. |
| Thread Tools | |
| Display Modes | |
|
|