This is a discussion on Re: "index size" for a particular table by querying the sysstem tables within the Informix forums, part of the Database Server Software category; --> --0-683474517-1097354829=:99715 Content-Type: text/plain; charset=us-ascii Jane, Without doing any work or research on this I wonder if the difference is ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| --0-683474517-1097354829=:99715 Content-Type: text/plain; charset=us-ascii Jane, Without doing any work or research on this I wonder if the difference is in the 4 byte overhead for each index. This theory stems from the large chunk capabilities of 9.4. I have no basis for this theory it was just a passing thought. "June C. Hunt" <june.c.hunt@gmail.com> wrote: On Thu, 07 Oct 2004 13:52:20 -0500, REBELLO, Rulesh Felix wrote: > Hello Group: > > Can I find the "index size" for a particular table by querying the > sysstem tables .. ..??? > > It is the silimar outout which we get in the dbschema utility fo the ifo > in the .sql file of a dbexport. I might only be able to help a small bit. The following was posted to c.d.i. by Rajib Sarkar of IBM back on 2002-06-10: Step 1: ========= select sum(c.collength) from syscolumns c, sysindexes i where i.tabid = c.tabid and i.tabid = and (c.colno = ABS(i.part1) or c.colno = ABS(i.part2) or c.colno = ABS(i.part3) or c.colno = ABS(i.part4) or c.colno = ABS(i.part5) or c.colno = ABS(i.part6) or c.colno = ABS(i.part7) or c.colno = ABS(i.part8) or c.colno = ABS(i.part9) or c.colno = ABS(i.part10) or c.colno = ABS(i.part11) or c.colno = ABS(i.part12) or c.colno = ABS(i.part13) or c.colno = ABS(i.part14) or c.colno = ABS(i.part15) or c.colno = ABS(i.part16)); This will just give you the width of the index, now you would require to add the overhead to it, and the formula would be: Step 2 ========= select count(*) from sysindexes where tabid = ; index length = ( + 4 * ) * 3/2; I've tested this formula against IDS 9.20.UC3 and get the index size that I expect. I've also tested this formula against IDS 9.40.FC2 and it does not come out to be the value I would expect. I'm not sure if the difference is due entirely to the version of IDS, in some portion to the 32 versus 64-bit difference, a combination of the two, or something completely different. I've spent some time looking at the Administrator's Guide (Art Kagel referenced that in a post that dealt with this same issue) and the Performance Guide (the Admin Guide references the Performance Guide), but haven't found an answer yet. I will continue to look for a formula that will work with IDS 9.4 (I can be stubborn that way), but would be happy to end my search if anyone else cares to post the correct answer. If not, I'll post what I find if and when I get a chance to get back to this.... -- June Hunt P.S. There appears to have been a minor glitch with the list... Better now? --------------------------------- Do you Yahoo!? Yahoo! Mail - You care about security. So do we. --0-683474517-1097354829=:99715 Content-Type: text/html; charset=us-ascii <DIV>Jane,</DIV> <DIV> </DIV> <DIV>Without doing any work or research on this I wonder if the difference is in the 4 byte overhead for each index. This theory stems from the large chunk capabilities of 9.4. I have no basis for this theory it was just a passing thought.</DIV> <DIV><BR><B><I>"June C. Hunt" <june.c.hunt@gmail.com></I></B> wrote:</DIV> <BLOCKQUOTE class=replbq style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #1010ff 2px solid">On Thu, 07 Oct 2004 13:52:20 -0500, REBELLO, Rulesh Felix<BR><RREBELLO@FSL.ORG.JM>wrote:<BR>> Hello Group:<BR>><BR>> Can I find the "index size" for a particular table by querying the<BR>> sysstem tables .. ..???<BR>><BR>> It is the silimar outout which we get in the dbschema utility fo <BR>the ifo<BR>> in the .sql file of a dbexport.<BR><BR>I might only be able to help a small bit. The following was posted to<BR>c.d.i. by Rajib Sarkar of IBM back on 2002-06-10:<BR><BR>Step 1:<BR>=========<BR>select sum(c.collength)<BR>from syscolumns c, sysindexes i<BR>where i.tabid = c.tabid<BR>and i.tabid = <YOUR tabid><BR>and (c.colno = ABS(i.part1) or<BR>c.colno = ABS(i.part2) or<BR>c.colno = ABS(i.part3) or<BR>c.colno = ABS(i.part4) or<BR>c.colno = ABS(i.part5) or<BR>c.colno = ABS(i.part6) or<BR>c.colno = ABS(i.part7) or<BR>c.colno = ABS(i.part8) or<BR>c.colno = ABS(i.part9) or<BR>c.colno = ABS(i.part10) or<BR>c.colno = ABS(i.part11) or<BR>c.colno = ABS(i.part12) or<BR>c.colno = ABS(i.part13) or<BR>c.colno = ABS(i.part14) or<BR>c.colno = ABS(i.part15) or<BR>c.colno = ABS(i.part16));<BR><BR>This will just give you the width of the index, now you would require to<BR>add the overhead to it, and the formula would be:<BR>Step 2<BR>=========<BR>select count(*) from sysindexes where tabid = <TABID>;<BR><BR>index length = (<RESULT 1 Step of> + 4 * <RESULT Step of 2>) * 3/2;<BR><BR>I've tested this formula against IDS 9.20.UC3 and get the index size<BR>that I expect. I've also tested this formula against IDS 9.40.FC2 and<BR>it does not come out to be the value I would expect. I'm not sure if<BR>the difference is due entirely to the version of IDS, in some portion<BR>to the 32 versus 64-bit difference, a combination of the two, or<BR>something completely different. I've spent some time looking at the<BR>Administrator's Guide (Art Kagel referenced that in a post that dealt<BR>with this same issue) and the Performance Guide (the Admin Guide<BR>references the Performance Guide), but haven't found an answer yet.<BR><BR>I will continue to look for a formula that will work with IDS 9.4 (I<BR>can be stubborn that way), but would be happy to end my search if<BR>anyone else cares to post the correct answer. If not, I'll post what<BR>I find if and when I get a chance to get back to this....<BR><BR>--<BR>June Hunt<BR><BR>P.S. There appears to have been a minor glitch with the list... Better now?<BR></BLOCKQUOTE><p> <hr size=1>Do you Yahoo!?<br> <a href="http://us.rd.yahoo.com/mail_us/taglines/security/*http://promotions.yahoo.com/new_mail/static/protection.html">Yahoo! Mail</a> - You care about security. So do we. --0-683474517-1097354829=:99715-- sending to informix-list |
| ||||
| DL Redden wrote: > Without doing any work or research on this I wonder if the > difference is in the 4 byte overhead for each index. This theory > stems from the large chunk capabilities of 9.4. I have no basis for > this theory it was just a passing thought. The '+4' factor is a bit questionable in general. For non-fragmented tables, the value should be '+5' - 4 bytes for the rowid plus 1 byte for the delete marker. If you have a fragmented table, it needs to be '+9' (4 bytes fragid, 4 bytes rowid, 1 byte delete marker). But there are a couple of more deep-seated problems... > "June C. Hunt" <june.c.hunt@gmail.com> wrote: > REBELLO, Rulesh Felix wrote [with fixups]: >>Can I find the "index size" for a particular table by querying the >>system tables .. ..??? >> >> It is the similar output which we get in the dbschema utility or >> the info in the .sql file of a dbexport. > > I might only be able to help a small bit. The following was posted to > c.d.i. by Rajib Sarkar of IBM back on 2002-06-10: > > Step 1: > ========= > select sum(c.collength) > from syscolumns c, sysindexes i > where i.tabid = c.tabid > and i.tabid = > and (c.colno = ABS(i.part1) or > c.colno = ABS(i.part2) or > c.colno = ABS(i.part3) or > c.colno = ABS(i.part4) or > c.colno = ABS(i.part5) or > c.colno = ABS(i.part6) or > c.colno = ABS(i.part7) or > c.colno = ABS(i.part8) or > c.colno = ABS(i.part9) or > c.colno = ABS(i.part10) or > c.colno = ABS(i.part11) or > c.colno = ABS(i.part12) or > c.colno = ABS(i.part13) or > c.colno = ABS(i.part14) or > c.colno = ABS(i.part15) or > c.colno = ABS(i.part16)); The use of ABS is good - in the rare cases where you have an index sorted into descending order (or a key within an index sorted in descending order), the column number is negative. Unfortunately, the use of SUM(c.collength) is woefully wrong. It will work correctly for columns of the types: CHAR, NCHAR, VARCHAR(n), VARCHAR(n,0), NVARCHAR(n), NVARCHAR(n,0), INT, SMALLINT, FLOAT, SMALLFLOAT, SERIAL, SERIAL8, INT8, BOOLEAN, DATE. I'll take LVARCHAR under advisement - offhand, I'm not sure. The problem children are DECIMAL, MONEY, DATETIME, INTERVAL and [N]VARCHAR(n,m) for m>0. In each case, the ColLength value is encoded, and the encoding is moderately complex. The easiest is the VARCHAR values; the lower byte is the maximum length (n), but the upper byte contains the minimum length (m). So, a VARCHAR(40,10) is encoded as 10 * 256 + 40 = 2600, and adding that into your index size gives an erroneous value. DECIMAL and MONEY values are encoded with scale and precision - so DECIMAL(16,4) is 16 * 256 + 4 = 4100 (he says, working from a header file and not from an actual database; there's an outside chance any given calculation is wrong, but the overall principle remains accurate). The actual length on disk - and in index - for DECIMAL(16,4) is 9 bytes. The encoding of DATETIME is still more complex - and of INTERVAL more complex yet. But the numbers in the collength column are certainly not simply addable as shown in the query above. > This will just give you the width of the index, now you would require to > add the overhead to it, and the formula would be: > Step 2 > ========= > select count(*) from sysindexes where tabid = ; > > index length = ( + 4 * ) * 3/2; The second step is interesting - I can't think why there would be a 3/2 factor in the index size - unless the number of rows was included. Then the computation would be for the total size in bytes used by an index, and you do need a fudge factor in there, and 1.5 is a possible value to use. But that isn't what was requested; the request was for the width of an index as reported by dbschema or dbexport, and that is the size of one key entry, and that does not need a fudge factor. There is no difference between 32-bit and 64-bit indexes. > I've tested this formula against IDS 9.20.UC3 and get the index size > that I expect. I've also tested this formula against IDS 9.40.FC2 and > it does not come out to be the value I would expect. I'm not sure if > the difference is due entirely to the version of IDS, in some portion > to the 32 versus 64-bit difference, a combination of the two, or > something completely different. I've spent some time looking at the > Administrator's Guide (Art Kagel referenced that in a post that dealt > with this same issue) and the Performance Guide (the Admin Guide > references the Performance Guide), but haven't found an answer yet. > > I will continue to look for a formula that will work with IDS 9.4 (I > can be stubborn that way), but would be happy to end my search if > anyone else cares to post the correct answer. If not, I'll post what > I find if and when I get a chance to get back to this.... -- Jonathan Leffler #include <disclaimer.h> Email: jleffler@earthlink.net, jleffler@us.ibm.com Guardian of DBD::Informix v2003.04 -- http://dbi.perl.org/ |