Unix Technical Forum

storage optimization for read-only table

This is a discussion on storage optimization for read-only table within the MySQL forums, part of the Database Server Software category; --> Hello, According to MySQL document: The expected row length for dynamic-sized rows is calculated using the following expression: 3 ...


Go Back   Unix Technical Forum > Database Server Software > MySQL

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 02-28-2008, 08:30 AM
mhuiyang@yahoo.com
 
Posts: n/a
Default storage optimization for read-only table

Hello,

According to MySQL document:
The expected row length for dynamic-sized rows is calculated using
the following expression:
3
+ (number of columns + 7) / 8
+ (number of char columns)
+ (packed size of numeric columns)
+ (length of strings)
+ (number of NULL columns + 7) / 8

However, I found that this formula is not accurate for the following
trivial case.
mysql> desc vc2;
+-------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| c | varchar(64) | YES | | NULL | |
+-------+-------------+------+-----+---------+-------+
1 row in set (0.00 sec)

mysql> select * from vc2;
+------+
| c |
+------+
| '' |
| '' |
| '' |
| '' |
| '' |
| '' |
| '' |
| '' |
| '' |
| '' |
+------+
10 rows in set (0.00 sec)

mysql> quit
Bye
$ ls -lt vc2*
-rw-rw---- 1 mysql users 200 Nov 4 11:02 vc2.MYD
-rw-rw---- 1 mysql users 1024 Nov 4 11:02 vc2.MYI
-rw-rw---- 1 mysql users 8554 Nov 4 11:02 vc2.frm
$ od -x vc2.MYD
0000000 0003 0b05 fe01 2702 0027 0000 0000 0000
0000020 0000 0000 0003 0b05 fe01 2702 0027 0000
0000040 0000 0000 0000 0000 0003 0b05 fe01 2702
0000060 0027 0000 0000 0000 0000 0000 0003 0b05
0000100 fe01 2702 0027 0000 0000 0000 0000 0000
0000120 0003 0b05 fe01 2702 0027 0000 0000 0000
0000140 0000 0000 0003 0b05 fe01 2702 0027 0000
0000160 0000 0000 0000 0000 0003 0b05 fe01 2702
0000200 0027 0000 0000 0000 0000 0000 0003 0b05
0000220 fe01 2702 0027 0000 0000 0000 0000 0000
0000240 0003 0b05 fe01 2702 0027 0000 0000 0000
0000260 0000 0000 0003 0b05 fe01 2702 0027 0000
0000300 0000 0000 0000 0000
0000310

Note that for 10 empty string values, the disk storage is 200 bytes. If
we use the fomula, each row should have a length of
3
+ (1 + 7) / 8
+ (0)
+ (0)
+ (1)
+ (1 + 7) / 8
= 6
So for 10 rows, the expected storage size is 60 bytes.

Where does the extra 140 bytes used for?

I have a case where the table is used only for archiving so it is a
write-
once and read-only table. Is there any special syntax for the create
table
statement to optimize disk storage without doing compression?

Thanks,
Minghui

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 02-28-2008, 08:31 AM
Axel Schwenke
 
Posts: n/a
Default Re: storage optimization for read-only table

mhuiyang@yahoo.com wrote:
>
> According to MySQL document:


What document? The most detailed document (besides the source code) is
the internals manual: http://dev.mysql.com/doc/internals/en/index.html

> mysql> desc vc2;
> +-------+-------------+------+-----+---------+-------+
>| Field | Type | Null | Key | Default | Extra |
> +-------+-------------+------+-----+---------+-------+
>| c | varchar(64) | YES | | NULL | |
> +-------+-------------+------+-----+---------+-------+


> mysql> select * from vc2;
> +------+
>| c |
> +------+
>| '' |

....
> +------+
> 10 rows in set (0.00 sec)
>
> $ ls -lt vc2*
> -rw-rw---- 1 mysql users 200 Nov 4 11:02 vc2.MYD
> -rw-rw---- 1 mysql users 1024 Nov 4 11:02 vc2.MYI
> -rw-rw---- 1 mysql users 8554 Nov 4 11:02 vc2.frm


> $ od -x vc2.MYD

[snip]

do not format with -x, use -t x1 for better formatting:

~ $od -Ax -t x1 .../t1.MYD
000000 03 00 04 0c fe 02 27 27 00 00 00 00 00 00 00 00
000010 00 00 00 00 03 00 04 0c fe 02 27 27 00 00 00 00
000020 00 00 00 00 00 00 00 00 03 00 04 0c fe 02 27 27
000030 00 00 00 00 00 00 00 00 00 00 00 00 03 00 04 0c
000040 fe 02 27 27 00 00 00 00 00 00 00 00 00 00 00 00
000050 03 00 04 0c fe 02 27 27 00 00 00 00 00 00 00 00
000060 00 00 00 00 03 00 04 0c fe 02 27 27 00 00 00 00
000070 00 00 00 00 00 00 00 00 03 00 04 0c fe 02 27 27
000080 00 00 00 00 00 00 00 00 00 00 00 00 03 00 04 0c
000090 fe 02 27 27 00 00 00 00 00 00 00 00 00 00 00 00
0000a0 03 00 04 0c fe 02 27 27 00 00 00 00 00 00 00 00
0000b0 00 00 00 00 03 00 04 0c fe 02 27 27 00 00 00 00
0000c0 00 00 00 00 00 00 00 00


> Note that for 10 empty string values, the disk storage is 200 bytes.


Right. In fact there is a minumum row size of 2+2+16 = 20 bytes if you
use dynamic row format. Also your strings are not empty but contain the
value "''" (two single quote characters).

So the first record is from 0x00 to 0x13.

03 00 ... is the record header
04 ...... number of bytes actually used by the record
0c ...... number of unused bytes, the sum of used + unused is >= 0x10
--- used bytes start here
fe ...... overflow pointer + flags
02 ...... length of the VARCHAR column
27 27 ... the data itself: 0x27 = "'"
--- unused bytes start here
rest .... filler bytes 12x 00


XL
--
Axel Schwenke, Senior Software Developer, MySQL AB

Online User Manual: http://dev.mysql.com/doc/refman/5.0/en/
MySQL User Forums: http://forums.mysql.com/
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 02-28-2008, 08:31 AM
mhuiyang@yahoo.com
 
Posts: n/a
Default Re: storage optimization for read-only table

Thanks a lot for the answer.

>
> What document? The most detailed document (besides the source code) is
> the internals manual: http://dev.mysql.com/doc/internals/en/index.html
>

http://dev.mysql.com/doc/refman/4.1/...ic-format.html
Section 14.1.3.2. Dynamic Table Characteristics
I checked 5.0 and 5.1 manuals and they have the same formula.

> do not format with -x, use -t x1 for better formatting:


That does look better.

> Right. In fact there is a minumum row size of 2+2+16 = 20 bytes if you
> use dynamic row format. Also your strings are not empty but contain the
> value "''" (two single quote characters).
>


You are right. I was using "load data infile" and I thought that I had
put empty strings by writing '' in the file.

I have another question: when adding a varchar column that mostly
stores empty string (or NULL if that saves more space) does not cost
more disk space? I guess it has to do with the existing number of
columns and check whether it is on the verge of 8's multiple or not. It
appears that in the worst case, adding a nullable varchar column that
stores null or empty string will cost 1 byte per row. Am I right?

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 02-28-2008, 08:31 AM
Axel Schwenke
 
Posts: n/a
Default Re: storage optimization for read-only table

mhuiyang@yahoo.com wrote:
>>
>> What document? The most detailed document (besides the source code) is
>> the internals manual: http://dev.mysql.com/doc/internals/en/index.html
>>

> http://dev.mysql.com/doc/refman/4.1/...ic-format.html
> Section 14.1.3.2. Dynamic Table Characteristics
> I checked 5.0 and 5.1 manuals and they have the same formula.


Hmm. This seems to be inaccurate at best. Could you please file
a documentation bug at http://bugs.mysql.com ?

>> In fact there is a minumum row size of 2+2+16 = 20 bytes if you
>> use dynamic row format.

>
> I have another question: when adding a varchar column that mostly
> stores empty string (or NULL if that saves more space) does not cost
> more disk space? I guess it has to do with the existing number of
> columns and check whether it is on the verge of 8's multiple or not. It
> appears that in the worst case, adding a nullable varchar column that
> stores null or empty string will cost 1 byte per row. Am I right?


Each nullable column costs you another bit in the flags field. Since
space in the record is allocated in bytes, you can have up to 7
unused bits here. If a nullable column *is* NULL, it does not take
any additional space. However there is still the minimum length of
16 bytes for the data part. So your original table with a single,
nullable VARCHAR(64) column will yield 20 bytes/row as long as

- the column is NULL
- the column contains no more than 14 characters


HTH, XL
--
Axel Schwenke, Senior Software Developer, MySQL AB

Online User Manual: http://dev.mysql.com/doc/refman/5.0/en/
MySQL User Forums: http://forums.mysql.com/
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 07:10 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com