Recently I became interested in a question: does data bulk-loaded into a table with the LOAD utility consume the same disk space as data loaded with the IMPORT utility? The answer turned out to be no!

Here is a nutshell description of the test. The testing was done on "DB2/LINUX 8.2.3".

Tables for tests:
  F4106 has 5203 rows, 32 columns.
  F42199 has 1399252 rows, 245 columns.

Load command:
  load client from '/home/share/tabXXXX.ixf' of ixf insert into proddta.fXXXX NONRECOVERABLE

Import command:
  import from '/home/share/tabXXXX.ixf' of ixf insert into proddta.fXXXX

Between loads I used the following commands to truncate the table under investigation and clear its statistics:

  ALTER TABLE PRODDTA.fXXXX ACTIVATE NOT LOGGED INITIALLY WITH EMPTY TABLE;
  RUNSTATS on table PRODDTA.fXXXX

After each load I used the same RUNSTATS as above to get the "used pages" counter (npages) in syscat.tables.

Here are the results:

syscat.tables, npages:
----------------------
TABLE     IMPORT    LOAD
-------   -------   -------
F4106     372       401
F42199    694862    700326

One can see that the disk space occupied by data loaded with the LOAD utility is slightly greater than its counterpart.

If anybody understands this, please explain.

Cheers,
--
Konstantin Andreev.
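To put "slightly greater" into numbers, the relative page overhead can be worked out from the npages figures in the post. This is only a small arithmetic sketch in Python; the function name `overhead_percent` is illustrative and not part of any DB2 tooling.

```python
def overhead_percent(import_pages, load_pages):
    """Extra pages used by LOAD relative to IMPORT, in percent."""
    return 100.0 * (load_pages - import_pages) / import_pages


# npages values reported in the post: (IMPORT, LOAD)
results = {"F4106": (372, 401), "F42199": (694862, 700326)}

for table, (imp, load) in results.items():
    print(f"{table}: {overhead_percent(imp, load):.2f}% more pages with LOAD")
```

The small table shows an overhead of roughly 7.8%, the large one under 1%, so the relative difference is far bigger for the small table.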
| |||
"Konstantin Andreev" <plafcow4odno@datatech.ru> wrote in message news:e8h0bi$o6u$1@dns.comcor.ru...
> Recently I became interested in a question: does data bulk-loaded into a
> table with the LOAD utility consume the same disk space as data loaded
> with the IMPORT utility? The answer turned out to be no!
>
> Here is a nutshell description of the test. The testing was done on
> "DB2/LINUX 8.2.3".
>
> Tables for tests:
>   F4106 has 5203 rows, 32 columns.
>   F42199 has 1399252 rows, 245 columns.
>
> Load command:
>   load client from '/home/share/tabXXXX.ixf' of ixf insert into
>   proddta.fXXXX NONRECOVERABLE
> Import command:
>   import from '/home/share/tabXXXX.ixf' of ixf insert into proddta.fXXXX
>
> Between loads I used the following commands to truncate the table under
> investigation and clear its statistics:
>
>   ALTER TABLE PRODDTA.fXXXX ACTIVATE NOT LOGGED INITIALLY WITH EMPTY
>   TABLE;
>   RUNSTATS on table PRODDTA.fXXXX
>
> After each load I used the same RUNSTATS as above to get the "used pages"
> counter (npages) in syscat.tables.
>
> Here are the results:
>
> syscat.tables, npages:
> ----------------------
> TABLE     IMPORT    LOAD
> -------   -------   -------
> F4106     372       401
> F42199    694862    700326
>
> One can see that the disk space occupied by data loaded with the LOAD
> utility is slightly greater than its counterpart.
>
> If anybody understands this, please explain.
>
> Cheers,
> --
> Konstantin Andreev.
>

The load utility loads the data in blocks (or pages) in the same format they were exported, even if there are pages which are not completely full when the data is exported. This is done for reasons of speed and efficiency.

The import utility processes the data by row and performs an insert for each row, so that it can use all the sequential space in the target table without leaving any unused space on a page.
| |||
Mark A wrote:
>> Here are the results:
>>
>> syscat.tables, npages:
>> ----------------------
>> TABLE     IMPORT    LOAD
>> -------   -------   -------
>> F4106     372       401
>> F42199    694862    700326
> The load utility loads the data in blocks (or pages) in the same format they were exported, even if there are pages which are not completely full when the data is exported.

That cannot be true. One reason and one confirmation:

- The intermediate data formats (DEL, IXF) are intended to be interoperable. This allows moving data between different platforms and on-disk structures. By definition, the data format does not contain page and block information. Thus the LOAD operation must reconstruct the data blocks specifically for the target platform.

- I just checked: the source table F42199, when exported, occupied npages=1399252, fpages=1399430. If you were right, the LOADed table would occupy the same number of pages, but it occupies just half of them. This is because of the VALUE COMPRESSION option on the target table. Thus, the data pages *were* reconstructed by LOAD.

Cheers,
--
Konstantin Andreev.
| |||
"Konstantin Andreev" <plafcow4odno@datatech.ru> wrote in message news:e8ina8$n6e$1@dns.comcor.ru...
>
> That cannot be true. One reason and one confirmation:
>
> - The intermediate data formats (DEL, IXF) are intended to be
> interoperable. This allows moving data between different platforms and
> on-disk structures. By definition, the data format does not contain page
> and block information. Thus the LOAD operation must reconstruct the data
> blocks specifically for the target platform.
>
> - I just checked: the source table F42199, when exported, occupied
> npages=1399252, fpages=1399430. If you were right, the LOADed table would
> occupy the same number of pages, but it occupies just half of them. This
> is because of the VALUE COMPRESSION option on the target table. Thus, the
> data pages *were* reconstructed by LOAD.
>
> Cheers,
> --
> Konstantin Andreev.

Let me amend my response to be more accurate.

The load utility loads data a page at a time. It takes the data from the input file and formats the pages to be loaded. New rows are not placed on existing pages.

The import utility does regular SQL inserts, and therefore may use existing space on pages that already have some rows but are not yet full.
| |||
Mark A wrote:
>> Thus the LOAD operation must reconstruct the data blocks specifically for the target platform.
> Let me amend my response to be more accurate.
>
> The load utility loads data a page at a time. It takes the data from the input file and formats the pages to be loaded. New rows are not placed on existing pages.
>
> The import utility does regular SQL inserts, and therefore may use existing space on pages that already have some rows but are not yet full.

Sounds reasonable. Let me expand the proposed scenario a bit, to check my understanding.

- Sometimes the next data row in the sequence does not fit on the page currently being built, so the LOAD utility writes that page to disk and forgets about it. Meanwhile, one of the rows still to come might be short enough to fit on the page already written, but LOAD has to place it on a new page. I also expect that if all rows have equal lengths, then the page counts used by LOAD and IMPORT will also be equal.

Please correct me if I am floundering.

Thank you,
--
Konstantin Andreev.
| ||||
"Konstantin Andreev" <plafcow4odno@datatech.ru> wrote in message news:e8l6sc$ehf$1@dns.comcor.ru...
> Sounds reasonable. Let me expand the proposed scenario a bit, to check my
> understanding.
>
> - Sometimes the next data row in the sequence does not fit on the page
> currently being built, so the LOAD utility writes that page to disk and
> forgets about it. Meanwhile, one of the rows still to come might be short
> enough to fit on the page already written, but LOAD has to place it on a
> new page. I also expect that if all rows have equal lengths, then the page
> counts used by LOAD and IMPORT will also be equal.
>
> Please correct me if I am floundering.
>
> Thank you,
> --
> Konstantin Andreev.

Data loaded via the load utility is formatted into pages by the load utility and then stored in the table a page at a time. This is done outside of the normal SQL engine. Existing pages are not used for adding the data; only new pages are created. It has nothing to do with whether rows will fit in existing pages; it is done that way for speed. Because the SQL engine is not used by the load utility, insert triggers will not fire for new rows added to the table.

Imports are done by submitting regular inserts through the SQL engine, and therefore the rows may end up being inserted on existing pages where there is space available. Insert triggers are fired, and all data is logged just like normal SQL. Therefore it is possible that import uses less total space than load.
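The difference between the two placement strategies discussed in this thread can be illustrated with a toy model in Python. This is only a sketch under loud assumptions: it is not how DB2 actually searches for free space, `PAGE_BYTES` is a made-up usable page size, every row is assumed to fit on an empty page, and both function names are hypothetical. One function only ever appends to the page being built (the LOAD-like behaviour described above); the other places each row on the first existing page with enough room (a simplified stand-in for what regular inserts may do).

```python
import random

PAGE_BYTES = 4000  # assumed usable bytes per page; not a real DB2 figure


def pages_append_only(row_lengths, page_bytes=PAGE_BYTES):
    """LOAD-like model: fill the current page, never revisit earlier pages."""
    pages = 0
    free = 0  # free bytes on the page currently being built
    for length in row_lengths:
        if length > free:  # row does not fit: start a new page
            pages += 1
            free = page_bytes
        free -= length
    return pages


def pages_reuse_free_space(row_lengths, page_bytes=PAGE_BYTES):
    """IMPORT-like model: put each row on the first page with enough room."""
    free_per_page = []
    for length in row_lengths:
        for i, free in enumerate(free_per_page):
            if length <= free:
                free_per_page[i] = free - length
                break
        else:  # no existing page has room, so allocate a new one
            free_per_page.append(page_bytes - length)
    return len(free_per_page)


rng = random.Random(42)  # fixed seed so runs are repeatable
mixed_rows = [rng.randint(200, 3000) for _ in range(2000)]
equal_rows = [700] * 2000

print("mixed:", pages_append_only(mixed_rows), pages_reuse_free_space(mixed_rows))
print("equal:", pages_append_only(equal_rows), pages_reuse_free_space(equal_rows))
```

In this model the free-space-reusing strategy never needs more pages than the append-only one, and with equal row lengths both produce the same page count, which matches the expectation voiced earlier in the thread. Whether the real utilities behave this way on an initially empty table is exactly the open question of the discussion.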