This is a discussion on DB2 physical layout - comments needed within the DB2 forums, part of the Database Server Software category; --> DB2 500G database, Wintel, heavily loaded OLTP (5M+ transactions a day; all transactions are extremely small, all selects are ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| DB2 500G database, Wintel, heavily loaded OLTP (5M+ transactions a day; all transactions are extremely small, all selects are controlled (no ad-hoc), 99% of all selects are very small (no table scans, index scans are very limited in size) ). Write performance is generally more important than read performance, read performance of heavy queries (the ones with table scans) is not important at all. Question: how to spread data across physical disks to achieve maximum performance? Proposed layout: logs on separate mirror, system on separate mirror, tempspace on separate mirror (?), all other disks are 'data disks' organized as mirrors - see below. Tables are divided into 'small' ones and 'big' ones. All small ones sit in one tablespace that has associated bufferpool equal to tablespace size (that should ensure 99.9999...% hit ratio). Big (historical) ones have separate tablespaces for data and indexes, with separate bufferpools. All tablespaces (both big and small) are spread over all 'data disks' with each tablespace having one container of equal size on each disk. Important: as load is 99.9% random access opposed to sequential access, to reduce contention at disk level it is proposed to have PREFETCHSIZE exactly equal to EXTENTSIZE (!). As a result of this configuration, each read operation is expected to affect exactly one disk without touching neighbours; due to equal spreading of tablespaces across all disks, load on every disk will be statistically balanced. Any comments (especially with explanations what seems to be wrong) will be appreciated. |
| |||
| "Nobody" <ipsign123@yahoo.com> wrote in message news:e5eda561.0402131632.443f7751@posting.google.c om... > DB2 500G database, Wintel, heavily loaded OLTP (5M+ transactions a day; > all transactions are extremely small, all selects are controlled > (no ad-hoc), 99% of all selects are very small (no table > scans, index scans are very limited in size) ). Write performance > is generally more important than read performance, read performance > of heavy queries (the ones with table scans) is not important at all. > > Question: how to spread data across physical disks to achieve > maximum performance? > > Proposed layout: logs on separate mirror, system on separate > mirror, tempspace on separate mirror (?), all other disks > are 'data disks' organized as mirrors - see below. Tables are > divided into 'small' ones and 'big' ones. All small ones sit in > one tablespace that has associated bufferpool equal to tablespace > size (that should ensure 99.9999...% hit ratio). Big (historical) > ones have separate tablespaces for data and indexes, with > separate bufferpools. All tablespaces (both big and small) > are spread over all 'data disks' with each tablespace having > one container of equal size on each disk. Important: as > load is 99.9% random access opposed to sequential access, > to reduce contention at disk level it is proposed to have > PREFETCHSIZE exactly equal to EXTENTSIZE (!). As a result of this > configuration, each read operation is expected to affect exactly > one disk without touching neighbours; due to equal spreading > of tablespaces across all disks, load on every disk will > be statistically balanced. > > Any comments (especially with explanations what seems to be wrong) > will be appreciated. Given your application description, I would not expect much (if any) prefetch activity to be happening. So I am not sure that I would worry about making PREFETCHSIZE exactly equal to EXTENTSIZE. Prefetch is usually limited to tablespace or index space scans. Given the amount of buffer pool space for the "small" tables, you don't need to worry much about data placement on the disk. But I would try to put indexes on different disks than tables. |
| |||
| "Mark A" <ma@switchboard.net> wrote in message news:<GxeXb.491$dq2.98728@news.uswest.net>... Thanks for the answer, Mark. > Given your application description, I would not expect much (if any) > prefetch activity to be happening. So I am not sure that I would worry about > making PREFETCHSIZE exactly equal to EXTENTSIZE. Prefetch is usually limited > to tablespace or index space scans. That was exactly the reasoning behind this decision. > Given the amount of buffer pool space for the "small" tables, you don't need > to worry much about data placement on the disk. But I would try to put > indexes on different disks than tables. That was another design decision we needed to make: either to allocate disks for specific tables/indexes, or to spread all the tables/indexes over all disks. We decided in favor of the latter because it provides much better balancing. Or you are proposing to have 2 sets of disks - one for all indexes (with all indexes spread over all disks of this set), and one for data (with all data spread over all disks of this set)? If yes - why do you this it is better then our model with only one set of disks for both indexes/data? |
| |||
| > > Given the amount of buffer pool space for the "small" tables, you don't need > > to worry much about data placement on the disk. But I would try to put > > indexes on different disks than tables. > That was another design decision we needed to make: either to > allocate disks for specific tables/indexes, or to spread all > the tables/indexes over all disks. We decided in favor of the > latter because it provides much better balancing. Or you are > proposing to have 2 sets of disks - one for all indexes (with > all indexes spread over all disks of this set), and one > for data (with all data spread over all disks of this set)? > If yes - why do you this it is better then our model with > only one set of disks for both indexes/data? You don't need to have all indexes on one set of disks and tables on another set. But if you could ensure that for a given table, the table data was on a separate disk from the indexes for that table, that would be fine. But I would not overkill this in situations were the bufferpool is as large as the data. |
| |||
| ak_tiredofspam@yahoo.com (AK) wrote in message news:<46e627da.0402140915.173680b6@posting.google. com>... > > mirror, tempspace on separate mirror (?), all other disks > I was wondering if there is any need for mirroring temporary data. To prevent downtime in case of single HDD failure. |
| |||
| "Mark A" <ma@switchboard.net> wrote in message news:<5JnXb.2$n53.4608@news.uswest.net>... > > > Given the amount of buffer pool space for the "small" tables, you don't > need > > > to worry much about data placement on the disk. But I would try to put > > > indexes on different disks than tables. > > That was another design decision we needed to make: either to > > allocate disks for specific tables/indexes, or to spread all > > the tables/indexes over all disks. We decided in favor of the > > latter because it provides much better balancing. Or you are > > proposing to have 2 sets of disks - one for all indexes (with > > all indexes spread over all disks of this set), and one > > for data (with all data spread over all disks of this set)? > > If yes - why do you this it is better then our model with > > only one set of disks for both indexes/data? > > You don't need to have all indexes on one set of disks and tables on another > set. But if you could ensure that for a given table, the table data was on a > separate disk from the indexes for that table, that would be fine. But I > would not overkill this in situations were the bufferpool is as large as the > data. You're right - this shouldn't matter for 'small' tables. But what about 'large' tables - do you think it's better to separate indexes and data (given that access in 99.9% random anyway)? If yes, how would you organize disks - 2 separate disks for each 'large' table (one for data, one for indexes), or just two sets of disks (one with data for all the tables, another for indexes for all the tables)? And the most important question - why? Thx in advance |
| |||
| > You're right - this shouldn't matter for 'small' tables. But what about > 'large' tables - do you think it's better to separate indexes and data > (given that access in 99.9% random anyway)? If yes, how would you organize > disks - 2 separate disks for each 'large' table (one for data, one for indexes), > or just two sets of disks (one with data for all the tables, another for > indexes for all the tables)? And the most important question - why? > > Thx in advance I am not sure if understand the options above, but here is what I originally meant to say: For a given table, the tablespace for the table and the tablespace for the indexes should be on different disks (or different arrays). Please not that the above applies to a PARTICULAR table. That does not mean that all tables should be on one disk, and all indexes on another disk. For example, 2 tables A and B, each with 1 index AX and BX: Disk 1 - tablespace for A, tablespace for BX Disk 2 - tablespace for B, tablespace for AX The reason for this is that for an OLTP system, the index data and corresponding table data are accessed almost simultaneously, so it could theoretically help to have them on separate disks for a given table. Whether or the difference is enough to worry about depends on many factors. But for your large tables, I would have multiple containers on different disks for each tablespace . Even if you don't have table space scans in your application, this will help speed up utilities that access the entire table. |
| |||
| First off, and most importantly: Use hardware RAID-1 (based on Ultra-320 LVD SCSI technology) rather than software mirroring, and make sure your RAID hardware has a battery backed write cache (we got 7x DB2 performance improvement merely by upgradinging the RAID card to one with a battery). Also follow the advice in the DB2 documentation under "Optimizing table space performance when data is on RAID devices" (this also addresses your EXTENTSIZE and PREFETCHSIZE questions). Next: ensure you have adequate RAM. Measure bufferpool hit cache rate, and add RAM (whilst increasing buffer pool sizes) until you've reduced total I/O to the minimum. BTW, be careful of increasing buffer pools too high for the available RAM ==> measure system I/O and swapping. Finally: spread the disk load over all disks. Enable disk performance counters if you haven't already done so (enter the diskperf command at a command prompt to see the current state; enable with diskperf -Y). Then measure disk I/O, and move tablespaces around until you you have a nicely balanced system. "Statistically balanced" (as you put it) is a meaningless concept. The only way to know if you have balanced access is to measure it. I have yet to see an a-priori analysis that bore any resemblance to measured reality. I'm still a believer in the old "1 table to a tablespace" rule, and have found that increasing the number of bufferpools normally proportionally decreases total throughput, hence I generally keep the number of bufferpools to a minimum. Given your transaction load, I would probably not enable parallel query. Spreading tablespaces over multiple volumes is important for large tablespaces, but unless you database design consists of less large tables than physical disks, generally can increase overhead rather than reducing it. At 500GB data, and 73GB as a standard SCSI disk size (meaning you have 7 RAID-0 pairs), you probably don't have such a configuration. You claim that access is 99.9% random rather than "sequential". This is highly unusual: normally access is localized to specific "hot spots" and is far from uniformly random. I strongly recommend your measuring your system to verify that access is truly random. If, as I suspect, it isn't, you can tune for the effective access pattern. "Nobody" <ipsign123@yahoo.com> wrote in message news:e5eda561.0402131632.443f7751@posting.google.c om... > DB2 500G database, Wintel, heavily loaded OLTP (5M+ transactions a day; > all transactions are extremely small, all selects are controlled > (no ad-hoc), 99% of all selects are very small (no table > scans, index scans are very limited in size) ). Write performance > is generally more important than read performance, read performance > of heavy queries (the ones with table scans) is not important at all. > > Question: how to spread data across physical disks to achieve > maximum performance? > > Proposed layout: logs on separate mirror, system on separate > mirror, tempspace on separate mirror (?), all other disks > are 'data disks' organized as mirrors - see below. Tables are > divided into 'small' ones and 'big' ones. All small ones sit in > one tablespace that has associated bufferpool equal to tablespace > size (that should ensure 99.9999...% hit ratio). Big (historical) > ones have separate tablespaces for data and indexes, with > separate bufferpools. All tablespaces (both big and small) > are spread over all 'data disks' with each tablespace having > one container of equal size on each disk. Important: as > load is 99.9% random access opposed to sequential access, > to reduce contention at disk level it is proposed to have > PREFETCHSIZE exactly equal to EXTENTSIZE (!). As a result of this > configuration, each read operation is expected to affect exactly > one disk without touching neighbours; due to equal spreading > of tablespaces across all disks, load on every disk will > be statistically balanced. > > Any comments (especially with explanations what seems to be wrong) > will be appreciated. |
| ||||
| > For a given table, the tablespace for the table and the tablespace for the > indexes should be on different disks (or different arrays). Please not that > the above applies to a PARTICULAR table. That does not mean that all tables > should be on one disk, and all indexes on another disk. > > For example, 2 tables A and B, each with 1 index AX and BX: > > Disk 1 - tablespace for A, tablespace for BX > Disk 2 - tablespace for B, tablespace for AX I see. Thanks. > The reason for this is that for an OLTP system, the index data and > corresponding table data are accessed almost simultaneously, so it could > theoretically help to have them on separate disks for a given table. Because otherwise normal request for A will require accessing A and AX, which would mean moving HDD head back and forth, which is not a good thing, right? But on the other hand, IO is supposed to be random, so head re-positioning will likely happen anyway. Plus if we will have the following configuration ('our' model): Disk 1 - A-1 (1st container for A data), AX-1, B-1, BX-1, C-1, CX-1 Disk 2 - A-2, AX-2, B-2, BX-2, C-2, CX-2 Disk 3 - A-3, AX-3, B-3, BX-3, C-3, CX-3 Disk 4 - A-4, AX-4, B-4, BX-4, C-4, CX-4 Disk 5 - A-5, AX-5, B-5, BX-5, C-5, CX-5, then random IO for index and data will likely (80% chance) go to the different disks. And A-B join (and any other join) it is going to have the same 80% chance of any 2 requests going to different disks. I don't want to say that your model is not going to work - it will, but our model seems (to us) to be easier manageable and expandable, plus it is balanced 'by design' (all the disks has statistically equal load, so no overload of one of the disks is going to happen), and we do like it, but it is not usually recommended for high-performance systems, so we are trying to find any potential problems with it. And the question is - do you see any problems with 'our' model? Thanks in advance > Whether > or the difference is enough to worry about depends on many factors. > > But for your large tables, I would have multiple containers on different > disks for each tablespace . Even if you don't have table space scans in your > application, this will help speed up utilities that access the entire table. |