Unix Technical Forum

RE: Long checkpoints on AIX/IDS7.31/IBM Shark

#1
04-19-2008, 09:14 PM
Willem Roos
RE: Long checkpoints on AIX/IDS7.31/IBM Shark



Thank you TBP, Dave & Alex for your suggestions,

I have duly increased LRUS to 128, and also raised CLEANERS and NUMAIOVPS
after looking at onstat -g iov and checking the io/wup column.

We've found the culprit, which affects all hosts connected to the same
Shark (and there are quite a few): it was a SQLServer box doing a large
delete. Why the Shark misbehaves so badly because of one particular
operation is beyond me (it's not volume related: the SAN is quiet and
the reports on the Shark don't show any high throughput). It is
apparently beyond local support as well, since we've struggled with this
since November last year and have blamed everything from the usual
firmware thing to backups and what not (even found some new entries for
my BOFH DIY excuse board).

Sooo, you do something quite innocent on one box and you bring all
critical systems in your entire enterprise to a grinding halt, and
worse, you're not even able to tell. Now isn't that what mass storage is
all about :-...

Kind regards,

-------------------------------------------
Willem Roos - (+27) 21 980 4941
Per sercas vi malkovri

Disclaimer
http://www.shoprite.co.za/disclaimer.html

sending to informix-list
#2
04-19-2008, 09:16 PM
Richard Kofler
Re: Long checkpoints on AIX/IDS7.31/IBM Shark

Willem Roos wrote:

[ ... SNIP ... ]

>
> We've found the culprit which affects all hosts connected to the same
> Shark (and there are quite a few) - it was a SQLServer box doing a large
> delete. How come the Shark misbehaves so badly due to one particular
> operation (ie. SQLServer delete, it's not volume related, the SAN is
> quiet and the reports on the Shark don't show any high throughput) is
> beyond me - and apparently beyond local support as well since we've


[ ... SNIP ... ]

> Sooo, you do something quite innocent on one box and you bring all
> critical systems in your entire enterprise to a grinding halt, and
> worse, you're not even able to tell. Now isn't that what mass storage is
> all about :-...


* caution: sark intermixed, as I still do not have the guts
to construct my monthly NO SAN with RAID5 posting *

I'd say this is what RAID5 in today's SANs is all about.

To see this type of behavior it is not necessary at all
to use virtual or consolidated storage.....

The most expensive operation on a RAID5 is many small
writes, each following a read, which is what a delete tends to do.
You'll actually see only a few MB/second and
*think* this is not much.

BUT: if the unit size or page size of the database
software doing the deletes is 4 KB or even 8 KB
and every page holds like 50 records to be deleted, the
SAN must fetch 4 KB worth of info from the disk (or must have it
already in the SAN buffer) to do each delete. The page must
be written back to buffer or disk 50 times, once per deleted record.
(In *all* the cases I know of, and this is at least 1 dozen or 3,
the SAN buffer layout is way too small for all
the servers consolidated into 1 box and for all the disks
you need.)
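
To put rough numbers on the above, here is a back-of-the-envelope
sketch. It assumes the classic RAID5 read-modify-write cost of 4 disk
I/Os per small write (read old data, read old parity, write new data,
write new parity) and a hypothetical table of 10,000 pages. None of
these figures were measured on the Shark, and an array with enough
write cache may coalesce some of the writes.

```python
# Back-of-the-envelope model of the RAID5 small-write penalty.
# All numbers are illustrative assumptions, not measurements from the Shark.

PAGE_KB = 4                     # database page size mentioned in the post
RECORDS_PER_PAGE = 50           # records to delete per page, as in the post
RAID5_IOS_PER_SMALL_WRITE = 4   # read data, read parity, write data, write parity


def backend_ios(pages: int, rewrites_per_page: int) -> int:
    """Disk I/Os needed when every page rewrite is a RAID5 read-modify-write."""
    return pages * rewrites_per_page * RAID5_IOS_PER_SMALL_WRITE


def host_mb_written(pages: int, rewrites_per_page: int) -> float:
    """Megabytes the host believes it wrote."""
    return pages * rewrites_per_page * PAGE_KB / 1024


pages = 10_000  # hypothetical table size in pages

worst = backend_ios(pages, RECORDS_PER_PAGE)  # page goes to disk after every delete
best = backend_ios(pages, 1)                  # all 50 deletes absorbed in cache first

print(f"worst case: {worst:,} disk I/Os "
      f"for {host_mb_written(pages, RECORDS_PER_PAGE):,.0f} MB seen by the host")
print(f"best case:  {best:,} disk I/Os "
      f"for {host_mb_written(pages, 1):,.0f} MB seen by the host")
```

Under these assumptions the host moves under 2 GB in the worst case,
which looks harmless in a throughput report, while the disks behind it
do two million I/Os.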

Let us assume now that the records to be deleted are nicely spread
over tons of 4 KB pages with respect to the sequence of deletion.

The only way to see from outside what happens in the up- and
downstage arena inside the SAN is to have monitoring tools
there or to be able to delete in the very same sequence as the
records are packed into pages.

Otherwise you will not notice much traffic, neither in the OS tools
nor on the fabric switches, because the deletes are slow
but cause a lot of action inside the SAN box.

Without deletes in page order, exactly what you describe will happen
on every RAID5, be it standalone or in a SAN.
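
Why the order of deletion matters can be shown with a toy model. This
is only a sketch: it treats the array buffer as a small LRU cache of
dirty pages (CACHE_PAGES is a made-up size, and real controllers use
their own destaging policies) and counts how often a page has to be
written out.

```python
from collections import OrderedDict


def destages(page_sequence, cache_pages):
    """Count page write-outs from a small LRU cache of dirty pages."""
    cache = OrderedDict()
    count = 0
    for page in page_sequence:
        if page in cache:
            cache.move_to_end(page)      # page still cached: delete is absorbed
            continue
        if len(cache) >= cache_pages:
            cache.popitem(last=False)    # evict least recently used dirty page
            count += 1
        cache[page] = True
    return count + len(cache)            # remaining dirty pages are flushed at the end


PAGES = 1_000            # hypothetical number of pages touched by the delete
RECORDS_PER_PAGE = 50    # as in the post
CACHE_PAGES = 100        # hypothetical share of the array buffer

# Deletes arrive in the same order as the records are packed into pages.
in_order = [p for p in range(PAGES) for _ in range(RECORDS_PER_PAGE)]
# Deletes arrive spread out: one record from every page, then the next round.
scattered = [p for _ in range(RECORDS_PER_PAGE) for p in range(PAGES)]

print("in page order:", destages(in_order, CACHE_PAGES))
print("scattered:    ", destages(scattered, CACHE_PAGES))
```

In this model the in-order delete writes each page out once, while the
scattered delete writes a page out for every single record, because the
buffer is too small to hold a page until its next delete arrives.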

Search in this newsgroup for 'NO RAID5' postings/rants from
Art Kagel and you will read about more nasty lil details
that come along with RAID5.

dic_k
--
Richard Kofler
SOLID STATE EDV
Dienstleistungen GmbH
Vienna/Austria/Europe

All times are GMT.

