Unix Technical Forum

To SAN 4.3 or not, that's the question...

#1
01-12-2008, 05:59 AM
Peter Eriksson


Do I want to install Sun's StorEdge SAN 4.3 software on an
Ultra 2/2300 running Solaris 9, connected to a pair of
Sun A5000 boxes and a DEC/HP/Compaq HSG80 RAID controller
via three separate JNI64-1063 HBA controllers?

What will I gain by installing that package? Will things
be more stable?

Btw, there definitely seems to be an OS-crashing bug in Solaris 9
when some FC disks in the A5000 generate errors or fail - typically
the machine crashes after a series of warnings from the
disks - and this is not machine-related, since we got exactly
the same behaviour when we tried to use Sun's SOCAL adapters in
a Sun Enterprise 3000 server...


BAD TRAP: type=31 rp=2a10004b060 addr=300aff3db48 mmu_fsr=0
sched:
trap type = 0x31
addr=0x300aff3db48
pid=0, pc=0x116e790, sp=0x2a10004a901, tstate=0x4480001600, context=0x0
g1-g7: 14ba800, 0, bb, 10, 30000305880, 10, 2a10004bd40
000002a10004ad90 unix:die+a4 (31, 2a10004b060, 300aff3db48, 0, 16, 0)
%l0-3: 0000000000000000 00000000aaab4fa4 000002a10004b060 000002a10004af58
%l4-7: 0000000000000031 0000030004f64ec8 0000000001171dd0 0000000001171d10
000002a10004ae70 unix:trap+874 (2a10004b060, 0, 10000, 10200, 300, 0)
%l0-3: 0000000000000001 0000000000000000 0000000001437b28 0000000000000031
%l4-7: 0000000000000006 0000000000000001 0000000000000000 0000000000000000
000002a10004afb0 unix:ktl0+48 (3, 2a10004bd40, 1490c00, 0, aabb50ec, 0)
%l0-3: 0000000000000001 0000000000001400 0000004480001600 000000000102bf54
%l4-7: 000003000000e930 000003000000e958 0000000000000005 000002a10004b060
000002a10004b100 SUNW,UltraSPARC-II:bcopy+1564 (aaab4fa4, 30005488ba4, 2, 10, 100c4ec, 0)
%l0-3: 0000000000000000 0000030001b8b728 0000030001b8b72c 0000030001b8b188
%l4-7: 0000030001ad3e70 0000030001b8b72c 0000000000000000 0000030001abc000
000002a10004b300 fcaw:fca_cmd_complete+428 (30005488060, 12, 30005488000, 30002235e18, 3000549d108, 30001abc000)
%l0-3: 0000030005488b98 0000000000000014 0000000000000012 0000000000000000
%l4-7: 0000000000000004 0000030005488060 0000000000010000 0000030004f18088
000002a10004b450 fcaw:fca_highintr+753c (3000027d238, b, 0, 1, c0000000, 30001abc000)
%l0-3: 0000030005488060 0000000000000028 0000000000017e3c 0000000000000028
%l4-7: 0000030001abc000 0000000000057f08 0000000000000000 0000000000000000
000002a10004ba90 sbus:sbus_intr_wrapper+28 (3000006a5e8, 7db, 1400000, 2a10004bd40, fb60, 11f5e28)
%l0-3: 0000000001219df8 00000300003d4618 0000000000000000 000000000142d7c0
%l4-7: 000003000005bbf8 0000000000000005 0000000000000000 0000030000210106
--
--
Peter Eriksson <peter@ifm.liu.se> Phone: +46 13 28 2786
Computer Systems Manager/BOFH Cell/GSM: +46 705 18 2786
Physics Department, Linköping University Room: Building F, F203
#2
01-12-2008, 06:00 AM
Scott Howard

Peter Eriksson <peter@ifm.liu.se> wrote:
> Btw, there definitely seems to be an OS-crashing bug in Solaris 9
> when some FC disks in the A5000 generate errors or fail - typically
> the machine crashes after a series of warnings from the


> BAD TRAP: type=31 rp=2a10004b060 addr=300aff3db48 mmu_fsr=0
> 000002a10004b300 fcaw:fca_cmd_complete+428 (30005488060, 12, 30005488000, 30002235e18, 3000549d108, 30001abc000)


FCAW is the third-party driver for the JNI cards. This isn't a Solaris
bug, it's a JNI bug. If you're running an old version of the JNI driver,
I'd suggest upgrading - older versions of this driver are very well known
for bringing machines down like this...

Scott
#3
01-12-2008, 06:00 AM
Peter Eriksson

Scott Howard <scott@hunterlink.net.au> writes:

>Peter Eriksson <peter@ifm.liu.se> wrote:
>> Btw, there definitely seems to be an OS-crashing bug in Solaris 9
>> when some FC disks in the A5000 generate errors or fail - typically
>> the machine crashes after a series of warnings from the


>> BAD TRAP: type=31 rp=2a10004b060 addr=300aff3db48 mmu_fsr=0
>> 000002a10004b300 fcaw:fca_cmd_complete+428 (30005488060, 12, 30005488000, 30002235e18, 3000549d108, 30001abc000)


>FCAW is the third-party driver for the JNI cards. This isn't a Solaris
>bug, it's a JNI bug. If you're running an old version of the JNI driver,
>I'd suggest upgrading - older versions of this driver are very well known
>for bringing machines down like this...


I *seriously* doubt that it is a JNI driver (we run the latest
driver version btw) problem, since we used to have *exactly* the
same problem with Sun's SOCAL cards on a different server when a
drive goes bad.

We've tried two different servers (Ultra 2 and Enterprise 3000).
Tried three different SOCAL cards. Tried different FC cables.

Replacing the bad drives with ones that work is one solution, but
that only removes the symptom; it doesn't fix the bug in Solaris (it just
delays the crash until the next drive goes bad).

Mind you - it doesn't crash right away. It typically takes a number
of error messages printed and a reset or two of the FC bus before
Solaris 9 takes a dive.

(Things are slightly more stable with the JNI cards, though, so currently
the SOCAL boards are resting in an antistatic bag.)

- Peter
--
--
Peter Eriksson <peter@ifm.liu.se> Phone: +46 13 28 2786
Computer Systems Manager/BOFH Cell/GSM: +46 705 18 2786
Physics Department, Linköping University Room: Building F, F203
#4
01-12-2008, 06:01 AM
Scott Howard

Peter Eriksson <peter@ifm.liu.se> wrote:
> I *seriously* doubt that it is a JNI driver (we run the latest


Having seen too many similar crashes where it was the JNI driver, I'd
seriously doubt it's not, but...

If you have a support contract, raise a call and give them the crash
dump. This will allow Sun to tell with certainty where the problem is.

> driver version btw) problem, since we used to have *exactly* the
> same problem with Sun's SOCAL cards on a different server when a
> drive goes bad.


It's not physically possible to have *exactly* the same problem with a
SOCAL card. The panic you've shown is occurring in the FCAW driver. If
you don't have this driver installed (which you wouldn't for a SOCAL
card) then the machine can't panic in this driver.
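
To make the reading of the stack concrete, here is a minimal Python sketch that pulls the module:function frames out of a panic trace like the one posted and reports where the trap was taken and who called it. The helper names (parse_frames, faulting_context) are made up for illustration, and the set of frames treated as trap handling (die, trap, ktl0) is an assumption based only on this particular trace. It is no substitute for having the crash dump analysed properly.

```python
import re

# Matches stack-frame lines such as
# "000002a10004b300 fcaw:fca_cmd_complete+428 (30005488060, 12, ...)".
# Register dump lines ("%l0-3: ...") do not match and are skipped.
FRAME_RE = re.compile(
    r"^[0-9a-f]{16}\s+(?P<module>[^:\s]+):(?P<func>\w+)\+[0-9a-f]+\s+\("
)

# Assumption: these unix frames are the kernel's own trap handling,
# judging by the trace in this thread; they are not the faulting code.
TRAP_HANDLING = {"die", "trap", "ktl0"}


def parse_frames(panic_text):
    """Return a list of (module, function) tuples, innermost frame first."""
    frames = []
    for line in panic_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((match.group("module"), match.group("func")))
    return frames


def faulting_context(frames):
    """Return (faulting frame, its caller) after skipping trap-handling frames."""
    rest = [f for f in frames if not (f[0] == "unix" and f[1] in TRAP_HANDLING)]
    fault = rest[0] if rest else None
    caller = rest[1] if len(rest) > 1 else None
    return fault, caller


# Frame lines copied from the panic posted at the top of the thread.
SAMPLE = """\
000002a10004ad90 unix:die+a4 (31, 2a10004b060, 300aff3db48, 0, 16, 0)
000002a10004ae70 unix:trap+874 (2a10004b060, 0, 10000, 10200, 300, 0)
000002a10004afb0 unix:ktl0+48 (3, 2a10004bd40, 1490c00, 0, aabb50ec, 0)
000002a10004b100 SUNW,UltraSPARC-II:bcopy+1564 (aaab4fa4, 30005488ba4, 2, 10, 100c4ec, 0)
000002a10004b300 fcaw:fca_cmd_complete+428 (30005488060, 12, 30005488000, 30002235e18, 3000549d108, 30001abc000)
000002a10004b450 fcaw:fca_highintr+753c (3000027d238, b, 0, 1, c0000000, 30001abc000)
000002a10004ba90 sbus:sbus_intr_wrapper+28 (3000006a5e8, 7db, 1400000, 2a10004bd40, fb60, 11f5e28)
"""

if __name__ == "__main__":
    fault, caller = faulting_context(parse_frames(SAMPLE))
    print("trap taken in:", fault)
    print("called from:  ", caller)
```

On the posted trace this picks out bcopy as the frame where the trap was taken and fcaw:fca_cmd_complete as its caller, which is why the panic is attributed to the fcaw module.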

Scott.