Unix Technical Forum

Multiprocessor questions

  #1 (permalink)  
Old 01-16-2008, 12:31 PM
Joe Smith
 
Posts: n/a
Default Multiprocessor questions

Hi,

I have some questions regarding multiprocessor systems, and especially Sun
ones. Don't be tough, I don't know much about the subject.

I think I know some of the answers, but I'd like to have them confirmed...

In an n-processor system, is there a possibility that, even if one processor
stops working or is damaged, the system continues on?
(I'm betting no)

If the damaged processor is removed, would the n-1 system continue working
after reboot?
(I believe so, perhaps after some jumper modification??)

I guess my question is if there is some microarchitecture similar to a load
balancer, that would ask if the processor is active, and if so, use it...

Any link to the subject is very welcome

Thanks!


  #2 (permalink)  
Old 01-16-2008, 12:31 PM
Paul S. Brown
 
Posts: n/a
Default Re: Multiprocessor questions

Joe Smith wrote:

> Hi,
>
> I have some questions regarding multiprocessor systems, and especially Sun
> ones. Don't be tough, I don't know much about the subject.
>
> I think I know some of the answers, but I'd like to have them confirmed...
>
> In an n-processor system, is there a possibility that, even if one
> processor stops working or is damaged, the system continues on?
> (I'm betting no)
>


Yes, depending on the particular class of hardware; however, any processes
running on that particular CPU will die.

> If the damaged processor is removed, would the n-1 system continue working
> after reboot?
> (I believe so, perhaps after some jumper modification??)


Again, yes, depending on the system design. The answer here is "probably".

>
> I guess my question is if there is some microarchitecture similar to a
> load balancer, that would ask if the processor is active, and if so, use
> it...


Again, it depends on the system class.

P.

  #3 (permalink)  
Old 01-16-2008, 12:31 PM
Rich Teer
 
Posts: n/a
Default Re: Multiprocessor questions

On Thu, 20 Nov 2003, Joe Smith wrote:

> I have some questions regarding multiprocessor systems, and especially Sun
> ones. Don't be tough, I don't know much about the subject.


I'll answer these questions purely from a Sun HW perspective.
YMMV for other brands of HW.

> In an n-processor system, is there a possibility that, even if one processor
> stops working or is damaged, the system continues on?
> (I'm betting no)


Yes, the system will continue, although how well depends on
which HW range you're talking about. Entry-level MP machines
probably will stop, but higher-end machines will keep going.
The processes running on the dead CPU will probably die, though.

> If the damaged processor is removed, would the n-1 system continue working
> after reboot?
> (I believe so, perhaps after some jumper modification??)


Yes, no jumper mods required.

> I guess my question is if there is some microarchitecture similar to a load
> balancer, that would ask if the processor is active, and if so, use it...


That's what the kernel does!
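
As a rough illustration of that idea (a toy model only, not how the Solaris dispatcher is actually implemented), the sketch below hands work only to processors that are marked on-line. The Cpu class, the dispatch function, and the task names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Cpu:
    cpu_id: int
    online: bool = True
    run_queue: list = field(default_factory=list)


def dispatch(cpus, tasks):
    """Assign each task to the on-line CPU with the shortest run queue."""
    usable = [c for c in cpus if c.online]
    if not usable:
        raise RuntimeError("no on-line CPU left to run anything")
    for task in tasks:
        # Off-line CPUs are never candidates, so they never receive work.
        target = min(usable, key=lambda c: len(c.run_queue))
        target.run_queue.append(task)
    return {c.cpu_id: list(c.run_queue) for c in cpus}


# Hypothetical 4-CPU machine where CPU 2 has been marked bad.
cpus = [Cpu(0), Cpu(1), Cpu(2, online=False), Cpu(3)]
placement = dispatch(cpus, ["a", "b", "c", "d", "e", "f"])
print(placement)
```

In this model the n-1 machine keeps running simply because the bad processor is filtered out before any work is assigned; no "jumper" equivalent is needed.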

--
Rich Teer, SCNA, SCSA

President,
Rite Online Inc.

Voice: +1 (250) 979-1638
URL: http://www.rite-online.net
  #4 (permalink)  
Old 01-16-2008, 12:31 PM
Axel Neumann
 
Posts: n/a
Default Re: Multiprocessor questions


"Rich Teer" <rich.teer@rite-group.com> wrote:
> On Thu, 20 Nov 2003, Joe Smith wrote:
>
> > I have some questions regarding multiprocessor systems, and especially
> > Sun ones. Don't be tough, I don't know much about the subject.
>
> I'll answer these questions purely from a Sun HW perspective.
> YMMV for other brands of HW.
>
> > In an n-processor system, is there a possibility that, even if one
> > processor stops working or is damaged, the system continues on?
> > (I'm betting no)
>
> Yes, the system will continue, although how well depends on
> which HW range you're talking about. Entry-level MP machines
> probably will stop, but higher-end machines will keep going.
> The processes running on the dead CPU will probably die, though.
>

[.. snip ..]

Hi,

Any Sun system will crash if a CPU dies; that is due to the UNIX operating
system. After the crash it will reboot and continue working if at least one
working CPU is available.

Of course, in a multi-domain system like the F15K, this means that only the
affected domain will crash.

HTH,

Axel Neumann


  #5 (permalink)  
Old 01-16-2008, 12:31 PM
Glenn
 
Posts: n/a
Default Re: Multiprocessor questions

On 20-Nov-03 15:28:27, Joe Smith <nospam@nospam.com> wrote:

>I have some questions regarding multiprocessor systems, and especially Sun
>ones. Don't be tough, I don't know much about the subject.


>I think I know some of the answers, but I'd like to have them confirmed...


>In an n-processor system, is there a possibility that, even if one processor
>stops working or is damaged, the system continues on?
>(I'm betting no)


I guess it depends on whether any operating-system-critical tasks are running
on that particular CPU, at least with Sun HW and Solaris. However, I have
never had a CPU fail except during boot.

More expensive systems (mainly the Enterprise systems) even let the
technician replace CPU boards without rebooting the machine (the same
goes for RAM, disks, and PSUs, of course).

>If the damaged processor is removed, would the n-1 system continue working
>after reboot?
>(I believe so, perhaps after some jumper modification??)


Yes, usually you don't even have to remove the CPU board; the machine
will detect it as bad and then just not use it.

>I guess my question is if there is some microarchitecture similar to a load
>balancer, that would ask if the processor is active, and if so, use it...


I don't know how it works "inside", but yes, it probably works in some similar
way...

I just had to fire up my SS10 for fun (it's normally shut down, since it heats
up the room a lot when it's running).

[glenn@hydra glenn]$ psrinfo
0 on-line since 11/22/03 00:05:31
1 on-line since 11/22/03 00:05:36
2 on-line since 11/22/03 00:05:36
3 on-line since 11/22/03 00:05:36
[glenn@hydra glenn]$ uname -a
SunOS hydra 5.8 Generic_108528-18 sun4m sparc SUNW,SPARCstation-10
[glenn@hydra glenn]$

(Yes, I know that it's badly patched, but as I said, it has been turned off
all summer since it gets so warm. I'm actually patching it right now.)
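
For anyone who wants to check processor state from a script, here is a minimal sketch that counts on-line processors from psrinfo output like the transcript above. It assumes every line has the shape "<id> <state> since <date> <time>" as shown there; the function name parse_psrinfo is made up for this example, and other states or output variants are not handled specially.

```python
def parse_psrinfo(text):
    """Map processor id -> state from plain `psrinfo` output lines."""
    states = {}
    for line in text.splitlines():
        fields = line.split()
        # Assumed shape: "<id> <state> since <date> <time>"
        if len(fields) >= 2 and fields[0].isdigit():
            states[int(fields[0])] = fields[1]
    return states


# Sample copied from the transcript in this post.
sample = """\
0 on-line since 11/22/03 00:05:31
1 on-line since 11/22/03 00:05:36
2 on-line since 11/22/03 00:05:36
3 on-line since 11/22/03 00:05:36
"""

states = parse_psrinfo(sample)
online = [cpu for cpu, state in states.items() if state == "on-line"]
print(f"{len(online)} of {len(states)} processors on-line")
```

A processor that the system has stopped using would show up here with a state other than "on-line", so comparing the two counts gives a quick health check.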

  #6 (permalink)  
Old 01-16-2008, 12:39 PM
Paul Eggert
 
Posts: n/a
Default fault-tolerant SPARC

"Axel Neumann" <Axel.Neumann@epost.de> writes:

> Any Sun system will crash if a CPU is dying, that is due the UNIX
> operating system.


Some CPU faults are caught on some Sun models, and can be fixed
without crashing the overall system.

However, if you really want a system with no single point of failure,
then you need a fault-tolerant system; e.g., pairs of CPUs operating
in lockstep such that if any single CPU fails, the error will be
caught right away. Sun doesn't make boxes like that as far as I know;
they bought a fault-tolerant SPARC company in the late 1990s, produced
a Netra ft 1800 box, and they still have a web page on the topic
<http://www.sun.com/servers/ft-sparc/> but I haven't heard much about
fault-tolerant SPARC from Sun lately.
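
To make the lockstep idea concrete, here is a toy software sketch, assuming two replicas that are fed the same inputs and compared after every step. Real lockstep hardware does this comparison in the processor and bus logic, not in software; the names run_in_lockstep, healthy, and faulty are invented for this illustration.

```python
class LockstepMismatch(Exception):
    """Raised when the two replicas disagree on a result."""


def run_in_lockstep(replica_a, replica_b, inputs):
    """Feed every input to both replicas and compare the results step by step."""
    results = []
    for step, value in enumerate(inputs):
        out_a = replica_a(value)
        out_b = replica_b(value)
        if out_a != out_b:
            # A pair detects the disagreement but cannot tell which side is wrong.
            raise LockstepMismatch(f"step {step}: {out_a!r} != {out_b!r}")
        results.append(out_a)
    return results


def healthy(x):
    return x * x


def faulty(x):
    # Hypothetical fault: wrong answer for one specific input.
    return x * x + 1 if x == 3 else x * x


print(run_in_lockstep(healthy, healthy, [1, 2, 3]))
try:
    run_in_lockstep(healthy, faulty, [1, 2, 3])
except LockstepMismatch as err:
    print("fault detected:", err)
```

The point of the comparison is that a bad result is caught at the step where it occurs, before it can propagate into the rest of the system.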

Resilience Corp. also put out a fault-tolerant SPARC Solaris box in
the 1990s, but they've switched to GNU/Linux in their current
products. (Their current boxes are not fault-tolerant -- merely
"integrated high availability", which is good enough for most users.)

There's also the LEON-FT SPARC V8 core used by the European Space
Agency for use in long space missions: it's fault-tolerant, but it's
not a Sun design. See <http://www.gaisler.com/>.

Fault-tolerant boxes tend to be fairly expensive. The cheapest
commercial box that I know of is the NEC Express 5800/ft, a Xeon-based
box that runs GNU/Linux and Windows Server 2003 (but not Solaris x86,
unfortunately).