Unix Technical Forum

Problem with raid boot



#11
01-18-2008, 09:58 AM
Enrique Perez-Terron
Re: Problem with raid boot

On Mon, 31 Oct 2005 21:52:02 +0100, Connor T <madman_dan@hotmail.com> wrote:

> Ok, many apologies for taking so long to come back to this, but I've
> just tried this, got lilo installed, and it seems to update on both
> hdd's when I run lilo -v.
>
> So, I have /dev/md0 happily running. I turned the pc off, disconnected
> the primary hdd, and powered back on. The system came up, but LILO
> only got to the LI stage, which is apparently something to do with
> drive geometry differences? Admittedly the drives are _not_ identical.


What matters should be that a) md is not running as such before the kernel
has been loaded, so booting off mirrored md volumes relies on each member
being usable separately; and b) whatever is in the boot sector (master boot
record, MBR, sector zero) actually makes the BIOS fetch the right sectors
from the disk.

Lilo places a list of sector numbers and sector number ranges in the
boot sector, with a routine to step through these, sending each sector
number to the BIOS. If the sector numbers are the same on both disks
you should be fine.

Well, that is the theory; I don't really know. What happens if the
BIOS does not support sector number addressing, but wants
cylinder/head/sector addressing, and the two drives have different
geometries?

I tend to think that all modern BIOSes support the sector addressing
versions of the INT 0x13 calls, but if LBA has not been turned on
in the BIOS setup for the drive, the BIOS will translate back to c/h/s
addressing in accordance with the drive's geometry. This should do no
harm. But what if Lilo uses the older c/h/s addressing versions of the
INT 0x13 calls? Then the addresses computed for one disk cannot be used
with a different disk that has a different geometry.
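To make the geometry concern concrete, here is a small sketch using the standard c/h/s arithmetic. The two geometries (16 and 255 heads, 63 sectors per track) and the sector number are made-up values for illustration, not taken from Connor's drives, and I am not claiming this is how Lilo computes its addresses internally.

```python
def lba_to_chs(lba, heads, sectors_per_track):
    """Convert a linear sector number to a (cylinder, head, sector) triple."""
    cylinder, rest = divmod(lba, heads * sectors_per_track)
    head, sector0 = divmod(rest, sectors_per_track)
    return cylinder, head, sector0 + 1  # c/h/s sector numbers start at 1


def chs_to_lba(cylinder, head, sector, heads, sectors_per_track):
    """Convert a (cylinder, head, sector) triple back to a linear sector number."""
    return (cylinder * heads + head) * sectors_per_track + (sector - 1)


# Hypothetical geometries (heads, sectors per track) for two unlike disks.
GEOM_A = (16, 63)
GEOM_B = (255, 63)

lba = 100_000                       # some sector of the kernel image
chs_a = lba_to_chs(lba, *GEOM_A)    # address as recorded for disk A

# The same triple, interpreted with disk B's geometry, names another sector.
lba_on_b = chs_to_lba(*chs_a, *GEOM_B)

print("c/h/s on disk A:", chs_a)
print("sector reached on disk A:", chs_to_lba(*chs_a, *GEOM_A))
print("sector reached on disk B:", lba_on_b)
```

A linear sector number survives the move to the other disk; a c/h/s triple does not, unless both disks report the same geometry.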

Another requirement is that the two boot partitions must be placed
identically on each drive. It should not matter if one drive is
larger and has more partitions, as long as the relevant partitions
both start at sector number X relative to the start of the disk.
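One way to check the placement is to compare the start sectors the kernel reports. The sketch below assumes a kernel that exposes sysfs with a `start` file under `/sys/block/<disk>/<partition>/`; the function names and the device names in the usage comment are only examples.

```python
from pathlib import Path


def partition_start(disk, part, sysfs_root="/sys/block"):
    """Return the start sector of a partition as reported by sysfs."""
    start_file = Path(sysfs_root) / disk / part / "start"
    return int(start_file.read_text().strip())


def same_start(members, sysfs_root="/sys/block"):
    """True if every (disk, partition) pair in members starts at the same sector."""
    starts = {partition_start(disk, part, sysfs_root) for disk, part in members}
    return len(starts) == 1


# Example call (device names are assumptions, adjust to your md members):
#   same_start([("hda", "hda1"), ("hdc", "hdc1")])
```

If this returns False for the members of the boot mirror, a sector list written for one disk points at the wrong place on the other.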

Another point: Md members are usually partitions. The first track
of each disk is usually not part of any partition. You can have mirrored
copies of the kernel in each member, but when running lilo, only
the current primary master ide disk (or whatever disk you request)
is written to. You will have to repeat the lilo installation for
each disk.

When booting, the BIOS will use the first disk it finds. (Well, that
obviously depends on the BIOS. If your BIOS allows you to specify
the disks individually in the boot order, you have more flexibility.)
But when Lilo calls the BIOS to read in the kernel, it specifies
the drive number in register DL. Wherever Lilo takes the
number from, it could be the wrong one. If the BIOS renumbers the
drives when one is missing or failing, then you may need to have
Lilo use the same drive number in both cases.
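The renumbering problem can be modelled in a few lines. This is only a toy model: it assumes a BIOS that numbers the hard disks it detects from 0x80 upward in detection order, and the disk names and helper functions are hypothetical.

```python
FIRST_HARD_DISK = 0x80  # BIOS drive numbers for hard disks start at 0x80


def bios_numbers(present_disks):
    """Number the detected disks in order (assumes the BIOS renumbers on a gap)."""
    return {disk: FIRST_HARD_DISK + i for i, disk in enumerate(present_disks)}


def can_boot(recorded_number, boot_disk, present_disks):
    """True if the drive number stored by the loader still refers to boot_disk."""
    return bios_numbers(present_disks).get(boot_disk) == recorded_number


# With both disks attached, the second disk is 0x81.
print(can_boot(0x81, "hdc", ["hda", "hdc"]))
# With the first disk unplugged, the survivor becomes 0x80, so a loader
# that recorded 0x81 asks the BIOS for a drive that is no longer there.
print(can_boot(0x81, "hdc", ["hdc"]))
# A loader that recorded 0x80 on the second disk boots when it is alone.
print(can_boot(0x80, "hdc", ["hdc"]))
```

If your BIOS behaves like this model, the boot record on the second disk has to refer to 0x80. As far as I recall, the `disk=` section with a `bios=` value in lilo.conf is the place to tell Lilo that, but check the man page for your version.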

This should be enough to give you a couple of points to check.


> Michael Heiming wrote:
>> In comp.os.linux.setup Michael Heiming <michael+USENET@www.heiming.de>:
>> > In comp.os.linux.setup Peter T. Breuer <ptb@oboe.it.uc3m.es>:
>> >> Connor T <madman_dan@hotmail.com> wrote:

>> [..]
>>
>> >>> ( Centos is a RHEL clone )

>>
>> [..]
>>
>> >> In the obvious way! I believe lilo has an option to do that
>> >> automatically, or does it automatically.

>>
>> > Yep, just point the loader to the md device containing "/" and
>> > rerun 'lilo -v', you should see it writing on both disks and
>> > both will be perfectly bootable, at least with RHEL.


Compare this with the issues I mentioned above: a few things are
still not clear enough here. Ideally it should be possible to build
enough intelligence into lilo that it always does exactly the right
thing, but Connor's experience suggests that this is not quite the case.

>> In addition, look at this example of a working softraid 1
>> configuration:
>>
>> Presuming "root=/dev/md5", put in lilo.conf:
>>
>> boot=/dev/md5
>>
>> Now run 'lilo -v':
>>
>> # lilo -v
>> LILO version 21.4-4, Copyright (C) 1992-1998 Werner Almesberger
>> 'lba32' extensions Copyright (C) 1999,2000 John Coffman
>>
>> boot = /dev/sdb, map = /boot/map.0811
>> Reading boot sector from /dev/sdb
>> Merging with /boot/boot.b
>> [..]
>> /boot/boot.0810 exists - no backup copy made.
>> Writing boot sector.
>> boot = /dev/sda, map = /boot/map.0801
>> Reading boot sector from /dev/sda
>> Merging with /boot/boot.b
>> [..]
>>
>> As you can see, lilo happily writes the boot sector to both disks
>> and the system can boot from both. This example uses SCSI disks,
>> but that doesn't matter, it works just as well with IDE. Just be aware
>> that not all distros allow this out of the box.


Very nice, but we don't know yet what the preconditions are.

-Enrique
#12
01-18-2008, 09:58 AM
Michael Heiming
Re: Problem with raid boot

In comp.os.linux.setup Connor T <madman_dan@hotmail.com>:
> Michael Heiming wrote:
>> In comp.os.linux.setup Michael Heiming <michael+USENET@www.heiming.de>:
>> > In comp.os.linux.setup Peter T. Breuer <ptb@oboe.it.uc3m.es>:
>> >> Connor T <madman_dan@hotmail.com> wrote:

>> [..]
>>
>> >>> ( Centos is a RHEL clone )

>>
>> [..]
>>
>> >> In the obvious way! I believe lilo has an option to do that
>> >> automatically, or does it automatically.

>>
>> > Yep, just point the loader to the md device containing "/" and
>> > rerun 'lilo -v', you should see it writing on both disks and
>> > both will be perfectly bootable, at least with RHEL.

>>
>> In addition, look at this example of a working softraid 1
>> configuration:
>>
>> Presuming "root=/dev/md5", put in lilo.conf:
>>
>> boot=/dev/md5
>>
>> Now run 'lilo -v':
>>
>> # lilo -v
>> LILO version 21.4-4, Copyright (C) 1992-1998 Werner Almesberger
>> 'lba32' extensions Copyright (C) 1999,2000 John Coffman
>>
>> boot = /dev/sdb, map = /boot/map.0811
>> Reading boot sector from /dev/sdb
>> Merging with /boot/boot.b
>> [..]
>> /boot/boot.0810 exists - no backup copy made.
>> Writing boot sector.
>> boot = /dev/sda, map = /boot/map.0801
>> Reading boot sector from /dev/sda
>> Merging with /boot/boot.b
>> [..]
>>
>> As you can see, lilo happily writes the boot sector to both disks
>> and the system can boot from both. This example uses SCSI disks,
>> but that doesn't matter, it works just as well with IDE. Just be aware
>> that not all distros allow this out of the box.


> Ok, many apologies for taking so long to come back to this, but I've
> just tried this, got lilo installed, and it seems to update on both
> hdd's when I run lilo -v.


> So, I have /dev/md0 happily running. I turned the pc off, disconnected
> the primary hdd, and powered back on. The system came up, but LILO
> only got to the LI stage, which is apparently something to do with
> drive geometry differences? Admittedly the drives are _not_ identical.


Which is likely the problem. The disks in my example are exactly
identical, same type and same make. Sorry if this wasn't obvious.
With identical disks softraid is easy to handle, since you can copy
the partitioning from one disk to another when replacing a broken
disk, or even clone a system using one mirror disk.

I wouldn't even think about using different disks, so the outcome
is unclear to me, but I would expect problems like the one you are
facing. I use raid mirroring on a larger bunch of systems, mostly
with hardware raid controllers of different makes and some with
softraid, without any problems. Mirrored disks are always the same
type/make. This is a precaution which I perhaps (the thread has
already vanished from my spool) didn't point out clearly enough,
probably because it seemed obvious that anything else doesn't make
much sense.

--
Michael Heiming (X-PGP-Sig > GPG-Key ID: EDD27B94)
mail: echo zvpunry@urvzvat.qr | perl -pe 'y/a-z/n-za-m/'
#bofh excuse 110: The rolling stones concert down the road
caused a brown out