How to set up a Linux system that can boot directly from a software RAID
This article describes how to set up a Linux system that can boot directly from a software RAID-1 device using GRUB. Even if one of the disks in the RAID array fails, the system can still boot. The example uses Fedora Core 4.
System configuration
The test system is equipped with two 4 GB IDE hard disks, one connected as the master device on the first IDE channel (/dev/hda), the other connected as the master device on the second IDE channel (/dev/hdc). A DVD-ROM reader is connected as the slave device on the first IDE channel (/dev/hdb).
The goal is to set up a RAID-1 device to mount as / and to boot from. Another RAID-1 device will be used as swap, to provide fault-tolerant page swapping.
RAID-1 device   Mount point   Bootable   Software RAID devices
/dev/md0        /             Yes        /dev/hda1, /dev/hdc1
/dev/md1        swap          No         /dev/hda2, /dev/hdc2
A real scenario only needs a RAID-1 device /dev/md0, which can be of any size (provided it is large enough to hold the Linux installation) and can be composed of any software RAID-1 partitions. Each partition in the array should reside on a different physical disk, preferably connected to a different IDE channel, to achieve maximum fault tolerance.
More complex configurations could include other RAID-1 devices to mount as /home, /usr, etc.
Installing and partitioning
Start the installation as usual by booting from the Fedora Core 4 DVD and proceed until the Disk Partitioning Setup page is reached, then select “Manually partition with Disk Druid”.
First use the partitioning utility to create the software RAID partitions. In the example, each disk is split into a 3498 MB and a 596 MB software RAID partition.
If the disks are new, the partitioning utility will ask to create a new partition table. If the disks already contain data, make a backup if needed (all existing data on the partitions involved in the process will be lost), and delete or resize existing partitions to make space for the software RAID partitions. If the disks already contain software RAID partitions that fit your needs, they can be used. The following screenshots depict the steps of the example:
The next step is to set up the boot loader.
Once the installation options are provided, the installation of the system starts.
Notice that while the system is installing, the software RAID transparently initializes the RAID devices.
Finally, restart the system.
Enabling boot from both disks
The installation of the system is now complete. The system reboots and asks for some more configuration options, such as the current time and screen resolution. Then the system is ready to run.
The RAID devices are up and running, and can be checked by looking at /proc/mdstat and using the mdadm command. Before setting up GRUB, the resync of the RAID devices must be completed; otherwise the procedure may not work correctly, resulting in a system that does not boot at all.
[root@fedora4 ~]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 hdc2[1] hda2[0]
610368 blocks [2/2] [UU]
md0 : active raid1 hdc1[1] hda1[0]
3582336 blocks [2/2] [UU]
unused devices: <none>
[root@fedora4 ~]# mdadm --detail /dev/md0
/dev/md0:
Version : 00.90.01
Creation Time : Sat Sep 24 19:52:04 2005
Raid Level : raid1
Array Size : 3582336 (3.42 GiB 3.67 GB)
Device Size : 3582336 (3.42 GiB 3.67 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Sat Sep 24 18:42:53 2005
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
UUID : a1c044b8:d88be533:f570ac88:74e95f06
Events : 0.1984
Number Major Minor RaidDevice State
0 3 1 0 active sync /dev/hda1
1 22 1 1 active sync /dev/hdc1
[root@fedora4 ~]# mdadm --examine /dev/hda1
/dev/hda1:
Magic : a92b4efc
Version : 00.90.01
UUID : a1c044b8:d88be533:f570ac88:74e95f06
Creation Time : Sat Sep 24 19:52:04 2005
Raid Level : raid1
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0
Update Time : Sat Sep 24 18:43:13 2005
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Checksum : 1473528a - correct
Events : 0.1994
Number Major Minor RaidDevice State
this 0 3 1 0 active sync /dev/hda1
0 0 3 1 0 active sync /dev/hda1
1 1 22 1 1 active sync /dev/hdc1
[root@fedora4 ~]# mdadm --examine /dev/hdc1
/dev/hdc1:
Magic : a92b4efc
Version : 00.90.01
UUID : a1c044b8:d88be533:f570ac88:74e95f06
Creation Time : Sat Sep 24 19:52:04 2005
Raid Level : raid1
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0
Update Time : Sat Sep 24 18:43:28 2005
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Checksum : 147352be - correct
Events : 0.2002
Number Major Minor RaidDevice State
this 1 22 1 1 active sync /dev/hdc1
0 0 3 1 0 active sync /dev/hda1
1 1 22 1 1 active sync /dev/hdc1
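Since GRUB must not be set up before the resync has finished, it can help to script that check. The following is a minimal sketch, not part of the original procedure. The function names md_sync_in_progress and wait_for_md_sync are hypothetical, and the sketch assumes that an unfinished sync always shows up as the word "resync" or "recovery" in /proc/mdstat, as it does in the outputs shown in this article.

```shell
#!/bin/sh
# Sketch: report whether any md array is still resyncing or rebuilding.
# md_sync_in_progress and wait_for_md_sync are hypothetical helper names.

# Succeeds (exit status 0) if the given mdstat-format file mentions a
# resync or a recovery, including a queued "resync=DELAYED".
md_sync_in_progress() {
    grep -Eq 'resync|recovery' "${1:-/proc/mdstat}"
}

# Poll until no array is syncing; the 30-second interval is arbitrary.
wait_for_md_sync() {
    while md_sync_in_progress "${1:-/proc/mdstat}"; do
        sleep 30
    done
}
```

Calling wait_for_md_sync before starting grub would block until /proc/mdstat no longer reports a resync or recovery. Both functions accept an optional file argument so they can be tried against a saved copy of /proc/mdstat.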
To be able to boot from the second disk if the first fails, GRUB has to be installed on both disks. This can be done with:
[root@fedora4 ~]# grub
GNU GRUB version 0.95 (640K lower / 3072K upper memory)
[ Minimal BASH-like line editing is supported. For the first word, TAB
lists possible command completions. Anywhere else TAB lists the possible completions of a device/filename.]
grub> root (hd0,0)
Filesystem type is ext2fs, partition type 0xfd
grub> setup (hd0)
Checking if "/boot/grub/stage1" exists... yes
Checking if "/boot/grub/stage2" exists... yes
Checking if "/boot/grub/e2fs_stage1_5" exists... yes
Running "embed /boot/grub/e2fs_stage1_5 (hd0)"... 15 sectors are embedded.
succeeded
Running "install /boot/grub/stage1 (hd0) (hd0)1+15 p (hd0,0)/boot/grub/stage2 /boot/grub/grub.conf"... succeeded
Done.
grub> root (hd1,0)
Filesystem type is ext2fs, partition type 0xfd
grub> setup (hd1)
Checking if "/boot/grub/stage1" exists... yes
Checking if "/boot/grub/stage2" exists... yes
Checking if "/boot/grub/e2fs_stage1_5" exists... yes
Running "embed /boot/grub/e2fs_stage1_5 (hd1)"... 15 sectors are embedded.
succeeded
Running "install /boot/grub/stage1 (hd1) (hd1)1+15 p (hd1,0)/boot/grub/stage2 /boot/grub/grub.conf"... succeeded
Done.
grub> quit
These commands instruct GRUB to install itself in the boot sector (MBR) of both the first and the second disk, in each case using the GRUB files found on the first partition of that disk.
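The same GRUB commands can also be fed to the GRUB shell non-interactively. The following is a minimal sketch under stated assumptions: grub_setup_commands is a hypothetical helper name, GRUB's files are assumed to live on the first partition of every disk (as in this example), and the installed GRUB is assumed to accept the --batch option, as GRUB legacy does.

```shell
#!/bin/sh
# Sketch: generate the GRUB (legacy) shell commands used above for a list
# of BIOS disk numbers. grub_setup_commands is a hypothetical helper name.
grub_setup_commands() {
    for n in "$@"; do
        # Use the first partition of disk N for GRUB's files, then
        # install the boot loader on that same disk.
        printf 'root (hd%s,0)\nsetup (hd%s)\n' "$n" "$n"
    done
    printf 'quit\n'
}
```

Running grub_setup_commands 0 1 prints the five commands typed in the session above. As root, the output can be piped into the GRUB shell with grub_setup_commands 0 1 | grub --batch. Printing the commands first makes it easy to review them before anything is written to disk.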
Testing the system
Now suppose the boot disk /dev/hda fails. Simply shutting down the PC, removing the first disk and switching the PC back on lets the system start flawlessly! The /dev/md0 device will run in degraded mode. There is no need to modify /boot/grub/grub.conf, as the surviving disk is now hd0. Here is what /proc/mdstat and mdadm report when booting without the first disk:
[root@fedora4 ~]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 hdc2[1]
610368 blocks [2/1] [_U]
md0 : active raid1 hdc1[1]
3582336 blocks [2/1] [_U]
unused devices: <none>
[root@fedora4 giotex]# mdadm --detail /dev/md0
/dev/md0:
Version : 00.90.01
Creation Time : Sat Sep 24 19:52:04 2005
Raid Level : raid1
Array Size : 3582336 (3.42 GiB 3.67 GB)
Device Size : 3582336 (3.42 GiB 3.67 GB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Sat Sep 24 19:19:31 2005
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
UUID : a1c044b8:d88be533:f570ac88:74e95f06
Events : 0.2744
Number Major Minor RaidDevice State
0 0 0 - removed
1 22 1 1 active sync /dev/hdc1
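Degraded mode is visible in /proc/mdstat as an underscore in the status field, for example [_U] instead of [UU]. The following is a minimal sketch that lists the affected arrays. The function name degraded_arrays is hypothetical, and the sketch assumes the /proc/mdstat layout shown in this article.

```shell
#!/bin/sh
# Sketch: list md arrays whose status field shows a missing member.
# degraded_arrays is a hypothetical helper name.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/      { name = $1 }    # remember the current array
        /\[[U_]*_[U_]*\]/  { print name }   # "_" marks a missing member
    ' "${1:-/proc/mdstat}"
}
```

With the degraded output above saved to a file, degraded_arrays prints md1 and md0, one per line. On a healthy system it prints nothing.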
The next step is to shut down and connect the /dev/hdc disk as the master device on the first IDE channel, to make it the new boot disk. It is also possible to directly replace /dev/hda with /dev/hdc if the first fails. The next boot will result in:
[root@fedora4 ~]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 hda2[1]
610368 blocks [2/1] [_U]
md0 : active raid1 hda1[1]
3582336 blocks [2/1] [_U]
unused devices: <none>
[root@fedora4 giotex]# mdadm --detail /dev/md0
/dev/md0:
Version : 00.90.01
Creation Time : Sat Sep 24 19:52:04 2005
Raid Level : raid1
Array Size : 3582336 (3.42 GiB 3.67 GB)
Device Size : 3582336 (3.42 GiB 3.67 GB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Sat Sep 24 20:57:49 2005
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
UUID : a1c044b8:d88be533:f570ac88:74e95f06
Events : 0.3236
Number Major Minor RaidDevice State
0 0 0 - removed
1 3 1 1 active sync /dev/hda1
Replacing the failed disk
When a new disk is available to replace the failed one, it can be installed into the system and partitioned to provide two software RAID partitions that replace those of the failed drive. The new partitions can then be added to the existing RAID arrays with the mdadm command:
[root@fedora4 giotex]# fdisk -l /dev/hdc
Disk /dev/hdc: 4294 MB, 4294967296 bytes
255 heads, 63 sectors/track, 522 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/hdc1 1 446 3582463+ 83 Linux
/dev/hdc2 447 522 610470 83 Linux
[root@fedora4 giotex]# mdadm --manage /dev/md0 --add /dev/hdc1
mdadm: hot added /dev/hdc1
[root@fedora4 giotex]# mdadm --manage /dev/md1 --add /dev/hdc2
mdadm: hot added /dev/hdc2
[giotex@fedora4 ~]$ cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 hdc2[2] hda2[1]
610368 blocks [2/1] [_U]
resync=DELAYED
md0 : active raid1 hdc1[2] hda1[1]
3582336 blocks [2/1] [_U]
[=> ] recovery = 5.6% (201216/3582336)
finish=5.3min speed=10590K/sec
unused devices: <none>
When the RAID arrays have finished resyncing, run GRUB again to install the boot loader on the new disk (with root (hd1,0) and setup (hd1)).
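The replacement steps can be collected into a short command list for review before anything is executed. The following is a minimal sketch, not the author's method. The function name replacement_plan is hypothetical, and cloning the partition table from the surviving disk with sfdisk is an assumption of this sketch (the article partitions the new disk by hand). It also assumes that both disks are the same size and that the partition numbering is the one used in this example.

```shell
#!/bin/sh
# Sketch: print (not run) the commands for bringing a replacement disk
# into both arrays. replacement_plan is a hypothetical helper name, and
# cloning the partition table with sfdisk is an assumption of this
# sketch; the article partitions the new disk by hand.
replacement_plan() {
    good=$1   # surviving disk, e.g. /dev/hda
    new=$2    # blank replacement disk, e.g. /dev/hdc
    # Copy the partition layout from the surviving disk to the new one.
    echo "sfdisk -d $good | sfdisk $new"
    # Add the new partitions to the degraded arrays.
    echo "mdadm --manage /dev/md0 --add ${new}1"
    echo "mdadm --manage /dev/md1 --add ${new}2"
}
```

Running replacement_plan /dev/hda /dev/hdc only prints the three commands. They are meant to be checked against the actual device names and then run as root, because sfdisk overwrites the partition table of the target disk.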
If you decide to use the tip described in this article, do so at your own risk, and please do your own tests. Try to simulate disk failures, remove devices, and run as many tests as you can, to check whether the system will actually survive disk failures and remain fault tolerant during the boot process, and to train yourself to recover the system if a disk problem arises.
Also remember that using a RAID device does not mean you can stop doing regular backups!
Version: 1.0 Created: 2005-09-23 Modified: 2005-09-24
© Copyright 2005 texSoft.it. This document is distributed under the GNU Free Documentation License.
This document is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
All trademarks in the document belong to their respective owners.