Replace Failed SVM Mirror Drive

So you have used SVM to mirror your disk, and one of the two drives fails. Aren’t you glad you mirrored them! You don’t have to do a restore from tape, but you are going have to replace the failed drive.

Many modern RAID arrays just require you to take out the bad drive and plug in the new one, while everything else is taken care of automatically. It’s not quite that easy on a Sun server, but it’s really just a few simple steps. I just had to do this, so I thought I would write down the procedure here.

Basically, the process boils down to the following steps:

  • Detach the failed meta devices from the failed drive
  • Delete the meta devices from the failed drive
  • Delete the meta databases from the failed drive
  • Unconfigure the failed drive
  • Remove and replace the failed drive
  • Configure the new drive
  • Copy the remaining drive’s partition table to the new drive
  • Re-create the meta databases on the new drive
  • Install the bootblocks on the new drive
  • Recreate the meta devices
  • Attach the meta devices

Let’s look at each step individually. In my case, c0t1d0 has failed, so, I detach all meta devices on that disk and then delete them:


# metadetach -f d0 d2
# metadetach -f d10 d12
# metadetach -f d40 d42
# metaclear d2
# metaclear d12
# metaclear d42

Next I take a look at the status of my meta databases. Below we can see the the replicas on that disk have write errors:

# metadb -i
        flags           first blk       block count
     a m  p  luo        16               8192            /dev/dsk/c0t0d0s3
     a    p  luo        8208             8192            /dev/dsk/c0t0d0s3
     W    p  luo        16                8192            /dev/dsk/c0t1d0s3
     W    p  luo        8208            8192            /dev/dsk/c0t1d0s3
 r - replica does not have device relocation information
 o - replica active prior to last mddb configuration change
 u - replica is up to date
 l - locator for this replica was read successfully
 c - replica's location was in /etc/lvm/mddb.cf
 p - replica's location was patched in kernel
 m - replica is master, this is replica selected as input
 W - replica has device write errors
 a - replica is active, commits are occurring to this replica
 M - replica had problem with master blocks
 D - replica had problem with data blocks
 F - replica had format problems
 S - replica is too small to hold current data base
 R - replica had device read errors

The replicas on c0t1d0s3 are dead to us, so let’s wipe them out!


# metadb -d c0t1d0s3
# metadb -i

        flags           first blk       block count
     a m  p  luo        16               8192            /dev/dsk/c0t0d0s3
     a    p  luo        8208             8192            /dev/dsk/c0t0d0s3

The only replicas we have left are on c0t0d0s3, so I’m all clear to unconfigure the device. I run cfgadm to get the c0 path:


# cfgadm -al

Ap_Id                          Type         Receptacle   Occupant     Condition
c0                             scsi-bus     connected    configured   unknown
c0::dsk/c0t0d0                 disk         connected    configured   unknown
c0::dsk/c0t1d0                 disk         connected    configured   unknown
c0::dsk/c0t2d0                 disk         connected    configured   unknown
c0::dsk/c0t3d0                 disk         connected    configured   unknown
c1                             scsi-bus     connected    configured   unknown
c1::dsk/c1t0d0                 CD-ROM       connected    configured   unknown
usb0/1                         unknown      empty        unconfigured ok
usb0/2                         unknown      empty        unconfigured ok
usb1/1.1                       unknown      empty        unconfigured ok
usb1/1.2                       unknown      empty        unconfigured ok
usb1/1.3                       unknown      empty        unconfigured ok
usb1/1.4                       unknown      empty        unconfigured ok
usb1/2                         unknown      empty        unconfigured ok

I run the following command to unconfigure the failed drive:


# cfgadm -c unconfigure c0::dsk/c0t1d0

The drive light turns blue
Pull the failed drive out
Insert the new drive

Configure the new drive:


# cfgadm -c configure c0::dsk/c0t1d0

Now that the drive is configured and visible from within the format command, we can copy the partition table from the remaining mirror member:


# prtvtoc /dev/rdsk/c0t0d0s2 | fmthard -s - /dev/rdsk/c0t1d0s2

Next, I install the bootblocks onto the new drive:


# installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c0t1d0s2

Create the state replicas:


metadb -a -c 2 c0t1d0s3

Recreate the meta devices:

metainit -f d2 1 1 c0t1d0s0
metainit -f d12 1 1 c0t1d0s1
metainit -f d42 1 1 c0t1d0s4

And finally, reattach the metadevices which will sync them up with the mirror.


metattach d0 d2
metattach d10 d12
metattach d40 d42

Taking Disk Cylinders From Swap on Solaris 8

Kids… DO NOT TRY THIS AT HOME! If this is not done exactly right, you will render your system unbootable and corrupt your data. That being said, under some circumstances you can take some space from your swap partition and add it to an unused one without initializing your entire disk. This is particularly useful if you decide you want to use DiskSuite to mirror your system disk, but have not allocated the 100MB partition that is needed to hold the state databases. As always, BACK EVERYTHING UP FIRST. Better yet, make two backups and store them on two different systems. This is a risky procedure, and you don’t want to lose any data!

You can also use my instructions for copying a Solaris boot drive to a disk with a different partition layout as a safer alternative.

The first thing you need to do is figure out if your disk layout will allow for this procedure. Usually the swap partition is the second one on the disk, making it partition number 1 (Partition number 0 is root). If partition number 1 is swap on your system, and partition number 3 or 4 are unused, you are in good shape, and this should work. To figure this out, you should do something like this:

# format
Select the boot disk – usually disk 0
Specify disk (enter its number): 0
format> partition
format> print

This will show you the current disk layout.


Current partition table (original):
Total disk cylinders available: 24620 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders         Size            Blocks
  0       root    wm       0 -   725        1.00GB    (726/0/0)    2097414
  1       swap    wu     726 -  9436       11.90GB    (8635/0/0)  24946515
  2     backup    wm       0 - 24619       33.92GB    (24620/0/0) 71127180
  3 unassigned    wm       0                0         (0/0/0)            0
  4 unassigned    wm       0                0         (0/0/0)            0
  5        usr    wm    9437 - 10888        2.00GB    (1452/0/0)   4194828
  6        var    wm   10889 - 18148       10.00GB    (7260/0/0)  20974140
  7 unassigned    wm   18149 - 24619        8.91GB    (6471/0/0)  18694719

Here we see that partitions 3 and 4 are unused and directly after partition 1, so we can take some space from swap and assign it to one of these. Partition 2 is, of course the entire disk. I have not tried it, so I don’t know if you could assign non-sequential cylinders to a partition that is not directly after swap.

So to take some space from partition 1 and add it to partition 3, the first thing we have to do is disable swap, so the format utility will let us change it.

Comment out the following lines in your /etc/vfstab file and reboot the system.


#/dev/dsk/c1t0d0s1         -       -               swap    -       no      -
#swap    -       /tmp    tmpfs   -       yes     - 

This will bring the system up without swap enabled. You can now edit the disk label. Remember that our cylinders need to be sequential, so always work in cylinders when using the format utility.

Re-enter the format utility, select your system disk and view the partition table:

# format
Select the boot disk – usually disk 0
Specify disk (enter its number): 0
format> partition
format> print

Again we wee that partitions 3 and 4 are unused.


Current partition table (original):
Total disk cylinders available: 24620 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders         Size            Blocks
  0       root    wm       0 -   725        1.00GB    (726/0/0)    2097414
  1       swap    wu     726 -  9436       11.90GB    (8635/0/0)  24946515
  2     backup    wm       0 - 24619       33.92GB    (24620/0/0) 71127180
  3 unassigned    wm       0                0         (0/0/0)            0
  4 unassigned    wm       0                0         (0/0/0)            0
  5        usr    wm    9437 - 10888        2.00GB    (1452/0/0)   4194828
  6        var    wm   10889 - 18148       10.00GB    (7260/0/0)  20974140
  7 unassigned    wm   18149 - 24619        8.91GB    (6471/0/0)  18694719

The first thing we need to do is take some cylinders away from partition 1. In this example, we are looking to make partition 3 roughly 100MB, so we need to take about 75 cylinders from partition 1 so that we can add it to partition 3. Parititon 1 ends at cylinder 9436, so we need to subtract 75 from that number. 9436 – 75 = 9361, so that is the new ending cylinder for partition 1. We then subtract the beginning cylinder (726) from that number to give us the new total number of cylinders for partition 1. 9361 – 726 = 8635, so this is the number we enter when format asks for the size of the partition. Like so:


partition> 1
Part      Tag    Flag     Cylinders         Size            Blocks
  1       swap    wu     726 -  9360       11.90GB    (8635/0/0)  24946515

Enter partition id tag[swap]: 
Enter partition permission flags[wu]: 
Enter new starting cyl[726]: 
Enter partition size[24946615b, 9436c, 12880.92mb, 12.00gb]: 8635c
partition>

Now we have to add these 75 cylinders to partition 3.


partition> 3
Part      Tag    Flag     Cylinders         Size            Blocks
  3 unassigned    wm       0                0          (0/0/0)            0

Enter partition id tag[unassigned]: 
Enter partition permission flags[wm]: 
Enter new starting cyl[0]:9361
Enter partition size[0b, 0c, 0.00mb, 0.00gb]:75c
partition>

Print out the new partition table to make sure everything lines up correctly:


partition> print
Current partition table (original):
Total disk cylinders available: 24620 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders         Size            Blocks
  0       root    wm       0 -   725        1.00GB    (726/0/0)    2097414
  1       swap    wu     726 -  9360       11.90GB    (8635/0/0)  24946515
  2     backup    wm       0 - 24619       33.92GB    (24620/0/0) 71127180
  3 unassigned    wm    9361 -  9436      107.21MB    (76/0/0)      219564
  4 unassigned    wm       0                0         (0/0/0)            0
  5        usr    wm    9437 - 10888        2.00GB    (1452/0/0)   4194828
  6        var    wm   10889 - 18148       10.00GB    (7260/0/0)  20974140
  7 unassigned    wm   18149 - 24619        8.91GB    (6471/0/0)  18694719

Partition 1 ends at cylinder 9360, and partition 3 picks right up at cylinder 9361. Partition 3 ends at cylinder 9436, and partition 5 begins at cylinder 9437. Partition 4, of course, remains unused. Since none of the cylinders overlap, we can go ahead and write the disk label out. DO NOT DO THIS if you have any doubt at all about what you have just done. By writing out the disk label, you could corrupt the data on your formated filesystems if any cylinders overlap into them. The format utility is usually pretty smart about keeping you from making mistakes, but be very careful anyway! You don’t want to end up with scrambled eggs on a disk that has valuable data on it.

partition> label
This writes out the disk label, so you can now exit the format utility and re-enable swap in your /etc/vfstab file. Simply uncomment out the following two lines and reboot the system.


/dev/dsk/c1t0d0s1         -       -               swap    -       no      -
swap    -       /tmp    tmpfs   -       yes     -

Reboot your system, and if all goes well, it will come up, and you will see that partition 3 will have a little over 100MB on it. Usually people want to do this so they can store the DiskSuite meta database on the newly created partition. If this is the case for you, you can now move on to mirroring the system disk.

Problems With Multiple MetaDB Partitions

UPDATE TO “Solaris Disk Partition Layout & Mirroring Scripts

Several Months ago, I tried to use my old mirroring scripts on a new Solaris 9 install. I found that the the kernel would panic upon reboot because it was unable to mount /. I tried many things, including opening a support call with Sun. They reviewed my scripts and said that they should work, but despite repeated tries, they did not.

In the end, I created only one metadb partition instead of two, and found that the system would boot. I attributed this to a problem with the mirror disk, until it happened to me again this week. For some reason, the implementation of Disk Suite on Solaris 9 does not accept multiple metadb partitions.

Previously, in Solaris 8, I always created a total of four metadb partitions. Two on each drive…

#!/bin/sh
#Mirrorme.sh
prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s – /dev/rdsk/c1t1d0s2
metadb -a -f -c3 /dev/dsk/c1t0d0s3 /dev/dsk/c1t1d0s3
metadb -a -f -c3 /dev/dsk/c1t0d0s4 /dev/dsk/c1t1d0s4

Currently, with Solaris 9, that method does not work, and results in a kernel panic. To resolve this issue, you must create only one metadb partition on each disk. I’ve been using s3 for this, although you could use any slice you wish.

#!/bin/sh
#Mirrorme.sh
prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s – /dev/rdsk/c1t1d0s2
metadb -a -f -c3 /dev/dsk/c1t0d0s3 /dev/dsk/c1t1d0s3

Aside from this change, the mirroring scripts continue to work. Please let me know if you find any other problems not mentioned.