• So you have used SVM to mirror your disk, and one of the two drives fails. Aren’t you glad you mirrored them! You don’t have to do a restore from tape, but you are going to have to replace the failed drive.

    Many modern RAID arrays just require you to take out the bad drive and plug in the new one, while everything else is taken care of automatically. It’s not quite that easy on a Sun server, but it’s really just a few simple steps. I just had to do this, so I thought I would write down the procedure here.

    Basically, the process boils down to the following steps:

    • Detach the failed meta devices from the failed drive
    • Delete the meta devices from the failed drive
    • Delete the meta databases from the failed drive
    • Unconfigure the failed drive
    • Remove and replace the failed drive
    • Configure the new drive
    • Copy the remaining drive’s partition table to the new drive
    • Install the bootblocks on the new drive
    • Re-create the meta databases on the new drive
    • Recreate the meta devices
    • Attach the meta devices
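
Since the steps above always run in the same order, here is a rough dry-run sketch that prints the command sequence for review without executing anything. The helper name `print_replace_plan` and its argument layout are hypothetical (my own invention, not part of SVM); the metadevice names and slices you pass in are whatever your own system uses.

```shell
#!/bin/sh
# Dry-run sketch: prints the replacement commands, runs none of them.
# print_replace_plan is a hypothetical helper, not an SVM tool.
# Usage: print_replace_plan GOOD_DISK BAD_DISK DB_SLICE "MIRROR:SUBMIRROR:SLICE ..."
print_replace_plan() {
    good=$1 bad=$2 dbslice=$3 pairs=$4
    ctrl=${bad%%t*}                     # c0t1d0 -> c0

    for p in $pairs; do                 # detach every submirror on the bad disk
        rest=${p#*:}
        echo "metadetach -f ${p%%:*} ${rest%%:*}"
    done
    for p in $pairs; do                 # then delete those submirrors
        rest=${p#*:}
        echo "metaclear ${rest%%:*}"
    done
    echo "metadb -d ${bad}${dbslice}"
    echo "cfgadm -c unconfigure ${ctrl}::dsk/${bad}"
    echo "# physically swap the drive here"
    echo "cfgadm -c configure ${ctrl}::dsk/${bad}"
    echo "prtvtoc /dev/rdsk/${good}s2 | fmthard -s - /dev/rdsk/${bad}s2"
    echo 'installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/'"${bad}s2"
    echo "metadb -a -c 2 ${bad}${dbslice}"
    for p in $pairs; do                 # re-create the submirrors
        rest=${p#*:}
        echo "metainit -f ${rest%%:*} 1 1 ${bad}${rest#*:}"
    done
    for p in $pairs; do                 # re-attach them to their mirrors
        rest=${p#*:}
        echo "metattach ${p%%:*} ${rest%%:*}"
    done
}
```

For the layout in this post, `print_replace_plan c0t0d0 c0t1d0 s3 "d0:d2:s0 d10:d12:s1 d40:d42:s4"` prints the same commands that are walked through one by one below.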

    Let’s look at each step individually. In my case, c0t1d0 has failed, so I detach all meta devices on that disk and then delete them:


    # metadetach -f d0 d2
    # metadetach -f d10 d12
    # metadetach -f d40 d42
    # metaclear d2
    # metaclear d12
    # metaclear d42
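
If you don't remember which submirrors live on the failed disk, the short-form listing from `metastat -p` can be filtered to work it out. This is only a sketch: `submirror_cleanup` is a hypothetical helper name, and it assumes the usual `metastat -p` layout of `d0 -m d1 d2 1` for a mirror and `d2 1 1 c0t1d0s0` for a simple one-slice submirror. It prints the commands rather than running them.

```shell
#!/bin/sh
# Reads `metastat -p` output on stdin and prints the metadetach/metaclear
# commands for every submirror built on the failed disk named in $1.
# Hypothetical helper; assumes one-slice concat submirrors.
submirror_cleanup() {
    awk -v bad="$1" '
        # Mirror line: remember which mirror owns each submirror.
        $2 == "-m" {
            for (i = 3; i <= NF; i++)
                if ($i ~ /^d[0-9]+$/) parent[$i] = $1
            next
        }
        # Any other metadevice: note it if one of its slices is on the bad disk.
        {
            for (i = 2; i <= NF; i++)
                if (index($i, bad "s") == 1 && !($1 in seen)) {
                    seen[$1] = 1
                    subs[++n] = $1
                }
        }
        END {
            for (k = 1; k <= n; k++)
                if (subs[k] in parent) print "metadetach -f", parent[subs[k]], subs[k]
            for (k = 1; k <= n; k++)
                if (subs[k] in parent) print "metaclear", subs[k]
        }
    '
}
```

Typical use would be `metastat -p | submirror_cleanup c0t1d0`, then reading the output before pasting any of it into a root shell.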

    Next I take a look at the status of my meta databases. Below we can see that the replicas on that disk have write errors:

    # metadb -i
            flags           first blk       block count
         a m  p  luo        16               8192            /dev/dsk/c0t0d0s3
         a    p  luo        8208             8192            /dev/dsk/c0t0d0s3
         W    p  luo        16               8192            /dev/dsk/c0t1d0s3
         W    p  luo        8208            8192            /dev/dsk/c0t1d0s3
     r - replica does not have device relocation information
     o - replica active prior to last mddb configuration change
     u - replica is up to date
     l - locator for this replica was read successfully
     c - replica's location was in /etc/lvm/mddb.cf
     p - replica's location was patched in kernel
     m - replica is master, this is replica selected as input
     W - replica has device write errors
     a - replica is active, commits are occurring to this replica
     M - replica had problem with master blocks
     D - replica had problem with data blocks
     F - replica had format problems
     S - replica is too small to hold current data base
     R - replica had device read errors
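
In the legend above, the capital-letter flags (W, M, D, F, S, R) are the ones that mean trouble. As a small sketch, the listing can be filtered for devices carrying any of them. `failed_replica_devices` is a hypothetical name, and the parsing assumes the column layout shown above: flags first, then two numeric columns, then the device path.

```shell
#!/bin/sh
# Reads `metadb -i` output on stdin and prints each device that has at least
# one replica flagged with an error (W, M, D, F, S or R). Hypothetical helper.
failed_replica_devices() {
    awk '
        # Only replica rows end in a /dev/dsk/ path; legend rows do not.
        $NF ~ /^\/dev\/dsk\// {
            # The last three fields are first blk, block count and device,
            # so everything before them is a flag column.
            for (i = 1; i <= NF - 3; i++)
                if ($i ~ /[WMDFSR]/ && !($NF in seen)) {
                    seen[$NF] = 1
                    print $NF
                }
        }
    '
}
```

Run as `metadb -i | failed_replica_devices`; each device is listed once, however many bad replicas it holds.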
    

    The replicas on c0t1d0s3 are dead to us, so let’s wipe them out!


    # metadb -d c0t1d0s3
    # metadb -i

            flags           first blk       block count
         a m  p  luo        16               8192            /dev/dsk/c0t0d0s3
         a    p  luo        8208             8192            /dev/dsk/c0t0d0s3
    

    The only replicas we have left are on c0t0d0s3, so I’m all clear to unconfigure the device. I run cfgadm to get the c0 path:


    # cfgadm -al

    Ap_Id                          Type         Receptacle   Occupant     Condition
    c0                             scsi-bus     connected    configured   unknown
    c0::dsk/c0t0d0                 disk         connected    configured   unknown
    c0::dsk/c0t1d0                 disk         connected    configured   unknown
    c0::dsk/c0t2d0                 disk         connected    configured   unknown
    c0::dsk/c0t3d0                 disk         connected    configured   unknown
    c1                             scsi-bus     connected    configured   unknown
    c1::dsk/c1t0d0                 CD-ROM       connected    configured   unknown
    usb0/1                         unknown      empty        unconfigured ok
    usb0/2                         unknown      empty        unconfigured ok
    usb1/1.1                       unknown      empty        unconfigured ok
    usb1/1.2                       unknown      empty        unconfigured ok
    usb1/1.3                       unknown      empty        unconfigured ok
    usb1/1.4                       unknown      empty        unconfigured ok
    usb1/2                         unknown      empty        unconfigured ok
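
On a box with a lot of attachment points it can be handy to pull out just the Ap_Id for the disk in question. A minimal sketch, assuming the `cfgadm -al` column layout shown above; `disk_ap_id` is a hypothetical helper name.

```shell
#!/bin/sh
# Reads `cfgadm -al` output on stdin and prints the Ap_Id of the disk named
# in $1 (for example c0t1d0). Hypothetical helper.
disk_ap_id() {
    awk -v disk="$1" '
        $2 == "disk" {
            # Ap_Id looks like c0::dsk/c0t1d0; compare the part after the last "/".
            n = split($1, part, "/")
            if (part[n] == disk) print $1
        }
    '
}
```

For example, `cfgadm -al | disk_ap_id c0t1d0` should print the Ap_Id to hand to `cfgadm -c unconfigure`.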
    

    I run the following command to unconfigure the failed drive:


    # cfgadm -c unconfigure c0::dsk/c0t1d0

    The drive light turns blue
    Pull the failed drive out
    Insert the new drive

    Configure the new drive:


    # cfgadm -c configure c0::dsk/c0t1d0

    Now that the drive is configured and visible from within the format command, we can copy the partition table from the remaining mirror member:


    # prtvtoc /dev/rdsk/c0t0d0s2 | fmthard -s - /dev/rdsk/c0t1d0s2
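
Before going further it doesn't hurt to confirm the copy took. The sketch below compares two saved `prtvtoc` listings while ignoring the `*` comment header and the mount-directory column, which will legitimately differ between the disks. `vtoc_slices` and `same_vtoc` are hypothetical helper names, and the file paths in the usage note are only examples.

```shell
#!/bin/sh
# Prints the six numeric/tag columns of each slice line from prtvtoc output
# on stdin, skipping the "*" comment header and any mount-directory column.
vtoc_slices() {
    awk '!/^\*/ && NF >= 6 { print $1, $2, $3, $4, $5, $6 }'
}

# Succeeds (exit 0) when two saved prtvtoc listings describe the same slices.
same_vtoc() {
    [ "$(vtoc_slices < "$1")" = "$(vtoc_slices < "$2")" ]
}
```

Something like `prtvtoc /dev/rdsk/c0t0d0s2 > /tmp/good.vtoc; prtvtoc /dev/rdsk/c0t1d0s2 > /tmp/new.vtoc; same_vtoc /tmp/good.vtoc /tmp/new.vtoc && echo match` would do the check.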

    Next, I install the bootblocks onto the new drive:


    # installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c0t1d0s2

    Create the state replicas:


    # metadb -a -c 2 c0t1d0s3

    Recreate the meta devices:

    # metainit -f d2 1 1 c0t1d0s0
    # metainit -f d12 1 1 c0t1d0s1
    # metainit -f d42 1 1 c0t1d0s4

    And finally, reattach the metadevices, which kicks off a resync against the surviving half of each mirror:


    # metattach d0 d2
    # metattach d10 d12
    # metattach d40 d42
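
The resync takes a while, and `metastat` reports how far along each mirror is. Here is a rough sketch for boiling that down to one line per mirror. It assumes `metastat` prints a `dN: Mirror` header followed by a `Resync in progress: NN % done` line while a resync is running; `resync_progress` is a hypothetical helper name.

```shell
#!/bin/sh
# Reads `metastat` output on stdin and prints "<mirror> <percent>%" for each
# mirror that is currently resyncing. Hypothetical helper; assumes the
# "Resync in progress: NN % done" line format.
resync_progress() {
    awk '
        /^d[0-9]+: Mirror/     { sub(":", "", $1); mirror = $1 }
        /Resync in progress:/  { print mirror, $4 "%" }
    '
}
```

Running `metastat | resync_progress` every few minutes gives a quick view; once it prints nothing, no mirror is reporting a resync in progress.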

    This entry was posted on Monday, February 16th, 2009 at 1:45 am and is filed under Data and Technology.
  • 7 Comments

    1. Pete
      Apr 20th

      Nice post, I just used it to replace a failed drive on my Sun Fire V440. I think you might have left off one step. You need to put back the metadbs on the new disk.

      Example:
      metadb -a -c 2 c0t0d0s3

      • May 6th

        So I did. Thanks for the catch. Edit made.

    2. May 6th

      prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t0d0s2

      should be

      prtvtoc /dev/rdsk/c1t1d0s2 | fmthard -s - /dev/rdsk/c1t0d0s2

      • May 6th

        So it should. Thanks for the catch. Edit made.

    3. Z
      Jul 6th

      You have “metadb -a -c 3 c0t0d0s3” but you probably mean “metadb -a -c 2 c1t0d0s3” if the device names you’ve been working with in the article should stay consistent.

    4. All this worked GREAT for me! One minor hiccup though: the disk I replaced (it came from Sun) had an EFI disk label on it, which causes fmthard to barf. ‘fdisk -B /dev/rdsk/c*t*d*p0’ fixed that though.

    5. Khyron
      Oct 13th

      If this disk you replaced was a boot disk, you’ll need to restore the boot block on the disk as well.

      See installboot(1M).
