Replace Failed SVM Mirror Drive

So you have used SVM to mirror your disk, and one of the two drives fails. Aren’t you glad you mirrored them! You don’t have to do a restore from tape, but you are going to have to replace the failed drive.

Many modern RAID arrays just require you to take out the bad drive and plug in the new one, while everything else is taken care of automatically. It’s not quite that easy on a Sun server, but it’s really just a few simple steps. I just had to do this, so I thought I would write down the procedure here.

Basically, the process boils down to the following steps:

  • Detach the failed meta devices from the failed drive
  • Delete the meta devices from the failed drive
  • Delete the meta databases from the failed drive
  • Unconfigure the failed drive
  • Remove and replace the failed drive
  • Configure the new drive
  • Copy the remaining drive’s partition table to the new drive
  • Install the bootblocks on the new drive
  • Re-create the meta databases on the new drive
  • Recreate the meta devices
  • Attach the meta devices

Let’s look at each step individually. In my case, c0t1d0 has failed, so, I detach all meta devices on that disk and then delete them:


# metadetach -f d0 d2
# metadetach -f d10 d12
# metadetach -f d40 d42
# metaclear d2
# metaclear d12
# metaclear d42

Next I take a look at the status of my meta databases. Below we can see that the replicas on that disk have write errors:

# metadb -i
        flags           first blk       block count
     a m  p  luo        16               8192            /dev/dsk/c0t0d0s3
     a    p  luo        8208             8192            /dev/dsk/c0t0d0s3
     W    p  luo        16               8192            /dev/dsk/c0t1d0s3
     W    p  luo        8208             8192            /dev/dsk/c0t1d0s3
 r - replica does not have device relocation information
 o - replica active prior to last mddb configuration change
 u - replica is up to date
 l - locator for this replica was read successfully
 c - replica's location was in /etc/lvm/mddb.cf
 p - replica's location was patched in kernel
 m - replica is master, this is replica selected as input
 W - replica has device write errors
 a - replica is active, commits are occurring to this replica
 M - replica had problem with master blocks
 D - replica had problem with data blocks
 F - replica had format problems
 S - replica is too small to hold current data base
 R - replica had device read errors
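
If you have more than a couple of disks, eyeballing the flags column gets old. Here is a minimal sketch that picks out the replicas carrying any of the uppercase error flags from the legend above. The function name failed_replicas is my own invention, and it assumes the column layout shown in the output above and a POSIX awk (on Solaris that is likely nawk or /usr/xpg4/bin/awk rather than the default awk).

```shell
#!/bin/sh
# Hypothetical helper: print the device of every replica whose flags
# contain an uppercase error letter (W, M, D, F, S, or R).
# Reads `metadb -i` output on stdin.
failed_replicas() {
    awk '$NF ~ /^\/dev\// {
        flags = ""
        # Everything before the two numeric columns and the device path
        # belongs to the flags field.
        for (i = 1; i <= NF - 3; i++) flags = flags $i
        if (flags ~ /[WMDFSR]/) print $NF
    }' | sort -u
}

# Intended use on the server itself (not run here):
#   metadb -i | failed_replicas
```

Against the output above, this should print /dev/dsk/c0t1d0s3 once, even though two replicas on that slice are flagged.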

The replicas on c0t1d0s3 are dead to us, so let’s wipe them out!


# metadb -d c0t1d0s3
# metadb -i

        flags           first blk       block count
     a m  p  luo        16               8192            /dev/dsk/c0t0d0s3
     a    p  luo        8208             8192            /dev/dsk/c0t0d0s3

The only replicas we have left are on c0t0d0s3, so I’m all clear to unconfigure the device. I run cfgadm to get the c0 path:


# cfgadm -al

Ap_Id                          Type         Receptacle   Occupant     Condition
c0                             scsi-bus     connected    configured   unknown
c0::dsk/c0t0d0                 disk         connected    configured   unknown
c0::dsk/c0t1d0                 disk         connected    configured   unknown
c0::dsk/c0t2d0                 disk         connected    configured   unknown
c0::dsk/c0t3d0                 disk         connected    configured   unknown
c1                             scsi-bus     connected    configured   unknown
c1::dsk/c1t0d0                 CD-ROM       connected    configured   unknown
usb0/1                         unknown      empty        unconfigured ok
usb0/2                         unknown      empty        unconfigured ok
usb1/1.1                       unknown      empty        unconfigured ok
usb1/1.2                       unknown      empty        unconfigured ok
usb1/1.3                       unknown      empty        unconfigured ok
usb1/1.4                       unknown      empty        unconfigured ok
usb1/2                         unknown      empty        unconfigured ok
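
If the listing is long, the Ap_Id for a given disk can be pulled out of it with a little awk. This is only a sketch: ap_id_for is a name I made up, and it assumes the five-column layout shown above and a POSIX awk (nawk or /usr/xpg4/bin/awk on Solaris).

```shell
#!/bin/sh
# Hypothetical helper: print the attachment point ID for a disk name
# such as c0t1d0. Reads `cfgadm -al` output on stdin.
ap_id_for() {
    awk -v disk="$1" '
        # The disk name is the last "/"-separated piece of the Ap_Id.
        { n = split($1, part, "/") }
        $2 == "disk" && part[n] == disk { print $1 }
    '
}

# Intended use on the server itself (not run here):
#   cfgadm -al | ap_id_for c0t1d0
```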

I run the following command to unconfigure the failed drive:


# cfgadm -c unconfigure c0::dsk/c0t1d0

The drive light turns blue
Pull the failed drive out
Insert the new drive

Configure the new drive:


# cfgadm -c configure c0::dsk/c0t1d0

Now that the drive is configured and visible from within the format command, we can copy the partition table from the remaining mirror member:


# prtvtoc /dev/rdsk/c0t0d0s2 | fmthard -s - /dev/rdsk/c0t1d0s2
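
Before going any further, it doesn’t hurt to confirm that the two partition tables now match. The sketch below compares two saved prtvtoc listings. The function name same_vtoc and the /tmp file names are made up for illustration, and it assumes (as I understand the prtvtoc format) that header lines start with “*” and include the device name, so they are skipped.

```shell
#!/bin/sh
# Hypothetical helper: succeed when two saved `prtvtoc` listings describe
# the same slices. Lines starting with "*" are treated as header comments
# and ignored, since they are assumed to name the device.
same_vtoc() {
    a=$(grep -v '^\*' "$1")
    b=$(grep -v '^\*' "$2")
    [ "$a" = "$b" ]
}

# Intended use on the server itself (not run here):
#   prtvtoc /dev/rdsk/c0t0d0s2 > /tmp/c0t0d0.vtoc
#   prtvtoc /dev/rdsk/c0t1d0s2 > /tmp/c0t1d0.vtoc
#   same_vtoc /tmp/c0t0d0.vtoc /tmp/c0t1d0.vtoc && echo "partition tables match"
```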

Next, I install the bootblocks onto the new drive:


# installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c0t1d0s2

Create the state replicas:


# metadb -a -c 2 c0t1d0s3

Recreate the meta devices:

# metainit -f d2 1 1 c0t1d0s0
# metainit -f d12 1 1 c0t1d0s1
# metainit -f d42 1 1 c0t1d0s4

And finally, reattach the metadevices, which kicks off a resync from the surviving half of each mirror.


# metattach d0 d2
# metattach d10 d12
# metattach d40 d42
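
The resync takes a while, and metastat is the way to keep an eye on it. Here is a rough sketch that boils its output down to one line per mirror. The function name resync_status is my own, and it assumes metastat prints a “dN: Mirror” header and a “Resync in progress: NN % done” line for each mirror that is still syncing. Check that against your own output before trusting it. It also wants a POSIX awk (nawk or /usr/xpg4/bin/awk on Solaris).

```shell
#!/bin/sh
# Hypothetical helper: summarize resync progress per mirror.
# Reads `metastat` output on stdin. The line formats matched here are
# assumptions about what metastat prints.
resync_status() {
    awk '
        /^d[0-9]+: Mirror/   { sub(":", "", $1); mirror = $1 }
        /Resync in progress/ { print mirror ": " $4 "% done" }
    '
}

# Intended use on the server itself (not run here):
#   metastat | resync_status
```

When the function prints nothing, no mirror is reporting a resync in progress.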

Making RHEL 3 See Multiple LUNS

For some reason RHEL 3 comes out of the box configured to see only the first Lun on a SCSI channel. This is usually not a problem, as the first Lun is all you care about, but in some instances, you will need to configure the SCSI module to see multiple Luns.

In this case we are using an Adaptec DuraStor 6200S, which is set up to present the RAID controller as Lun 00, and the actual RAID array as Lun 01. Without any modifications to the system, we plug it in, and after a reboot check /proc/scsi/scsi. We can see the RAID controller, but since we can only see the first Lun on the channel, we never get to the array:

Host: scsi2 Channel: 00 Id: 00 Lun: 00
Vendor: Adaptec Model: DuraStor 6200S Rev: V100
Type: Processor ANSI SCSI revision: 03

The actual array would show up as “Channel: 00 Id: 00 Lun: 01”, but it’s not there. To resolve this, we have to first edit “/etc/modules.conf” and add the following line:

options scsi_mod max_scsi_luns=128 scsi_allow_ghost_devices=1

In our case, modules.conf looks like this after the modification:

alias eth0 e1000
alias eth1 e1000
alias scsi_hostadapter megaraid2
alias usb-controller usb-uhci
alias usb-controller1 ehci-hcd
alias scsi_hostadapter1 aic7xxx
options scsi_mod max_scsi_luns=128 scsi_allow_ghost_devices=1
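
If you have to do this on more than one box, it is handy to make the edit safe to repeat. The following is a small sketch. The function name add_scsi_mod_options is made up, and it only checks for an existing “options scsi_mod” line rather than merging values, so treat it as a starting point.

```shell
#!/bin/sh
# Hypothetical helper: append the scsi_mod options line to a
# modules.conf-style file unless an "options scsi_mod" line already exists.
add_scsi_mod_options() {
    conf=$1
    line='options scsi_mod max_scsi_luns=128 scsi_allow_ghost_devices=1'
    if grep -q '^options[[:space:]][[:space:]]*scsi_mod' "$conf"; then
        echo "scsi_mod options already set in $conf"
    else
        echo "$line" >> "$conf"
        echo "added scsi_mod options to $conf"
    fi
}

# Intended use (not run here), after backing up the file:
#   add_scsi_mod_options /etc/modules.conf
```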

Next we have to build a new initrd image. This is done with the “mkinitrd” command.

WARNING: MAKE DARN SURE you build this against the right kernel (the kernel you want to use). If you are going to replace your current initrd image with the new one, you should make a back-up copy first. The -f option will force or overwrite the current initrd image file.

cp /boot/initrd-2.4.21-47.ELsmp.img /boot/initrd-2.4.21-47.ELsmp.img.bak
mkinitrd -f -v /boot/initrd-2.4.21-47.ELsmp.img 2.4.21-47.ELsmp
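
One way to avoid fat-fingering the kernel version is to take it from “uname -r” when you really do mean the running kernel. The sketch below only prints the two commands so you can look them over before running anything. The function name initrd_commands is my own, and it assumes the image follows the /boot/initrd-VERSION.img naming used above.

```shell
#!/bin/sh
# Hypothetical helper: print the backup and mkinitrd commands for a kernel
# version, defaulting to the running kernel. Nothing is executed.
initrd_commands() {
    kver=${1:-$(uname -r)}
    img=/boot/initrd-$kver.img
    echo "cp $img $img.bak"
    echo "mkinitrd -f -v $img $kver"
}

# Show what would be run for the kernel used in this post.
initrd_commands 2.4.21-47.ELsmp
```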

Once this is done, you can reboot your machine and check “/proc/scsi/scsi” to confirm that it sees the second Lun. You should see something like this:

Host: scsi2 Channel: 00 Id: 00 Lun: 00
Vendor: Adaptec Model: DuraStor 6200S Rev: V100
Type: Processor ANSI SCSI revision: 03

Host: scsi2 Channel: 00 Id: 00 Lun: 01
Vendor: Adaptec Model: DuraStor 6200S Rev: V100
Type: Direct-Access ANSI SCSI revision: 03
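
For a quick check without reading the whole file, something like the sketch below lists one line per device. The function name list_luns is made up, and it assumes the “Host: … Channel: … Id: … Lun: …” layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: print "host channel id lun" for every device in a
# /proc/scsi/scsi-style file (defaults to /proc/scsi/scsi).
list_luns() {
    awk '$1 == "Host:" { print $2, $4, $6, $8 }' "${1:-/proc/scsi/scsi}"
}

# Intended use on the server itself (not run here):
#   list_luns
```

With the listing above, that would give one line ending in 00 for the controller and one ending in 01 for the array.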

Hat Tip: Alan Baker for help figuring this out.
UPDATE: RHEL 4 does not have this problem.

Why Modern RAID 5 is Ideal for Oracle Databases

There is a school of thought among Oracle DBAs that databases should never be installed on disks that are configured into a RAID 5 array. The argument goes that since Oracle reads and writes at random points within relatively large files, the overhead of constantly calculating block-level parity on these files is substantial, resulting in serious performance degradation. They suggest that RAID 1 (mirroring) is the ideal disk configuration since no parity needs to be calculated, and Oracle is more than happy to divide up its database over many smaller mount points.

This way of thinking has largely been correct over the years because most systems have traditionally used software RAID. This means that the CPU of the server itself had the job of doing all those parity calculations, and it really did slow down both the server and the disk when RAID 5 configurations were used. Oracle, in particular, had a hard time with these configurations for the exact reasons the DBAs point to.

In many cases, software RAID is still used, and to be sure, it is wholly inappropriate to deploy RAID 5 in these environments. However, it is increasingly common to find IT departments using a SAN-type architecture where the RAID type and configuration are invisible to the host operating system. In these environments, the disk array has a dedicated controller whose only job is handling read, write, and parity operations. The RAID controller is no longer software running on a generic CPU, but rather firmware that is optimized to handle parity calculations. This results in a system where parity is calculated so quickly by the dedicated controller that differences in speed between RAID 1 and RAID 5 should be virtually nonexistent.

To prove this, I carved up our new InfoTrend EonStor A12F-G2221 into three arrays – a RAID 5, a RAID 1, and a RAID 10. I then set out to run some benchmarks on these different arrays to see what, if any, the differences would be.

The hardware used was as follows:

  • The RAID 5 LUN consisted of 4 drives
  • The RAID 1 LUN consisted of 2 drives
  • The RAID 10 LUN consisted of 4 drives

I then identified the iozone tests that most accurately simulated Oracle disk activity. What I really wanted to do was to simulate select and update queries on various sized files and see how the different RAID types held up under the load. To do this, I ran iozone, a well-respected benchmark utility, with the following arguments:

/opt/iozone/bin/iozone -Ra -g 2G -b /home/sysop/new/raid5-2G-1.wks

This put the disk through its paces, running the iozone tests in automatic mode with file sizes up to 2 GB. In the end, though, I was interested in analyzing the following tests because they were the ones our DBA team suggested would most closely represent database activity.

Random Read (select queries)

This test measures the performance of reading a file with accesses being made to random locations within the file. The performance of a system under this type of activity can be impacted by several factors such as: Size of operating system’s cache, number of disks, seek latencies, and others.

Random Write (update queries)

This test measures the performance of writing a file with accesses being made to random locations within the file. Again the performance of a system under this type of activity can be impacted by several factors such as: Size of operating system’s cache, number of disks, seek latencies, and others.

Strided Read (more complex select queries)

This test measures the performance of reading a file with a strided access behavior. An example would be: Read at offset zero for a length of 4 Kbytes, then seek 200 Kbytes, and then read for a length of 4 Kbytes, then seek 200 Kbytes and so on. Here the pattern is to read 4 Kbytes and then seek 200 Kbytes, over and over.
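
To make the pattern concrete, here is a tiny sketch that prints where each read would start if you take the description literally (read 4 Kbytes, skip 200 Kbytes, repeat). The function name strided_offsets is made up, and this only illustrates the access pattern as described, not how iozone computes its stride internally.

```shell
#!/bin/sh
# Hypothetical helper: print the starting offset, in Kbytes, of the first
# N reads when each read of READ_KB is followed by a seek of SEEK_KB.
strided_offsets() {
    count=$1 read_kb=${2:-4} seek_kb=${3:-200}
    offset=0 i=0
    while [ "$i" -lt "$count" ]; do
        echo "$offset"
        offset=$((offset + read_kb + seek_kb))
        i=$((i + 1))
    done
}

# First three reads with the 4 Kbyte / 200 Kbyte example from the text.
strided_offsets 3    # prints 0, 204, 408 (one per line)
```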

I ran several instances of the same tests using the same command line to ensure that there were no anomalies, and the machine was doing nothing else during the tests besides running the host OS. The results were pretty much as I expected, and I found little to no variation between the raid types on this disk subsystem.

Random Read Tests:
Raid5-random-read.jpg
Raid1-random-read.jpg
Raid10-random-read.jpg

In this test, there seems to be the slightest advantage to the mirror-type RAID arrays when it comes to very small files. This, I suspect, can be attributed to actual drive head latency since, in RAID 5 volumes, the correct block needs to be found on a larger number of disks. This advantage quickly falls off, however, as the file size grows, meaning that it would not be seen in an Oracle database.

Random Write Tests:
Raid5-random-write.jpg
Raid1-random-write.jpg
Raid10-random-write.jpg

In this test, both RAID 5 and RAID 10 seem to hold a slight advantage over the direct mirror. This, I would imagine, can be attributed to the fact that the writes are happening over a larger number of spindles. This indicates that the controller is calculating the parity faster than the 2Gb connection speed to the disk subsystem. Again, the variation is incredibly small, so there is no arguable performance advantage to using one type of RAID over another when using a hardware controller.

Stride Read Tests:
Raid5-strided-read.jpg
Raid1-stride-read.jpg
Raid10-stride-read.jpg

Here again we see no real advantage to one RAID type over any other. It could be said that the RAID 10 volume held up ever-so-slightly better on this test, but any edge is so slight that it would be hard to imagine how this could translate into a noticeable performance gain in an Oracle database.

In the end, these tests confirmed my suspicion that hardware RAID controllers have become so efficient and fast that it no longer makes any real difference what type of RAID you decide to use for your Oracle database. Largely gone are the days when your disk space and RAID volumes were inextricably tied to the server itself. So long as you are using hardware RAID, and the LUNs are abstracted from your operating system, you can largely feel free to make the most of your storage dollar by using RAID 5 in your production database environments.
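
The storage-dollar argument comes down to how many drives’ worth of space you actually get to use. Here is a back-of-the-envelope sketch for the three layouts I tested. The function name usable_drives is made up, and it assumes single-parity RAID 5 and two-way mirrors for RAID 1 and RAID 10.

```shell
#!/bin/sh
# Hypothetical helper: usable capacity, counted in drives, for a RAID level
# and a number of member drives. Assumes single parity and two-way mirrors.
usable_drives() {
    level=$1 drives=$2
    case $level in
        5)  echo $((drives - 1)) ;;   # one drive's worth goes to parity
        1)  echo 1 ;;                 # every drive holds the same data
        10) echo $((drives / 2)) ;;   # mirrored pairs, striped together
        *)  echo "unknown RAID level: $level" >&2; return 1 ;;
    esac
}

usable_drives 5 4     # 4-drive RAID 5  -> 3
usable_drives 1 2     # 2-drive RAID 1  -> 1
usable_drives 10 4    # 4-drive RAID 10 -> 2
```

With the same four drives, RAID 5 leaves three drives of usable space where RAID 10 leaves two.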

RHEL 3 Direct Connect to EonStor A12F-G2221

This summer we have been migrating a bunch of data to our shiny new InfoTrend EonStor A12F-G2221. With 1 GB of battery-backed cache, it’s a screaming box of disk, and it looks cool to boot. There is a gotcha, though, if you want to direct connect it to a QLogic QLA2340 card on a RHEL 3 server. Here is what you have to do.

First, get the new driver from QLogic, or install the one that came on CD with the HBA. The one that Red Hat packages is always old and useless, and the one that QLogic provides is better anyway because the installer rebuilds the initrd image for you. Once you get the package, just “cd” into the “qlafc-linux-X.XX.XX-X-install” directory and run “qlinstall”. This will install it all for you, so let it do its thing, and reboot the system when it’s done.

Now, go into the management console for your EonStor A12F-G2221. For the most part, the system defaults should work, but InfoTrend sets the default Fibre Connection to “Loop Only”. This is fine if you are dealing with a SAN, but since we are trying to do a direct connect, we have to change it to either “Auto” or “Direct Connect”. I suggest “Auto”, since that way you can have the other port connected to a loop if you want.


That should be all you have to do. You will have to reboot the controller for the change to take effect, so make sure you do this during a scheduled downtime if you have the disk in production.

Changing Linux Mount Points

If you’re familiar with UNIX, you know that changing mount points is really pretty easy. All you have to do is go into “/etc/fstab”, “/etc/vfstab” (or whatever your flavor of UNIX happens to call its filesystem table) and change the mount directory.

If, for instance, you had a Solaris box, and you wanted to make the disk currently mounted as “/data” be mounted as “/database”, all you would have to do is the following:

# umount /data
# mv /data /database
Change this line in “/etc/vfstab” from something like this:
/dev/dsk/c1d0s6 /dev/rdsk/c1d0s6 /data ufs 1 yes -
to something like this:
/dev/dsk/c1d0s6 /dev/rdsk/c1d0s6 /database ufs 1 yes -
and remount it as “/database”.
# mount /database

With Linux, however, it’s not quite so clear anymore… It’s still easy, but it’s just not so clear what you have to do since they have now taken to mounting filesystems using the volume label. Rather than pointing directly to the disk device, Linux points to the label, and “/etc/fstab” looks more like this:

LABEL=/data /data ext3 defaults 1 2

You can always simply change the disk label, but if you don’t care about labels, you can just tell Linux where the device is, bypassing the need to worry about the label. The easiest way to do this is to replace the “LABEL=/data” value with the “/dev” entry of the disk itself. Then, simply change “/data” to “/database” and you’re all set.

Here is an example of what you would do to change the mount point of “/data” to “/database”:

# umount /data
# mv /data /database
Change this line in “/etc/fstab” from this:
LABEL=/data /data ext3 defaults 1 2
to this:
/dev/sda6 /database ext3 defaults 1 2
and remount it as /database
# mount /database

Remember to replace the example values here with those required for your situation.
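
If you would rather not hand-edit the file, the same change can be sketched with awk. The function name retarget_fstab is made up, and it writes the edited table to standard output instead of touching the file, so you can review the result before copying it into place. Note that awk collapses the whitespace on the line it rewrites.

```shell
#!/bin/sh
# Hypothetical helper: print an fstab-style file with the entry for one
# mount point changed to a given device and a new mount point.
# Usage: retarget_fstab FILE OLD_MOUNT_POINT DEVICE NEW_MOUNT_POINT
retarget_fstab() {
    fstab=$1 old_mp=$2 device=$3 new_mp=$4
    awk -v old="$old_mp" -v dev="$device" -v new="$new_mp" '
        # Skip comment lines; rewrite fields 1 and 2 of the matching entry.
        $2 == old && $1 !~ /^#/ { $1 = dev; $2 = new }
        { print }
    ' "$fstab"
}

# Intended use (not run here):
#   retarget_fstab /etc/fstab /data /dev/sda6 /database > /tmp/fstab.new
```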

MIT Guide to Lock Picking – Appendix A

This appendix describes the design and construction of lock picking tools.

A.1 Pick Shapes

Picks come in several shapes and sizes. Figure A.1 shows the most common shapes. The handle and tang of a pick are the same for all picks. The handle must be comfortable and the tang must be thin enough to avoid bumping pins unnecessarily. If the tang is too thin, then it will act like a spring and you will lose the feel of the tip interacting with the pins. The shape of the tip determines how easily the pick passes over the pins and what kind of feedback you get from each pin.

The design of a tip is a compromise between ease of insertion, ease of withdrawal and feel of the interaction. The half diamond tip with shallow angles is easy to insert and remove, so you can apply pressure when the pick is moving in either direction. It can quickly pick a lock that has little variation in the lengths of the key pins. If the lock requires a key that has a deep cut between two shallow cuts, the pick may not be able to push the middle pin down far enough. The half diamond pick with steep angles could deal with such a lock, and in general steep angles give you better feedback about the pins. Unfortunately, the steep angles make it harder to move the pick in the lock. A tip that has a shallow front angle and a steep back angle works well for Yale locks.

The half round tip works well in disk tumbler locks. See section 9.13. The full diamond and full round tips are useful for locks that have pins at the top and bottom of the keyway. The rake tip is designed for picking pins one by one. It can also be used to rake over the pins, but the pressure can only be applied as the pick is withdrawn. The rake tip allows you to carefully feel each pin and apply varying amounts of pressure. Some rake tips are flat or dented on the top to make it easier to align the pick on the pin. The primary benefit of picking pins one at a time is that you avoid scratching the pins. Scrubbing scratches the tips of the pins and the keyway, and it spreads metal dust throughout the lock. If you want to avoid leaving traces, you must avoid scrubbing.

The snake tip can be used for scrubbing or picking. When scrubbing, the multiple bumps generate more action than a regular pick. The snake tip is particularly good at opening five pin household locks. When a snake tip is used for picking, it can set two or three pins at once. Basically, the snake pick acts like a segment of a key which can be adjusted by lifting and lowering the tip, by tilting it back and forth, and by using either the top or bottom of the tip. You should use moderate to heavy torque with a snake pick to allow several pins to bind at the same time. This style of picking is faster than using a rake and it leaves as little evidence.

A.2 Street cleaner bristles

The spring steel bristles used on street cleaners make excellent tools for lock picking. The bristles have the right thickness and width, and they are easy to grind into the desired shape. The resulting tools are springy and strong. Section A.3 describes how to make tools that are less springy.

The first step in making tools is to sand off any rust on the bristles. Coarse grit sandpaper works fine, as does a steel wool cleaning pad (not copper wool). If the edges or tip of the bristle are worn down, use a file to make them square.

A torque wrench has a head and a handle as shown in figure A.2. The head is usually 1/2 to 3/4 of an inch long and the handle varies from 2 to 4 inches long. The head and the handle are separated by a bend that is about 80 degrees. The head must be long enough to reach over any protrusions (such as a grip-proof collar) and firmly engage the plug. A long handle allows delicate control over the torque, but if it is too long, it will bump against the doorframe. The handle, head and bend angle can be made quite small if you want to make tools that are easy to conceal (e.g., in a pen, flashlight, or belt buckle). Some torque wrenches have a 90 degree twist in the handle. The twist makes it easy to control the torque by controlling how far the handle has been deflected from its rest position. The handle acts as a spring which sets the torque. The disadvantage of this method of setting the torque is that you get less feedback about the rotation of the plug. To pick difficult locks you will need to learn how to apply a steady torque via a stiff handled torque wrench.

The width of the head of a torque wrench determines how well it will fit the keyway. Locks with narrow keyways (e.g., desk locks) need torque wrenches with narrow heads. Before bending the bristle, file the head to the desired width. A general purpose wrench can be made by narrowing the tip (about 1/4 inch) of the head. The tip fits small keyways while the rest of the head is wide enough to grab a normal keyway.

The hard part of making a torque wrench is bending the bristle without cracking it. To make the 90 degree handle twist, clamp the head of the bristle (about one inch) in a vise and use pliers to grasp the bristle about 3/8 of an inch above the vise. You can use another pair of pliers instead of a vise. Apply a 45 degree twist. Try to keep the axis of the twist lined up with the axis of the bristle. Now move the pliers back another 3/8 inch and apply the remaining 45 degrees. You will need to twist the bristle more than 90 degrees in order to set a permanent 90 degree twist.


Figure A.1: Selection of pick shapes


To make the 80 degree head bend, lift the bristle out of the vise by about 1/4 inch (so 3/4 inch is still in the vise). Place the shank of a screw driver against the bristle and bend the spring steel around it about 90 degrees. This should set a permanent 80 degree bend in the metal. Try to keep the axis of the bend perpendicular to the handle. The screwdriver shank ensures that the radius of curvature will not be too small. Any rounded object will work (e.g., drill bit, needle nose pliers, or a pen cap). If you have trouble with this method, try grasping the bristle with two pliers separated by about 1/2 inch and bend. This method produces a gentle curve that won’t break the bristle.

A grinding wheel will greatly speed the job of making a pick. It takes a bit of practice to learn how to make smooth cuts with a grinding wheel, but it takes less time to practice and make two or three picks than it does to hand file a single pick. The first step is to cut the front angle of the pick. Use the front of the wheel to do this. Hold the bristle at 45 degrees to the wheel and move the bristle side to side as you grind away the metal. Grind slowly to avoid overheating the metal, which makes it brittle. If the metal changes color (to dark blue), you have overheated it, and you should grind away the colored portion. Next, cut the back angle of the tip using the corner of the wheel. Usually one corner is sharper than the other, and you should use that one. Hold the pick at the desired angle and slowly push it into the corner of the wheel. The side of the stone should cut the back angle. Be sure that the tip of the pick is supported. If the grinding wheel stage is not close enough to the wheel to support the tip, use needle nose pliers to hold the tip. The cut should pass through about 2/3 of the width of the bristle. If the tip came out well, continue. Otherwise break it off and try again. You can break the bristle by clamping it into a vise and bending it sharply.

The corner of the wheel is also used to grind the tang of the pick. Put a scratch mark to indicate how far back the tang should go. The tang should be long enough to allow the tip to pass over the back pin of a seven pin lock. Cut the tang by making several smooth passes over the corner. Each pass starts at the tip and moves to the scratch mark. Try to remove less than 1/16th of an inch of metal with each pass. I use two fingers to hold the bristle on the stage at the proper angle while my other hand pushes the handle of the pick to move the tang along the corner. Use whatever technique works best for you.

Use a hand file to finish the pick. It should feel smooth if you run a finger nail over it. Any roughness will add noise to the feedback you want to get from the lock.

The outer sheath of phone cable can be used as a handle for the pick. Remove three or four of the wires from a length of cable and push it over the pick. If the sheath won’t stay in place, you can put some epoxy on the handle before pushing the sheath over it.

A.3 Bicycle spokes

An alternative to making tools out of street cleaner bristles is to make them out of nails and bicycle spokes. These materials are easily accessible and when they are heat treated, they will be stronger than tools made from bristles.


Figure A.2: Torque wrenches


A strong torque wrench can be constructed from an 8-penny nail (about .1 inch diameter). First heat up the point with a propane torch until it glows red, slowly remove it from the flame, and let it air cool; this softens it. The burner of a gas stove can be used instead of a torch. Grind it down into the shape of a skinny screwdriver blade and bend it to about 80 degrees. The bend should be less than a right angle because some lock faces are recessed behind a plate (called an escutcheon) and you want the head of the wrench to be able to reach about half an inch into the plug. Temper (harden) the torque wrench by heating to bright orange and dunking it into ice water. You will wind up with a virtually indestructible bent screwdriver that will last for years under brutal use.

Bicycle spokes make excellent picks. Bend one to the shape you want and file the sides of the business end flat such that it’s strong in the vertical and flexy in the horizontal direction. Try a right-angle hunk about an inch long for a handle. For smaller picks, which you need for those really tiny keyways, find any large-diameter spring and unbend it. If you’re careful you don’t have to play any metallurgical games.

A.4 Brick Strap

For perfectly serviceable key blanks that you can’t otherwise find at the store, use the metal strap they wrap around bricks for shipping. It’s wonderfully handy stuff for just about anything you want to manufacture. To get around side wards in the keyway, you can bend the strap lengthwise by clamping it in a vise and tapping on the protruding part to bend the piece to the required angle.

Brick strap is very hard. It can ruin a grinding wheel or key cutting machine. A hand file is the recommended tool for milling brick strap.


What The Heck is RAID 10?

Earlier this month, a company came along and asked for a RAID 10 array. Understanding that RAID 10 is a cooler sounding way of saying RAID 1+0, I understood it as a mirror set that is striped across another mirror set. Simple enough… Just concatenate a couple of mirrors, and you’ve got RAID 10.

Indeed, RAID 10 is simply two or more RAID 1 arrays (mirrored sets) striped together (RAID 0).

RAID 1 creates an exact copy (or mirror) of all data on two or more disks, while RAID 0 splits data evenly across two or more disks with no parity information for redundancy. By combining the two into a RAID 10 array, you are able to take advantage of the faster write speed offered by RAID 0, while protecting your data against drive failures with mirroring.
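
To picture how the two layers fit together, here is a toy sketch of where a given stripe unit would land. It is only an illustration: raid10_place is a made-up name, the disks are simply numbered 0, 1, 2 and so on, each mirror is assumed to be a two-disk pair, and real controllers are free to lay things out differently.

```shell
#!/bin/sh
# Hypothetical helper: for a RAID 10 built from MIRRORS two-disk mirrors,
# print which mirror a stripe unit lands on and which two disks hold it.
raid10_place() {
    unit=$1 mirrors=$2
    m=$((unit % mirrors))    # RAID 0 layer: round-robin across the mirrors
    # RAID 1 layer: both disks of the chosen pair get the same data.
    echo "unit $unit -> mirror $m (disks $((2 * m)) and $((2 * m + 1)))"
}

raid10_place 4 2    # unit 4 -> mirror 0 (disks 0 and 1)
raid10_place 5 2    # unit 5 -> mirror 1 (disks 2 and 3)
```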

This method of RAID is pretty costly, but useful if you find yourself in a situation where you need a lot of throughput combined with a lot of data protection.