Making RHEL 3 See Multiple LUNs

For some reason, RHEL 3 comes out of the box configured to see only the first LUN on a SCSI channel. This is usually not a problem, since the first LUN is generally all you care about, but in some instances you will need to configure the SCSI module to see multiple LUNs.

In this case we are using an Adaptec DuraStor 6200S, which is set up to present the RAID controller as LUN 00 and the actual RAID array as LUN 01. Without any modifications to the system, we plug it in and, after a reboot, check /proc/scsi/scsi. We can see the RAID controller, but since we can only see the first LUN on the channel, we never get to the array:

Host: scsi2 Channel: 00 Id: 00 Lun: 00
Vendor: Adaptec Model: DuraStor 6200S Rev: V100
Type: Processor ANSI SCSI revision: 03
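
For reference, that listing is nothing more than the contents of the kernel's SCSI device table, dumped with cat:

cat /proc/scsi/scsi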

The actual array would show up as “Channel: 00 Id: 00 Lun: 01”, but it’s not there. To resolve this, we first have to edit “/etc/modules.conf” and add the following line:

options scsi_mod max_scsi_luns=128 scsi_allow_ghost_devices=1

In our case, modules.conf looks like this after the modification:

alias eth0 e1000
alias eth1 e1000
alias scsi_hostadapter megaraid2
alias usb-controller usb-uhci
alias usb-controller1 ehci-hcd
alias scsi_hostadapter1 aic7xxx
options scsi_mod max_scsi_luns=128 scsi_allow_ghost_devices=1
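
If you would rather script that edit than open the file by hand, appending the line works just as well. The following is only a sketch, assuming the stock RHEL 3 location of modules.conf; back the file up first:

cp /etc/modules.conf /etc/modules.conf.bak
echo 'options scsi_mod max_scsi_luns=128 scsi_allow_ghost_devices=1' >> /etc/modules.conf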

Next we have to build a new initrd image. This is done with the “mkinitrd” command.

WARNING: MAKE DARN SURE you build this against the right kernel (the kernel you actually want to use). If you are going to replace your current initrd image with the new one, make a backup copy first. The -f option forces mkinitrd to overwrite the existing initrd image file.

cp /boot/initrd-2.4.21-47.ELsmp.img /boot/initrd-2.4.21-47.ELsmp.img.bak
mkinitrd -f -v /boot/initrd-2.4.21-47.ELsmp.img 2.4.21-47.ELsmp
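
A quick sanity check is worth the few seconds here: make sure the version string you hand to mkinitrd matches the kernel you actually boot, and that grub is pointing at the image you just rebuilt (the grub.conf path below is the usual RHEL 3 location):

uname -r
grep initrd /boot/grub/grub.conf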

Once this is done, you can reboot the machine and check “/proc/scsi/scsi” to confirm that it now sees the second LUN. You should see something like this:

Host: scsi2 Channel: 00 Id: 00 Lun: 00
Vendor: Adaptec Model: DuraStor 6200S Rev: V100
Type: Processor ANSI SCSI revision: 03

Host: scsi2 Channel: 00 Id: 00 Lun: 01
Vendor: Adaptec Model: DuraStor 6200S Rev: V100
Type: Direct-Access ANSI SCSI revision: 03
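
At this point the array should also have been assigned a block device node. Exactly which one depends on what other disks are in the box, so /dev/sdb below is only an assumption; dmesg should show what the kernel actually attached (exact messages vary by driver):

dmesg | grep -i "attached scsi"
fdisk -l /dev/sdb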

Hat Tip: Alan Baker for help figuring this out.
UPDATE: RHEL 4 does not have this problem.

Why Modern RAID 5 is Ideal for Oracle Databases

There is a school of thought among Oracle DBAs that databases should never be installed on disks configured into a RAID 5 array. The argument goes that since Oracle reads from and writes to random points within relatively large files, the overhead of constantly calculating block-level parity on those files is substantial and results in serious performance degradation. They suggest that RAID 1 (mirroring) is the ideal disk configuration, since no parity needs to be calculated and Oracle is more than happy to divide its database across many smaller mount points.
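
For anyone who has not looked at the mechanics, the parity in question is just an XOR across the data blocks in each stripe, which is both what the controller has to recompute on every write and what lets it rebuild a lost block. A toy illustration in shell arithmetic (the byte values are made up):

# parity is the XOR of the data blocks in a stripe
d1=0xA5; d2=0x3C; d3=0x0F
parity=$(( d1 ^ d2 ^ d3 ))
printf 'parity     = 0x%02X\n' "$parity"
# "lose" d2 and rebuild it from the surviving blocks plus the parity
printf 'rebuilt d2 = 0x%02X\n' $(( d1 ^ d3 ^ parity ))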

This way of thinking has largely been correct over the years, because most systems have traditionally used software RAID. That meant the server's own CPU had the job of doing all of those parity calculations, and it really did slow down both the server and the disks when RAID 5 configurations were used. Oracle, in particular, had a hard time with these configurations for exactly the reasons the DBAs point to.

In many cases software RAID is still used, and to be sure, it remains wholly inappropriate to deploy RAID 5 in those environments. However, it is increasingly common to find IT departments using a SAN-type architecture where the RAID type and configuration are invisible to the host operating system. In these environments, the disk array has a dedicated controller that is singly tasked with handling all read, write, and parity operations. The RAID controller is no longer software running on a general-purpose CPU, but firmware optimized to handle parity calculations. The result is a system where parity is calculated so quickly by the dedicated controller that the difference in speed between RAID 1 and RAID 5 should be virtually nonexistent.

To prove this, I carved up our new InfoTrend EonStor A12F-G2221 into three arrays: a RAID 5, a RAID 1, and a RAID 10. I then set out to run some benchmarks on the different arrays to see what differences, if any, there would be.

The hardware used was as follows:

  • The RAID 5 LUN consisted of 4 drives
  • The RAID 1 LUN consisted of 2 drives
  • The RAID 10 LUN consisted of 4 drives

I then identified the iozone tests that most accurately simulate Oracle disk activity. What I really wanted to do was simulate select and update queries against files of various sizes and see how the different RAID types held up under the load. To do this, I ran iozone, a well-respected benchmarking utility, with the following arguments:

/opt/iozone/bin/iozone -Ra -g 2G -b /home/sysop/new/raid5-2G-1.wks

This put the disk through its paces, running the iozone tests in automatic mode on file sizes up to 2 GB. In the end, though, I was interested in analyzing the following tests, because they were the ones our DBA team suggested would most closely represent database activity.
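
Since the same run had to be repeated against each array, a small wrapper loop saves some typing. The mount points below are assumptions on my part; iozone works in the current directory unless told otherwise, so the cd is what points each run at the right array:

# run the identical automatic-mode test against each array
for vol in raid5 raid1 raid10; do
    cd /mnt/$vol && /opt/iozone/bin/iozone -Ra -g 2G -b /home/sysop/new/${vol}-2G-1.wks
done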

Random Read (select queries)

This test measures the performance of reading a file with accesses made to random locations within it. Performance under this kind of activity is affected by several factors, such as the size of the operating system’s cache, the number of disks, and seek latencies.

Random Write (update queries)

This test measures the performance of writing to a file with accesses made to random locations within it. Again, performance is affected by the same factors: the size of the operating system’s cache, the number of disks, seek latencies, and so on.
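
If you only care about these two access patterns, iozone can be limited to them with its -i test selectors; test 0 (write) has to be included so the file exists to be read back. The file path and record size below are just placeholders:

/opt/iozone/bin/iozone -i 0 -i 2 -s 2g -r 16k -f /mnt/raid5/iozone.tmp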

Strided Read (more complex select queries)

This test measures the performance of reading a file with a strided access pattern. For example: read 4 KB at offset zero, seek 200 KB, read another 4 KB, seek another 200 KB, and so on, repeating the pattern through the file.
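
The strided pattern can be isolated the same way; iozone’s -j flag sets the stride as a multiple of the record size, so -r 4k -j 50 gives the 4 KB read / 200 KB seek pattern described above (the file path is again a placeholder):

/opt/iozone/bin/iozone -i 0 -i 5 -s 2g -r 4k -j 50 -f /mnt/raid5/iozone.tmp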

I ran several instances of the same tests, using the same command line, to make sure there were no anomalies, and the machine was doing nothing else during the runs besides running the host OS. The results were pretty much what I expected: little to no variation between the RAID types on this disk subsystem.

Random Read Tests:
[Charts: random read results for RAID 5, RAID 1, and RAID 10]

In this test there seems to be the slightest advantage to the mirror-type arrays on very small files. I suspect this can be attributed to drive head latency, since in a RAID 5 volume the correct block has to be located across a larger number of disks. The advantage quickly falls off as the file size grows, however, which means it would not be seen in an Oracle database.

Random Write Tests:
[Charts: random write results for RAID 5, RAID 1, and RAID 10]

In this test, both RAID 5 and RAID 10 seem to hold a slight advantage over the straight mirror, which I would attribute to the writes being spread across a larger number of spindles. It also suggests that the controller is calculating parity faster than the 2 Gb/s connection to the disk subsystem can deliver data. Again, the variation is incredibly small, so there is no arguable performance advantage to one type of RAID over another when a hardware controller is doing the work.

Strided Read Tests:
[Charts: strided read results for RAID 5, RAID 1, and RAID 10]

Here again we see no real advantage of one RAID type over another. You could say that the RAID 10 volume held up ever so slightly better on this test, but any edge is so slight that it is hard to imagine it translating into a noticeable performance gain in an Oracle database.

In the end, these tests confirmed my suspicion that hardware RAID controllers have become so efficient and fast that it no longer makes any real difference which type of RAID you use for your Oracle database. Largely gone are the days when your disk space and RAID volumes were inexorably tied to the server itself. As long as you are using hardware RAID and the LUNs are abstracted from your operating system, you can feel free to make the most of your storage dollar by using RAID 5 in your production database environments.