Why Modern RAID 5 is Ideal for Oracle Databases

There is a school of thought among Oracle DBAs that databases should never be installed on disks configured into a RAID 5 array. The argument goes that since Oracle reads and writes at random points within relatively large files, the overhead of constantly calculating block-level parity for those files is substantial, resulting in serious performance degradation. They suggest that RAID 1 (mirroring) is the ideal disk configuration, since no parity needs to be calculated and Oracle is more than happy to divide its database over many smaller mount points.

This way of thinking has largely been correct over the years because most systems have traditionally used software RAID. That meant the server's own CPU had the job of doing all those parity calculations, and it really did slow down both the server and the disks when RAID 5 configurations were used. Oracle, in particular, had a hard time with these configurations for exactly the reasons the DBAs point to.

In many cases software RAID is still used, and to be sure, it is wholly inappropriate to deploy RAID 5 in those environments. However, it is increasingly common to find IT departments using a SAN-type architecture in which the RAID type and configuration are invisible to the host operating system. In these environments, the disk array has a dedicated controller that is singly tasked with handling all read, write, and parity operations. The RAID controller is no longer software running on a generic CPU, but rather firmware optimized for parity calculations. The result is a system where parity is calculated so quickly by the dedicated controller that the difference in speed between RAID 1 and RAID 5 should be virtually nonexistent.

To prove this, I carved up our new InfoTrend EonStor A12F-G2221 into three arrays – a RAID 5, a RAID 1, and a RAID 10 – and then ran some benchmarks on these different arrays to see what differences, if any, there would be.

The test LUNs were configured as follows:

  • The RAID 5 LUN consisted of 4 drives
  • The RAID 1 LUN consisted of 2 drives
  • The RAID 10 LUN consisted of 4 drives

I then identified the iozone tests that most closely simulate Oracle disk activity. What I really wanted to do was simulate select and update queries against files of various sizes and see how the different RAID types held up under the load. To do this, I ran iozone, a well-respected benchmark utility, with the following arguments:

/opt/iozone/bin/iozone -Ra -g 2G -b /home/sysop/new/raid5-2G-1.wks

This put the disk through its paces, running the iozone tests in automatic mode on file sizes up to 2 GB. In the end, though, I was interested in analyzing the following tests, because they were the ones our DBA team suggested would most closely represent database activity.
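For reference, here is the same command line with each flag annotated as a shell comment; the workbook path is simply where I chose to collect the results for the RAID 5 run, so adjust it for your own environment:

# -R  produce an Excel-compatible report
# -a  run the full automatic test suite, varying both file size and record size
# -g  cap the maximum file size used in automatic mode (2 GB here)
# -b  write the results to the named Excel-compatible workbook file
/opt/iozone/bin/iozone -Ra -g 2G -b /home/sysop/new/raid5-2G-1.wks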

Random Read (select queries)

This test measures the performance of reading a file with accesses made to random locations within the file. Performance under this type of activity can be affected by several factors, such as the size of the operating system's cache, the number of disks, and seek latencies.

Random Write (update queries)

This test measures the performance of writing a file with accesses made to random locations within the file. Again, performance under this type of activity can be affected by several factors, such as the size of the operating system's cache, the number of disks, and seek latencies.

Strided Read (more complex select queries)

This test measures the performance of reading a file with a strided access pattern. For example: read 4 Kbytes at offset zero, seek ahead 200 Kbytes, read another 4 Kbytes, seek 200 Kbytes, and so on. Here the pattern is to read 4 Kbytes and then seek 200 Kbytes, repeating until the end of the file is reached.
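Just to illustrate that access pattern (this is not how iozone itself runs the test), a strided read can be roughly approximated from the shell with dd against a hypothetical test file such as /data/testfile:

# Read one 4 KB record, skip ahead roughly 200 KB, and repeat to end of file.
# /data/testfile is a made-up path used purely for illustration.
FILE=/data/testfile
BLOCKS=$(( $(stat -c %s "$FILE") / 4096 ))    # file size expressed in 4 KB blocks
i=0
while [ $i -lt $BLOCKS ]; do
    dd if="$FILE" of=/dev/null bs=4k count=1 skip=$i 2>/dev/null
    i=$((i + 51))                             # 51 x 4 KB = 204 KB stride
done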

I ran several instances of the same tests using the same command line to ensure that there were no anomalies, and the machine was doing nothing else during the tests besides running the host OS. The results were pretty much as I expected: I found little to no variation between the RAID types on this disk subsystem.

Random Read Tests:
[Charts: Raid5-random-read.jpg, Raid1-random-read.jpg, Raid10-random-read.jpg]

In this test, there seems to be the slightest advantage to the mirror-type RAID arrays when it comes to very small files. This, I suspect, can be attributed to actual drive-head latency, since in a RAID 5 volume the correct block has to be located across a larger number of disks. This advantage quickly falls off as the file size grows, however, meaning that it would not be seen in an Oracle database.

Random Write Tests:
[Charts: Raid5-random-write.jpg, Raid1-random-write.jpg, Raid10-random-write.jpg]

In this test, both RAID 5 and RAID 10 seem to hold a slight advantage over the plain mirror. This, I would imagine, can be attributed to the fact that the writes are spread over a larger number of spindles, and it indicates that the controller is calculating parity faster than the 2 Gb/s connection to the disk subsystem can deliver data. Again, the variation is incredibly small, so there is no arguable performance advantage to using one type of RAID over another when using a hardware controller.

Strided Read Tests:
[Charts: Raid5-strided-read.jpg, Raid1-stride-read.jpg, Raid10-strided-read.jpg]

Here again we see no real advantage of one RAID type over any other. It could be said that the RAID 10 volume held up ever so slightly better on this test, but any edge is so slight that it is hard to imagine how it could translate into a noticeable performance gain in an Oracle database.

In the end, these tests confirmed my suspicion that hardware RAID controllers have become so efficient and fast that it no longer makes any real difference which type of RAID you use for your Oracle database. Largely gone are the days when your disk space and RAID volumes were inexorably tied to the server itself. As long as you are using hardware RAID, and the LUNs are abstracted from your operating system, you can feel free to make the most of your storage dollar by using RAID 5 in your production database environments.

Changing Linux Mount Points

If you're familiar with UNIX, you know that changing mount points is really pretty easy. All you have to do is edit "/etc/fstab", "/etc/vfstab", or whatever your flavor of UNIX happens to call its filesystem table, and change the mount directory.

If, for instance, you had a Solaris box, and you wanted to make the disk currently mounted as “/data” be mounted as “/database”, all you would have to do is the following:

# umount /data
# mv /data /database
Change this line in "/etc/vfstab" from something like this:
/dev/dsk/c1d0s6 /dev/rdsk/c1d0s6 /data ufs 1 yes -
to something like this:
/dev/dsk/c1d0s6 /dev/rdsk/c1d0s6 /database ufs 1 yes -
and remount it as "/database":
# mount /database

With Linux, however, it's not quite so clear anymore. It's still easy, but it's not as obvious what you have to do, because Linux distributions have now taken to mounting filesystems by volume label. Rather than pointing directly to the disk device, Linux points to the label, and "/etc/fstab" looks more like this:

LABEL=/data /data ext3 defaults 1 2
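If you need to know which /dev entry actually holds a given label before you edit anything, the findfs and blkid utilities will tell you. On the hypothetical system used in the example below, something like this would point the label back at /dev/sda6:

# findfs LABEL=/data
# blkid

The first command prints the device node carrying that label, and the second lists every block device along with its label and filesystem type.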

You can always simply change the disk label, but if you don't care about the label, you can just tell Linux where the raw device is, bypassing the need to worry about labels at all. The easiest way to do this is to replace the "LABEL=/data" value with the "/dev" entry of the disk itself. Then simply change "/data" to "/database" and you're all set.
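If you would rather keep mounting by label, you can rename the label itself instead of switching to the raw device. Here is a minimal sketch, assuming the filesystem lives on the same hypothetical /dev/sda6 used in the example below:

# umount /data
# e2label /dev/sda6 /database
# mv /data /database
Then change the line in "/etc/fstab" to:
LABEL=/database /database ext3 defaults 1 2
and remount it:
# mount /database

Either way works; the raw-device approach shown next just saves you from having to keep the label in sync with the mount point.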

Here is an example of what you would do to change the mount point of "/data" to "/database" using the raw device:

# umount /data
# mv /data /database
Change this line in “/etc/fstab” from this:
LABEL=/data /data ext3 defaults 1 2
to this:
/dev/sda6 /database ext3 defaults 1 2
and remount it as "/database":
# mount /database

Remember to replace the example values here with those required for your situation.
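As a quick sanity check once the filesystem is remounted, you can confirm that the new mount point is active; on the hypothetical system above, that would look something like:

# mount | grep database
# df -h /database

The first command shows the device and filesystem type now mounted at "/database", and the second confirms that the space shows up there.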