RAID FAILURE AND DATA RECOVERY
 

A redundant array of independent disks (RAID) can offer increased performance, fault tolerance or both, depending on the RAID level. RAID volumes, however, are still susceptible to failure. Your chances of a successful RAID data recovery depend on the type of RAID failure you experience. The three most common types of RAID failures are:

  • Failure of one RAID member disk.
  • Failure of more than one RAID member disk.
  • RAID failure not caused by a malfunctioning disk.
RAID RECOVERY AFTER ONE FAILED DISK
 
Some RAID levels, including RAID 1 and RAID 5, are designed with a specific level of fault tolerance. Fault tolerance for a RAID 1 is expressed as n-1, where n is the number of disks in the RAID volume. For example, if there are four disks in a RAID 1, then three disks can fail and the volume will continue to function. RAID 5 has a fault tolerance of one drive. RAID 6 has a fault tolerance of two drives.
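The tolerance figures above can be captured in a few lines of Python. This is only an illustrative sketch: the function names fault_tolerance and survives are made up for this example, and the minimum disk counts it checks (two for RAID 1, three for RAID 5, four for RAID 6) are the usual minimums for those levels rather than something stated above.

```python
def fault_tolerance(level: int, n_disks: int) -> int:
    """Return how many member disks may fail before the volume is lost."""
    if level == 1:
        if n_disks < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        return n_disks - 1  # every disk holds a full copy of the data
    if level == 5:
        if n_disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return 1  # single parity
    if level == 6:
        if n_disks < 4:
            raise ValueError("RAID 6 needs at least 4 disks")
        return 2  # double parity
    raise ValueError(f"RAID level {level} is not covered by this sketch")


def survives(level: int, n_disks: int, failed: int) -> bool:
    """True if the volume keeps functioning with `failed` dead disks."""
    return failed <= fault_tolerance(level, n_disks)


# The four-disk RAID 1 from the example above keeps working with three dead disks.
print(survives(1, 4, 3))  # True
# A four-disk RAID 5 does not survive two dead disks.
print(survives(5, 4, 2))  # False
```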

Surviving a disk failure within that tolerance is how a RAID 1, RAID 5 or RAID 6 is designed to function. Manual data recovery in a fault-tolerant RAID configuration isn’t necessary, because the missing data gets rebuilt from the parity data or the mirrored drive(s). However, a RAID with a failed drive is considered to be in a critical state: it is at risk of complete failure if more drives fail. To bring it back to a healthy state, replace the malfunctioning drive with a new one (by hot swapping, if the hardware supports it) and let the array rebuild.
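To see why a single lost disk can be rebuilt from parity, consider the following minimal Python sketch of single-parity (RAID 5 style) reconstruction. It assumes the parity block is the byte-wise XOR of the data blocks in a stripe; the helper name xor_blocks and the tiny four-byte blocks are invented for illustration. The second parity block of RAID 6 is computed differently and is not covered here.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe spread over three data disks plus one parity disk.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Pretend the second data disk has failed: XOR the survivors with the parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # True
```

The same property explains the limit of one failed drive: with two blocks missing from a stripe, a single XOR parity block no longer carries enough information to recover either one.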

RAID RECOVERY AFTER MULTIPLE FAILED DISKS
 
If the fault tolerance is exceeded, the RAID fails. In these cases, there is very rarely an easy fix that allows you to recover all of your data. This is because the data will have been “striped” across multiple disks, including the failed disks. Save for some very small files (i.e. those that are smaller than the block size), it’s highly unlikely that all of the parts that comprise a given file will be present on the healthy drives.
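The effect of striping can be illustrated with a short Python sketch. It assumes a deliberately simplified layout in which blocks are dealt out to the disks in round-robin order with no parity rotation; real RAID 5 and RAID 6 layouts are more involved. The function names, the 64 KiB block size and the four-disk array are hypothetical values chosen for the example.

```python
def disks_touched(offset, length, block_size, n_disks):
    """Return the disk indexes that hold some part of a byte range.

    Assumes simple round-robin striping: block k lives on disk k % n_disks.
    """
    if length <= 0:
        return []
    first_block = offset // block_size
    last_block = (offset + length - 1) // block_size
    return sorted({block % n_disks for block in range(first_block, last_block + 1)})


def fully_on_healthy_disks(offset, length, block_size, n_disks, failed):
    """True if no part of the byte range sits on a failed disk."""
    return not set(disks_touched(offset, length, block_size, n_disks)) & set(failed)


BLOCK = 64 * 1024  # hypothetical 64 KiB block size
FAILED = {1, 2}    # two dead disks in a hypothetical four-disk array

# A 10 KiB file that fits inside the first block sits on disk 0 only.
print(disks_touched(0, 10 * 1024, BLOCK, 4))                       # [0]
print(fully_on_healthy_disks(0, 10 * 1024, BLOCK, 4, FAILED))      # True

# A 200 KiB file spans four blocks and therefore all four disks.
print(disks_touched(0, 200 * 1024, BLOCK, 4))                      # [0, 1, 2, 3]
print(fully_on_healthy_disks(0, 200 * 1024, BLOCK, 4, FAILED))     # False
```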

This is the least desirable position to be in. As for a do-it-yourself option, you may be able to recover your data by imaging the failed disk and repairing it. Combined with images of the healthy disks, you may be able to rebuild the RAID as a virtual RAID using advanced data recovery software.
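The idea behind imaging a disk can be sketched in Python as follows. This is a simplified illustration, not a substitute for a dedicated imaging tool: it copies the source chunk by chunk, writes zeros for any chunk that cannot be read, and records where the gaps are. The function name image_disk, the chunk size and the file names in the usage note are all assumptions made for this example.

```python
import os


def image_disk(source_path, image_path, chunk_size=64 * 1024):
    """Copy source to image in chunks; zero-fill chunks that cannot be read.

    Returns the list of byte offsets whose chunks were unreadable.
    """
    bad_offsets = []
    with open(source_path, "rb", buffering=0) as src, open(image_path, "wb") as dst:
        size = src.seek(0, os.SEEK_END)  # total size of the source
        offset = 0
        while offset < size:
            want = min(chunk_size, size - offset)
            try:
                src.seek(offset)
                chunk = src.read(want)
            except OSError:
                chunk = None  # treat a read error as an unreadable chunk
            if not chunk:
                # Keep the image aligned with the source by padding with zeros.
                bad_offsets.append(offset)
                dst.write(b"\x00" * want)
                offset += want
            else:
                dst.write(chunk)
                offset += len(chunk)
    return bad_offsets
```

A call might look like image_disk("member2.raw", "member2.img"), where both file names are hypothetical. The important design point is that unreadable areas are padded rather than skipped, so every readable byte stays at the same offset in the image as on the original disk.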
 
RAID RECOVERY AFTER OTHER RAID FAILURES
 
RAID volumes can also fail even if the disks are functional. This is usually caused by power outages, user error, a failed RAID controller or RAID driver corruption. In these cases, the metadata for the RAID is lost or corrupted. If it’s a simple matter of the RAID controller hardware or drivers going awry, you may have some success by replacing the controller. But if there is a more systemic failure, you should attempt to rebuild the RAID virtually, similar to a multiple disk failure. However, because you have a full set of healthy disks, your chances for a successful RAID recovery will be much greater. Still, it’s a best practice to work with disk images rather than the physical disks themselves. Recreating the RAID with the proper parameters may take some trial and error, and mounting disks in an invalid RAID configuration may cause data loss.
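The trial and error involved in finding the proper parameters can be pictured with a small Python sketch that lists every combination of disk order and block size for a set of disk images. The image file names and block sizes below are hypothetical, and real recovery software typically considers further parameters as well (such as parity layout and start offset) and checks each candidate against the images, which this sketch does not attempt.

```python
from itertools import permutations, product


def candidate_layouts(image_paths, block_sizes):
    """Yield every combination of disk order and block size to try."""
    for order, block_size in product(permutations(image_paths), block_sizes):
        yield {"order": order, "block_size": block_size}


# Hypothetical image files and block sizes, for illustration only.
images = ["disk0.img", "disk1.img", "disk2.img"]
sizes = [64 * 1024, 128 * 1024, 256 * 1024]

layouts = list(candidate_layouts(images, sizes))
print(len(layouts))  # 18: six possible disk orders times three block sizes
```

Because the number of disk orders grows factorially with the number of member disks, the search space gets large quickly, which is one more reason to experiment on images, where a wrong guess costs nothing.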
 
 