DIY DataRecovery.nl

 
RAID Recovery Information

Introduction

General procedures and preparations

RAID types

Determining the state of the RAID

NAS recovery

About a 'Degraded RAID'

Walkthroughs

 
Introduction
Welcome to the RAID recovery FAQ. The aim of this document is to help you on your way when faced with a RAID recovery situation. We'll start with some general information and preparations, and after that we'll talk about possible scenarios for different RAID setups. The recovery tool of choice will be iRecover, so we'll refer to that frequently. In certain situations DiskPatch may be of help, and this will be indicated as needed.

In this document we try to explain things in such a way that even a novice computer user can understand what to do when confronted with a RAID problem. If you have above-average experience or simply know a lot about computers, the opening parts of this guide may not be of interest to you, and you can skip to the diagnosis and/or walkthrough parts. However, please do not skip the "Preparations" part that is up next.

 
General procedures and preparations

First, this: if you have a RAID5 that was set up as a normal RAID5 (no mirrors, no clustering) and more than 1 disk has been severely compromised, you cannot recover or rebuild the RAID5. In other words, any normal RAID5 that is missing more than 1 disk is lost. If that is your situation, you can stop here and start looking for those backups.

Second: something you should NOT do is try to re-create the RAID after it has failed. In almost all cases re-creating a RAID will damage the data and make recovery more difficult. In general: if a RAID has failed, stop all activities, don't do anything with the disks, and focus on getting the necessary information before getting your hands dirty.

When faced with a RAID problem, the first thing you need to do is familiarize yourself with the hardware involved. This sounds obvious but many people run a RAID without knowing exactly what it's made of. If you need to perform a recovery you should know the following things:

  • is it a hardware- or a software RAID
  • what is the RAID type
  • how many disks are involved
  • what is the state of the RAID
  • what is the state of the disks

If you know the answers to these questions you can skip to the walkthrough part.

We'll talk a bit about these questions next.

 
RAID types
One of the major aspects of any RAID is this:
Is it a hardware RAID or a software RAID?

Hardware RAID:

If a RAID is created by and running through a specialized disk adapter (either on-board or as a separate card) we refer to it as a hardware RAID. This type of RAID is generally managed outside of the Operating System: the RAID adapter presents the disks as one big disk, and the O.S. simply accesses it as such. The individual disks can normally not be seen from within the O.S. and managing the RAID is done through a dedicated BIOS program. Sometimes a driver is required to let the O.S. know what type of adapter is running the RAID.
Note: a NAS solution that implements RAID is also considered a hardware RAID: all the disks are presented as one big disk that is shared through a network, and a dedicated piece of hardware (the NAS) is needed to run the RAID.

Software RAID:

When a RAID is created and used through functions that are offered by the Operating System, we refer to it as a software RAID. The individual disks can be seen by the O.S. and it is the O.S. that creates one big volume by combining space on the disks (no additional RAID hardware is required, all you need is a bunch of disks). A well-known example is Dynamic Disks in recent versions of Windows: a user can use Disk Management to assign separate disks to one big RAID volume which can then be used by the O.S. In this case it is the O.S. that manages and runs the RAID, so if something goes wrong in the O.S. you may lose access to the RAID.

RAID implementations

Aside from the differentiation of hardware and software RAID, there are several RAID implementations in use. There are 7 types, or levels, of RAID: 0 through 6. In practice though the most common ones are 0, 1 and 5. Following is a short description of these common levels:

RAID0: Striping. Data is saved across multiple disks. A fast storage method (saving to more than one disk at the same time) but not very safe (if one disk stops working the data is compromised). A stripe is a block of data that is saved to one disk, hence the name.
Example: we have a RAID0 set of three disks with one volume on them (so there is one volume on the three disks together; the volume's size is the size of the three disks combined). When we save data to this volume some (a stripe) is saved to disk 1, then a stripe to disk 2, then a stripe to disk 3, then again to disk 1 etc. until all data is saved. So a file is not necessarily on one disk. As you can see, if one disk fails, you lose a significant bit of the data.
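
The three-disk example above can be sketched in a few lines of Python. This is only an illustration: it assumes plain round-robin striping that starts at the first disk, and the function names and the tiny 4-byte stripe size are made up for the example. Real controllers use much larger stripes and may order the disks differently.

```python
def locate_stripe(logical_stripe: int, disk_count: int) -> tuple[int, int]:
    """Map a logical stripe number to (disk index, stripe index on that disk).

    Assumption: round-robin striping starting at disk 0.
    """
    if disk_count < 1:
        raise ValueError("disk_count must be at least 1")
    if logical_stripe < 0:
        raise ValueError("logical_stripe must not be negative")
    return logical_stripe % disk_count, logical_stripe // disk_count


def split_into_stripes(data: bytes, stripe_size: int, disk_count: int) -> list[list[bytes]]:
    """Distribute data over the disks, one stripe at a time."""
    disks: list[list[bytes]] = [[] for _ in range(disk_count)]
    for offset in range(0, len(data), stripe_size):
        disk, _ = locate_stripe(offset // stripe_size, disk_count)
        disks[disk].append(data[offset:offset + stripe_size])
    return disks


# Five stripes of one "file" spread over three disks.
layout = split_into_stripes(b"AAAABBBBCCCCDDDDEEEE", stripe_size=4, disk_count=3)
for number, stripes in enumerate(layout, start=1):
    print(f"disk {number}: {stripes}")
```

Each disk ends up holding only every third stripe of the file, which is why losing a single disk of a RAID0 damages nearly every larger file on the volume.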

RAID1: Mirroring. This one is simple. All data is saved twice: once on the original disk and once on the duplicate disk. Very safe, but not very economical storage-wise: you need twice the space to save your data.
Example: we have two disks. One is set as the original, one is set as the mirror (the duplicate). Everything we save to the original disk is also saved to the mirror. So if the original fails we can simply continue using the mirror.

RAID5: Striping with parity. A very commonly used RAID level, this one uses striping like in RAID0 but adds parity. This means that for every row of stripes, a block of control data (the parity) is saved to one of the disks. So if some of the original data gets lost (if one disk fails) we can still re-create the data by looking at the remaining data and the control data. A safe way to use large amounts of storage space, but you lose the capacity of one disk to the control data.
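
To show why the control data is enough to re-create a lost stripe, here is a minimal Python sketch. It assumes the control data is a bytewise XOR of the data stripes in a row, which is the usual RAID5 scheme; the helper name `xor_blocks` and the sample bytes are invented for the illustration.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    if any(len(block) != len(blocks[0]) for block in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One row of a three-disk RAID5: two data stripes plus one parity stripe.
data_stripes = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40"]
parity = xor_blocks(data_stripes)

# Pretend the disk holding the second data stripe has failed:
# XOR-ing everything that is left gives the missing stripe back.
rebuilt = xor_blocks([data_stripes[0], parity])
print(rebuilt == data_stripes[1])
```

The same trick only works while exactly one stripe per row is missing. With two stripes gone there is not enough information left, which is why a normal RAID5 cannot survive the loss of more than one disk.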

Notes:

  • Keep in mind that RAID1 can be used in combination with other RAID levels: you can create a RAID0 set and mirror that. The other way round is also possible: create a RAID1 and divide the original into a RAID0 set. These setups have some significant advantages: it's very safe, and it's quick. Obviously you need lots of disks and managing it can be a bit tricky.
  • RAID0 and RAID5 can be used either as hardware RAID or as software RAID: you can use a disk controller to build a RAID0 or -5 volume, or you can use Windows Dynamic Disks to create a software RAID0 or -5.
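
The storage cost of the three common levels described above can be put in numbers. The following Python sketch assumes equally sized disks, a two-disk mirror for RAID1, and one disk's worth of parity for RAID5; the function name `usable_capacity` is made up for this example.

```python
def usable_capacity(level: int, disk_count: int, disk_size_gb: float) -> float:
    """Return the usable space in GB for a RAID0, RAID1 or RAID5 set.

    Assumptions: all disks have the same size, RAID1 is a simple
    two-disk mirror, and RAID5 spends one disk's worth of space on parity.
    """
    if level == 0:
        if disk_count < 2:
            raise ValueError("RAID0 needs at least 2 disks")
        return disk_count * disk_size_gb
    if level == 1:
        if disk_count != 2:
            raise ValueError("this sketch only covers a two-disk mirror")
        return disk_size_gb
    if level == 5:
        if disk_count < 3:
            raise ValueError("RAID5 needs at least 3 disks")
        return (disk_count - 1) * disk_size_gb
    raise ValueError("only levels 0, 1 and 5 are covered here")


for level, disks in [(0, 3), (1, 2), (5, 3)]:
    print(f"RAID{level} with {disks} x 500 GB: {usable_capacity(level, disks, 500)} GB usable")
```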

If you'd like to read about this in more detail, this page explains it all rather nicely.

We support recoveries for RAID0 and RAID5 sets. Obviously a RAID1 has its own built-in safety so if that fails you should be able to fix that quickly.

 
Determining the state of the RAID
This section explains how to find out what the state of your RAID is, in the broadest sense of the word. If you already know this you can skip this section and go on to the walkthroughs.

The type of failure and the type of RAID involved determine which recovery procedure has to be followed. Making sure you understand what is going on is your responsibility. The explanation above ("RAID types") should help you with this.

Before you can start recovering data you need answers to the following questions:

1. Is my RAID a hardware- or a software RAID?
This is important mainly for the next question (number 2). Look at the RAID types section to help you find out what you have. The actual recovery procedure is the same for both hardware- and software RAIDs, but if you have a hardware RAID you have to answer one additional question, which is:
2. IF it is a hardware RAID, is it still intact or not?
If you have problems accessing the RAID and all the hardware is still working fine (the controller and the disks are still presenting the RAID set as one volume to the Operating System), the problem may be in the actual data volume that is on the disk(s). In that case a repair may be possible. Let DiskPatch analyze the volume and see what comes up, look here for more information. You can also let iRecover analyze the RAID; after all, if it still works on a hardware level it is simply one big disk, so the recovery procedure for that will be simple.
If the hardware RAID is not intact and you have a bunch of disks that used to be the RAID, continue the diagnosis.
3. Are any of the disks that are in the RAID not working well?
In most cases a RAID failing is caused by one of two things: the adapter fails, or one of the disks in the RAID fails. If the adapter fails you might be able to get the RAID back online by replacing the adapter, though this will only work if the replacement adapter is exactly the same (usually this does not work out well and you will end up recovering the data from the RAID disks before building a new RAID).
If the adapter was not the cause of the failure you will need to make sure beforehand that the disks are working fine. All disks should be checked for health issues. You can do so by checking the SMART attributes of each disk or by running a surface scan (check the walkthroughs for details). iRecover allows you to check the SMART info for a disk; simply right-click a disk in a screen that has the disks listed and select "S.M.A.R.T. information". If you need help interpreting the info here, contact support.
If a disk has health issues you must decide how to proceed. If the problems are small (just a few read errors; this can be determined by running a surface scan) you could go ahead with the recovery, since iRecover can handle a few read errors. If, however, the disk is in bad shape, you should clone it before starting the recovery. You can also create an image of the disk by using iRecover (right-click the disk and select "create image file"). This procedure basically clones the disk to a file, and you can use that file later on instead of the bad disk. Obviously you'll need quite a lot of free disk space to create an image file.
4. How many disks were in my RAID before things went wrong?
This one kind of speaks for itself.
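
To make the idea of an image file (question 3) a bit more concrete, here is a rough Python sketch of what "cloning a disk to a file" means. It is not how iRecover does it: the function name `image_to_file`, the zero-filling of unreadable chunks and the small chunk size are all assumptions made for the illustration, and the "disk" is simulated in memory so the example is safe to run.

```python
import io


def image_to_file(source, target, size: int, chunk_size: int = 64 * 1024) -> list[int]:
    """Copy `size` bytes from source to target, chunk by chunk.

    Chunks that cannot be read (or come back short) are replaced by
    zeros so the image keeps the same layout as the disk.
    Returns the offsets of the chunks that could not be read.
    """
    bad_offsets = []
    offset = 0
    while offset < size:
        length = min(chunk_size, size - offset)
        try:
            source.seek(offset)
            chunk = source.read(length)
        except OSError:
            chunk = b""
        if len(chunk) != length:
            bad_offsets.append(offset)
            chunk = b"\x00" * length
        target.write(chunk)
        offset += length
    return bad_offsets


class FlakySource(io.BytesIO):
    """Stand-in for a disk with one unreadable area (hypothetical test double)."""

    def __init__(self, data: bytes, bad_offset: int):
        super().__init__(data)
        self.bad_offset = bad_offset

    def read(self, size=-1):
        if self.tell() == self.bad_offset:
            raise OSError("simulated read error")
        return super().read(size)


disk = FlakySource(b"A" * 8 + b"B" * 8 + b"C" * 8, bad_offset=8)
image = io.BytesIO()
bad = image_to_file(disk, image, size=24, chunk_size=8)
print("unreadable chunks at offsets:", bad)
```

The point of the sketch is that the image has exactly the same size and layout as the disk, so a recovery tool can work on the file as if it were the disk itself, without stressing the failing hardware any further.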

Summarizing:

  • Know the RAID type
  • If a hardware RAID, know the RAID state (working or broken)
  • Know the state of the former RAID disks, take action if needed (clone or image)
  • Know how many disks were in the RAID before it failed

If you have all this, start the recovery.
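
The checklist above can also be read as a small decision procedure. The Python sketch below is only one way to write it down; the class `RaidAssessment`, its field names and the wording of the returned advice are invented for this example and are not part of iRecover or DiskPatch.

```python
from dataclasses import dataclass


@dataclass
class RaidAssessment:
    """Answers to the diagnosis questions (all field names are hypothetical)."""
    is_hardware_raid: bool
    level: int                      # 0, 1 or 5
    disk_count: int                 # disks in the RAID before it failed
    missing_disks: int = 0          # disks that are gone or severely compromised
    still_presented_as_one_volume: bool = False
    disks_with_health_issues: int = 0


def next_step(a: RaidAssessment) -> str:
    """Suggest the next step, following the order of the checklist."""
    if a.level == 5 and a.missing_disks > 1:
        return "stop: a normal RAID5 missing more than 1 disk cannot be recovered"
    if a.is_hardware_raid and a.still_presented_as_one_volume:
        return "analyze the volume as one big disk"
    if a.disks_with_health_issues > 0:
        return "clone or image the affected disks first"
    return "start the recovery"


broken_nas = RaidAssessment(is_hardware_raid=True, level=5, disk_count=4,
                            disks_with_health_issues=1)
print(next_step(broken_nas))
```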

 
NAS recovery
Having a NAS fail is particularly nasty. Direct disk access is not possible (most NAS setups are basically Linux systems with Ext RAIDs, connected through the network) so you'll have to improvise. The good news is that a NAS usually has a set of disks arranged in a RAID5 configuration, so RAID recovery is possible.
There are some basic steps you need to take before getting started on NAS recovery:

1. Know your NAS.
Find out how many disks the NAS has, what the file system type on these disks is, and what the operating system is that runs the NAS (usually Linux, but Windows systems are also quite common). If the operating system is Linux, the disks will most likely be configured as Linux MD-RAID with the Ext(x) file system. If it is a Windows box, the disks will most likely be Dynamic with Windows software RAID5 and NTFS. If you have no idea how to find this out, contact the NAS manufacturer. They should have this information, and you may even find it on their web pages.
2. Find out how to access the disks.
NAS disks cannot be accessed directly by Windows running on your PC. For a RAID recovery you need to access the disks directly, so find out how to do this. In almost all cases the NAS itself has no features for this, so the disks need to be removed from the NAS and connected to a PC. One way to do this is to put the NAS disks in a USB enclosure and connect them to the PC that will run the recovery. This will work, but it is always best to connect the disks as directly as possible. So for example, if the NAS disks are SATA disks, connect them to a PC that has enough SATA ports.
3. Know the state of the disks.
A common problem with a NAS is an unexpected power outage, causing the disks to become inaccessible. This type of failure can lead to disks being physically damaged, so before starting the recovery make sure the NAS disks are in good health. Run a SMART diagnosis or check the disk's surface. These things can only be done if the disk is connected directly, see item 2 above for details.
If a disk has health issues you must decide how to proceed. If the problems are small (just a few read errors; this can be determined by running a surface scan) you could go ahead with the recovery, since iRecover can handle a few read errors. If, however, the disk is in bad shape, you should clone it before starting the recovery, or you can create an image file of the disk.
Remember: with RAID5 you can recover data when 1 disk is missing. So if 1 disk has mechanical problems, simply leave it out of the recovery procedure. In fact, that is pretty much mandatory: a disk with read problems will have a bad effect on the RAID analysis.

There are some special considerations when using iRecover to recover data from NAS disks; these will be mentioned in the walkthrough. In general though, recovering data from a NAS is the same as recovering data from any other failed RAID; the only extra work is removing the disks from the NAS and connecting them to a PC.

 
About a 'Degraded RAID'

Before we get to the walkthroughs, a bit about a common RAID problem:

Degraded RAID

This is a bit of an umbrella term and can cover a few situations. In general a degraded RAID should not result in immediate data loss. The RAID is still working but there are conditions that require actions and the user should investigate. The most common example is a hardware RAID5 that has one failed disk. There are adapters that keep the RAID5 running with one failed disk, so the user can replace the disk. After replacing the disk it will be rebuilt by the RAID adapter (this can be done on the fly or not, that depends on the adapter) and once the rebuild is complete the RAID5 will be running as before.
Another example is a failed RAID1. Either the original or the mirror has stopped working, but the user can continue using the data. Again, there is no immediate data loss because there is a working duplicate, but the failed disk(s) should be replaced and the mirror re-instated.

Notes:

  • If more than one disk fails in a RAID5 set, it is game over: the RAID cannot be rebuilt or recovered (*).
  • If a RAID1 fails because either the original or the mirror has failed, it may be required to actually disable the RAID1 and use the still good disk as a "normal" non-RAID disk. After replacing the failed disk the RAID1 can be re-instated and the mirror rebuilt.
  • As said before, a RAID0 has no control data so if it fails repairing or rebuilding is usually not an option. In some cases a software RAID0 can be brought back to life but in general it is best to use an external recovery tool and attempt to recover the data from the former RAID0 set.

(*) - This is not always the case. When RAID clustering is used you can lose more than one disk, but not more than one in the same cluster. Example: a RAID5 consisting of 6 disks, clustered in 2 groups of 3 disks each. If 1 disk fails in each group you're still fine. If 2 disks fail in 1 group it's game over. RAID clustering is used mainly in mainframes and is always done on a hardware level, so it isn't very common in the PC world. Mentioned here only for completeness' sake.
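
The clustering rule in this footnote boils down to "at most one failed disk per group". A minimal Python sketch of that rule, using the 6-disk example; the function name `raid5_survives` and the disk and group labels are made up for the illustration.

```python
from collections import Counter


def raid5_survives(failed_disks: list[int], group_of: dict[int, str]) -> bool:
    """Return True if no group has lost more than one disk.

    A normal (non-clustered) RAID5 is simply the case where every
    disk belongs to the same group.
    """
    failures_per_group = Counter(group_of[disk] for disk in set(failed_disks))
    return all(count <= 1 for count in failures_per_group.values())


# 6 disks, clustered in 2 groups of 3 disks each.
groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}

print(raid5_survives([0, 3], groups))  # one failure in each group
print(raid5_survives([0, 1], groups))  # two failures in the same group
```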

Now on to the walkthroughs.

 
Walkthroughs

RAID5 recovery with all disks present (NTFS)

Creating and using a disk image file (example used: RAID5 recovery)

Checking a disk's physical state (S.M.A.R.T. status)

More walkthroughs coming soon
