No more data loss
After losing so many hard disks over the last few years and, despite my backup efforts, some dear information along with them, I recently decided that instead of complaining, a solution had to be found.
And that arrived in the shape of RAID, the Redundant Array of Inexpensive Disks. Or in my case it will probably be called RACD, the same as before, but with crappy instead of inexpensive. Because that is what consumer hard disks are today. They break quite frequently and consumers don't seem to mind much, losing their precious (?) family pictures or other documents.
Now RAID is a wonderful way to be able to lose one disk and still not have any data loss. If you are not familiar with RAID, check out Wikipedia; it is all described there in more detail than you might like ;-)
The maxim of backup is that you should back up any information that is important to you and that you cannot or don't want to lose. Instead of you making copies manually all the time, RAID 1 does that with mirroring.
You have two disks that are mirrored by the system and act as one. If you lose one, you still have the data intact.
The problem is that you need twice as many disks as you actually get to use, which in effect doubles the hard disk price.
I found it unacceptable to pay twice as much just because they don't know how to produce reliable and stable disks anymore.
Yes, it can in fact be done: I have noticed that hard disks used to be much more reliable about 10 years ago. I hardly ever had a disk fail on me back then, and compared to floppies they were gold.
Now there is a thing called RAID 5, which drives down the overhead of extra, unusable hard disk capacity to 33% (with three disks) or even less (with more).
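To see where the 33% comes from: a RAID 5 of n equally sized disks spends one disk's worth of space on parity, so the overhead is 1/n. Here is a small sketch in shell arithmetic. The two function names are made up for illustration, and it assumes all disks have the same size.

```shell
# Hypothetical helper: percentage of the raw capacity that goes to parity
# in a RAID 5 of N equally sized disks (integer division, rounds down).
raid5_overhead_percent() {
    echo $(( 100 / $1 ))
}

# Hypothetical helper: usable capacity in GB for N disks of SIZE GB each.
raid5_usable_gb() {
    echo $(( ($1 - 1) * $2 ))
}

raid5_overhead_percent 3   # three disks
raid5_overhead_percent 4   # four disks
raid5_usable_gb 4 500      # four disks of 500 GB each
```

With three disks this prints 33, with four disks 25, and four 500 GB disks leave 1500 GB usable.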
RAID 5 is a little more CPU intensive, and that is why most hardware controllers have extra RAM and a dedicated CPU for it.
In our case, we will configure a software RAID 5 on Linux, with the following benefits:
- No extra cost for the controller card. RAID 5 controllers start at $400.
- Transparency, because the system is standardized and well documented.
- No single point of failure in the controller. Hardware RAID controllers usually use a proprietary format; without the exact same controller, the disks of the array are just unusable data garbage. With the same version of Linux, a software RAID can always be restored.
The drawbacks are:
- Instead of money spent on a controller card, know-how is needed: you use brains rather than cash. The advantage is that you have this tutorial, and that after learning it once, it always stays the same.
- RAID 5 puts strain on the CPU and slows down the disks.
One of the benefits of running Linux is that it uses the hardware much better.
Softraid creation (GUI)
Now before we start, let me mention that unfortunately there are no fancy or nice GUI tools to configure it. This is sad and a little surprising, considering that software RAID is hardly new and has matured quite a bit.
There are tools for setting it up when installing a new Linux, but the catch is that these are not available once Linux is already installed. This is, by the way, true for a lot of other tools, and I never understood why.
In my case, setting up a new system was out of the question, since I already had one and wanted to migrate it over.
I recently looked at a tool from IBM called EVMS (Enterprise Volume Management System) that is in the Debian repositories and can do a lot more than create and manage RAID arrays, but I could not get it to work properly. For some strange reason, the creation dialog always stayed empty and nothing could be done with it.
There were no error messages, and turning up the debug level did not do anything.
Select the active devices. In our case, we will select 3 active disks and 1 spare.
Now you have to select the three drives (partitions) that you want to use as the active drives.
Make sure you also select the spare.
Now you are finished defining the RAID.
You will notice that you now have a new device in the partition list, and that is the RAID 5.
Just use that one as though it were a disk: set a filesystem (ext3) and set the mount point to / (root).
The new installer from Etch will propose to use LVM on the RAID, which we will not do in this example. LVM does carry an overhead, and the benefits of using it with a RAID are not that great
(i.e. you cannot extend that root filesystem when you add another disk to the RAID 5 with LVM, since physical volumes cannot change size).
Finish the rest of the setup as usual.
Linux Softraid creation (command-line)
First you have to decide whether you want to use whole disks (/dev/sdb) or individual partitions (/dev/sda1) for the RAID.
It depends on what you want to do. If you just want an additional array next to your boot disk, which is not on RAID, then you can use the whole disk.
If you want to boot from it and install the whole system on it, remember that you cannot have the /boot partition on the RAID 5, since GRUB cannot read it (at least all my attempts have failed).
GRUB can read RAID 1 though, so here is the deal:
You make 3 partitions on every disk.
1. RAID 1 Boot (100 MBytes)
2. RAID 5 part for main drive (xyz GBytes)
3. RAID 5 part for swap (depending on your RAM, probably about 1 GByte)
The reason why we also put the swap partition on the RAID is that if you only put it on 1 disk, or even on all of them (using the mechanism swap gives you), one crashing disk will break your system instantly, since some parts of memory might be swapped out to the disk that just went to heaven.
You might want to consider putting the swap on RAID 1 instead of RAID 5, because performance might be bad (in my case it has not hurt that much yet, but I try to avoid running into swap). And having two active arrays on 1 disk is of course not ideal.
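Put together, the layout above means three arrays on the same set of disks. Here is a rough sketch of how that could look, assuming three disks. The function name, the disk names in the usage comment and the array names /dev/md0 (boot), /dev/md1 (root) and /dev/md2 (swap) are my own picks for illustration; the single-array example further down uses /dev/md0 for something else. Nothing runs until you call the function.

```shell
# Sketch only: builds the three arrays for the partition layout above.
# Arguments: the three disks. Array names md0/md1/md2 are an assumption.
create_raid_layout() {
    d1=$1; d2=$2; d3=$3
    # Partition 1 of every disk: RAID 1 for /boot, which GRUB can read
    mdadm --create /dev/md0 --level=1 --raid-devices=3 "${d1}1" "${d2}1" "${d3}1" &&
    # Partition 2 of every disk: RAID 5 for the root filesystem
    mdadm --create /dev/md1 --level=5 --raid-devices=3 "${d1}2" "${d2}2" "${d3}2" &&
    # Partition 3 of every disk: RAID 5 for swap
    mdadm --create /dev/md2 --level=5 --raid-devices=3 "${d1}3" "${d2}3" "${d3}3"
}

# Usage (as root, destroys whatever is on those partitions):
#   create_raid_layout /dev/sdb /dev/sdc /dev/sdd
#   mkswap /dev/md2 && swapon /dev/md2
```

If you prefer the swap on RAID 1, as discussed above, change --level=5 to --level=1 in the last mdadm call.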
I recommend using gparted to create the partitions. It is currently the only GUI tool for a non-expert to use (or even an expert).
Create the partitions and remember, you do not have to format them, since we will create the filesystem on the RAID once it is assembled. Just leave them unformatted. You can of course format them, but when you use fdisk later on it will look confusing, since you won't be able to see immediately that it is a RAID.
Make sure you set the flag for each partition to "raid". It is also called "Linux raid autodetect" and the partition type in hex is "fd", in case you want to use fdisk or parted to set it up.
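If you would rather set the flag from the command line, parted can do it. A minimal sketch, wrapped in a function so nothing happens until you call it; the function name is made up, and /dev/sdb with partition 2 are just example values.

```shell
# Sketch: set the "raid" flag (type fd on an MS-DOS partition table)
# on one partition. Arguments: disk and partition number.
set_raid_flag() {
    parted -s "$1" set "$2" raid on
}

# Usage (as root):
#   set_raid_flag /dev/sdb 2
#   parted -s /dev/sdb print    # the Flags column should now list "raid"
```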
Now we will create the RAID array with the command
mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb2 /dev/sdc2 /dev/sdd2 --auto=md
You can examine the array with
mdadm --detail /dev/md0
and see all the details.
In the above example, the RAID assumes (not correctly) that I also want a spare device. When it starts, it immediately realizes that it is degraded (since a RAID 5 needs a minimum of 3 drives) and includes the spare in the array.
In the end you have 3 drives active.
In case you want 4 or more component devices and, like me, no spare, you have to specify this explicitly.
You might have noticed that I used the devices directly in the above example instead of creating the 3 partitions, and you are right: I did it just for this example, to simplify matters.
Now you have your shiny new array, all 3 disks rolled into one, with redundancy: a RAID 5, /dev/md0.
Start up gparted and format the device with ext3 or your favorite filesystem and mount it on your system.
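The same can be done without a GUI. A minimal sketch, assuming ext3 and a mount point of /mnt/raid; the mount point and the function name are example values I picked, not anything the system requires.

```shell
# Sketch: create an ext3 filesystem on the array and mount it.
# Arguments: array device and mount point.
format_and_mount() {
    mkfs -t ext3 "$1" &&
    mkdir -p "$2" &&
    mount "$1" "$2"
}

# Usage (as root, wipes whatever is on the array):
#   format_and_mount /dev/md0 /mnt/raid
```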
Raid 5 is a little slower than direct hd access, but for me the advantages are paramount. You can lose any one hd and walk away from it without losing your data.
Softraid Extension (online)
Incredibly enough, you can actually extend a software RAID 5 as of kernel 2.6.17.
Mind you, do a backup first. I have performed the trick only twice, and although I have little doubt that it will work flawlessly, I would not stake my precious data on it.
You just have to partition the hard disk with GParted (see screenshots), leave the partition unformatted, change the flag to Linux RAID, and then add the disk with
mdadm --add /dev/md0 /dev/hda1
Now you have a spare drive, which was not what I wanted, so let's activate it (and live with no spare):
mdadm --grow /dev/md0 -n4
(To explain: I had 3 drives before and this one is the 4th.)
The array gets resized online, no unmounting the filesystem or shutting down the machine... Neat!!!
Now you still have to resize the filesystem to take advantage of the larger device, and that can be done with some newer kernel versions (and an ext3 filesystem that was created with one of them).
Back up your data (important!) and run resize2fs /dev/md0; it will extend the filesystem to the full size of the device (if it is not there yet).
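The whole extension can be strung together like this. Treat it as a sketch under my assumptions, not a tested recipe: the function name is made up, and I use mdadm --wait to block until the reshape has finished before touching the filesystem.

```shell
# Sketch: add a partition, grow the array to NEW_COUNT active devices,
# wait for the reshape to finish, then grow the filesystem.
# Arguments: array, new partition, new number of active devices.
extend_raid5() {
    array=$1; new_part=$2; new_count=$3
    mdadm --add "$array" "$new_part" || return 1
    mdadm --grow "$array" --raid-devices="$new_count" || return 1
    # Exit status ignored on purpose: mdadm --wait reports failure
    # when there was nothing left to wait for.
    mdadm --wait "$array"
    resize2fs "$array"
}

# Usage (as root, after a backup):
#   extend_raid5 /dev/md0 /dev/hda1 4
```

The function stops early if adding or growing fails, so the filesystem is only resized once the array itself has been extended.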
Good old Proc
While the array gets created, you can watch its progress by opening another shell and using the command
watch cat /proc/mdstat
(watch is very handy: it executes the command cat every 2 seconds, reading the file /proc/mdstat, which shows in a very nice way what the array is doing)
Because it runs in a separate shell, you can keep working while it is displayed.
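If you only want the progress figure, for example in a script, you can pull it out of /proc/mdstat with sed. A small sketch; the function name is made up, and it assumes the progress line looks like "recovery = 12.6% (...)", which is what I see during a rebuild.

```shell
# Sketch: print the current activity (recovery, resync or reshape) and
# its percentage. Reads /proc/mdstat unless another file is given.
mdstat_progress() {
    sed -n -E 's/.*(recovery|resync|reshape) *= *([0-9.]+)%.*/\1 \2/p' "${1:-/proc/mdstat}"
}

# Usage:
#   mdstat_progress
```

It prints nothing when no array is rebuilding.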
this document was created on:
10. Oct. 2006
09. May. 2007