r5 - 27 Apr 2008 - 14:32:54 - JeffreyThompson

How To Set Up Linux Software RAID5 (or RAID0,1,4,6,10)

It's not difficult to get software RAID working on Linux; I was surprised how easy it was. Once it was working, it was also amazingly fast: the largest and fastest Linux system I have seen. I used three SATA II 500 GB drives, which in RAID5 translates into 1 terabyte of usable disk space (one drive's worth of capacity goes to parity).
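
As a rough check of that figure: RAID5 spends one drive's worth of space on parity, so the usable capacity is (number of drives - 1) times the size of one drive. A minimal sketch in shell arithmetic, assuming all drives are the same size; the function name raid5_usable_gb is made up for illustration and is not part of mdadm:

```shell
#!/bin/sh
# Usable RAID5 capacity: (drives - 1) * size of one drive.
# raid5_usable_gb is a hypothetical helper name, not an mdadm command.
raid5_usable_gb() {
    drives=$1
    size_gb=$2
    if [ "$drives" -lt 3 ]; then
        echo "RAID5 needs at least 3 drives" >&2
        return 1
    fi
    echo $(( (drives - 1) * size_gb ))
}

raid5_usable_gb 3 500   # three 500 GB drives, prints 1000
```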

I would recommend putting the Linux OS on a non-RAID drive and putting /home, where all the user data and other important data resides, on the RAID. It's easier to upgrade your OS that way. It actually makes sense to me to put the OS on the RAID as well, but I haven't figured out how to do that yet. Maybe someone else can update this doc to include that instruction.

The first thing to do is to partition each of your new drives. Run cat /proc/diskstats to display your drive devices (mine are sda, sdb, sdc, sdd).

I put my operating system on the first drive, sda, so I won't be touching that one.
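
The output of /proc/diskstats lists partitions as well as whole disks, which can make it hard to pick out the drives. A small sketch that keeps only whole-disk names of the form sdX, assuming the device name is the third field on each line; list_whole_disks is a hypothetical helper name and the sample input is made up:

```shell
#!/bin/sh
# Print whole-disk names (sda, sdb, ...) from /proc/diskstats-style input,
# skipping partitions such as sda1. Field 3 of each line is the device name.
# list_whole_disks is a hypothetical helper name.
list_whole_disks() {
    awk '$3 ~ /^sd[a-z]+$/ { print $3 }'
}

# Made-up sample input, for illustration only.
list_whole_disks <<'EOF'
   8       0 sda 100 0 800 50 0 0 0 0 0 50 50
   8       1 sda1 90 0 700 40 0 0 0 0 0 40 40
   8      16 sdb 10 0 80 5 0 0 0 0 0 5 5
EOF
```

On a real system you would feed it the live file instead: list_whole_disks < /proc/diskstats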

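  1. Next put the RAID partition on each drive. The partitions need to exactly match.
    • Run fdisk on the whole disk, not on a partition: fdisk /dev/sdb (then repeat for /dev/sdc and /dev/sdd)
    • Make a partition with partition type fd (Linux raid autodetect). The commands below use partition 5 on each drive (/dev/sdb5, /dev/sdc5, /dev/sdd5); if your partition is number 1, substitute /dev/sdb1 and so on
    • Write the partition table to the disk (w)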
  2. Next, install the Linux software RAID manager mdadm if you don't already have it installed: apt-get install mdadm
  3. Next, create the RAID5 (!!): mdadm --create --verbose /dev/md0 --level=5 --raid-devices=3 /dev/sdb5 /dev/sdc5 /dev/sdd5
  4. Restart the computer: sudo shutdown -r now
  5. Linux builds the RAID in the background (the initial sync normally starts as soon as the array is created and carries on after the reboot); monitor with watch cat /proc/mdstat
  6. After the RAID has been created, create the filesystem. I highly recommend the ReiserFS filesystem: mkfs -t reiserfs /dev/md0
  7. Update your /etc/fstab file with your new RAID partition and its newly generated volume UUID.
    • To determine the RAID volume UUID, run sudo vol_id /dev/md0 and look for the long string after ID_FS_UUID=. Mine looks like: b1e9350c-9aac-4d6d-855f-9afaa9003069 (on systems without vol_id, sudo blkid /dev/md0 also prints the UUID)
    • Edit /etc/fstab and insert your entry for the RAID partition:
      # /dev/md0
      UUID=b1e9350c-9aac-4d6d-855f-9afaa9003069 /home reiserfs defaults 0 2
  8. Mount the drive: mount -a. I think I had to reboot the computer before I could see the RAID array.
  9. df should show you /dev/md0 mounted on /home (or wherever you mounted your RAID partition)
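
Two places in the steps above are easy to get wrong by hand: the --raid-devices count has to match the number of partitions listed, and the fstab line has to carry the right UUID. The sketch below only prints the two lines so they can be checked before anything is run as root; print_raid5_create and print_fstab_line are hypothetical helper names, and the fstab options (defaults 0 2) are copied from the entry above:

```shell
#!/bin/sh
# Print (do not run) the mdadm command for a RAID5 array, deriving
# --raid-devices from the number of partitions given.
# print_raid5_create is a hypothetical helper name.
print_raid5_create() {
    md_dev=$1
    shift
    echo "mdadm --create --verbose $md_dev --level=5 --raid-devices=$# $*"
}

# Print the /etc/fstab line for a filesystem UUID, mount point and type.
# print_fstab_line is a hypothetical helper name.
print_fstab_line() {
    echo "UUID=$1 $2 $3 defaults 0 2"
}

print_raid5_create /dev/md0 /dev/sdb5 /dev/sdc5 /dev/sdd5
print_fstab_line b1e9350c-9aac-4d6d-855f-9afaa9003069 /home reiserfs
```

With the arguments shown, this prints the same mdadm command as step 3 and the same fstab entry as step 7.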

Closing The Loop: Test the RAID

I would encourage you to copy a bunch of data onto your new RAID partition, and then test a hard drive failure. I actually pulled the plug on my hard drive, but apparently you don't have to do that, and it's not recommended since it could damage the hard drive. Instead you can simulate a hard drive failure with the following command: mdadm --manage --set-faulty /dev/md0 /dev/sdc5
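
One way to confirm that the test data survived the simulated failure is to record checksums before failing the drive and compare them afterwards. A minimal sketch, assuming the POSIX cksum tool is good enough for a sanity check; snapshot_sums is a hypothetical helper name, and the temporary directory stands in for a directory on the RAID such as one under /home:

```shell
#!/bin/sh
# Record a checksum for every file under a directory, sorted by file name
# so that two snapshots can be compared as plain text.
# snapshot_sums is a hypothetical helper name.
snapshot_sums() {
    ( cd "$1" && find . -type f -exec cksum {} + | sort -k3 )
}

# Stand-in for a directory on the RAID partition.
datadir=$(mktemp -d)
echo "some test data" > "$datadir/file1"
echo "more test data" > "$datadir/file2"

before=$(snapshot_sums "$datadir")
# ... this is where you would fail a drive with mdadm ...
after=$(snapshot_sums "$datadir")

if [ "$before" = "$after" ]; then
    echo "data intact"
else
    echo "data changed" >&2
fi
```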

If this happened when you weren't expecting it, you would need to recognize that you have a hardware failure and you need to determine which hard drive failed. Here's what my /var/log/messages looked like when the hard drive lost power:

Dec 24 18:34:08 jtdesktop kernel: [80200.758231] ata5: port is slow to respond, please be patient (Status 0xd0)
Dec 24 18:34:13 jtdesktop kernel: [80205.737221] ata5: device not ready (errno=-16), forcing hardreset
Dec 24 18:34:13 jtdesktop kernel: [80205.737226] ata5: soft resetting port
Dec 24 18:34:43 jtdesktop kernel: [80235.970813] ata5.00: qc timeout (cmd 0xec)
Dec 24 18:34:43 jtdesktop kernel: [80235.970820] ata5.00: failed to IDENTIFY (I/O error, err_mask=0x5)
Dec 24 18:34:43 jtdesktop kernel: [80235.970827] ata5: failed to recover some devices, retrying in 5 secs
Dec 24 18:34:48 jtdesktop kernel: [80240.969772] ata5: soft resetting port
Dec 24 18:35:19 jtdesktop kernel: [80271.103460] ata5.00: qc timeout (cmd 0xec)
Dec 24 18:35:19 jtdesktop kernel: [80271.103467] ata5.00: failed to IDENTIFY (I/O error, err_mask=0x5)
Dec 24 18:35:19 jtdesktop kernel: [80271.103476] ata5.00: limiting speed to UDMA/133:PIO3
Dec 24 18:35:19 jtdesktop kernel: [80271.103478] ata5: failed to recover some devices, retrying in 5 secs
Dec 24 18:35:24 jtdesktop kernel: [80276.102420] ata5: soft resetting port
Dec 24 18:35:54 jtdesktop kernel: [80306.236097] ata5.00: qc timeout (cmd 0xec)
Dec 24 18:35:54 jtdesktop kernel: [80306.236105] ata5.00: failed to IDENTIFY (I/O error, err_mask=0x5)
Dec 24 18:35:54 jtdesktop kernel: [80306.236112] ata5.00: disabled
Dec 24 18:35:54 jtdesktop kernel: [80306.739597] ata5: EH complete
Dec 24 18:35:54 jtdesktop kernel: [80306.739610] sd 5:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Dec 24 18:35:54 jtdesktop kernel: [80306.739616] end_request: I/O error, dev sdc, sector 976767870
Dec 24 18:35:54 jtdesktop kernel: [80306.739628] md: super_written gets error=-5, uptodate=0
Dec 24 18:35:54 jtdesktop kernel: [80306.739646] sd 5:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Dec 24 18:35:54 jtdesktop kernel: [80306.739651] end_request: I/O error, dev sdc, sector 309314390
Dec 24 18:35:54 jtdesktop kernel: [80306.747576] sd 5:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Dec 24 18:35:54 jtdesktop kernel: [80306.747581] end_request: I/O error, dev sdc, sector 618004670
Dec 24 18:35:54 jtdesktop kernel: [80306.767746] RAID5 conf printout:
Dec 24 18:35:54 jtdesktop kernel: [80306.767749]  --- rd:3 wd:2
Dec 24 18:35:54 jtdesktop kernel: [80306.767751]  disk 0, o:1, dev:sdb5
Dec 24 18:35:54 jtdesktop kernel: [80306.767754]  disk 1, o:0, dev:sdc5
Dec 24 18:35:54 jtdesktop kernel: [80306.767755]  disk 2, o:1, dev:sdd5
Dec 24 18:35:54 jtdesktop kernel: [80306.783535] RAID5 conf printout:
Dec 24 18:35:54 jtdesktop kernel: [80306.783538]  --- rd:3 wd:2
Dec 24 18:35:54 jtdesktop kernel: [80306.783541]  disk 0, o:1, dev:sdb5
Dec 24 18:35:54 jtdesktop kernel: [80306.783543]  disk 2, o:1, dev:sdd5
Dec 24 18:57:45 jtdesktop -- MARK --

You can see from this that dev:sdc5 had problems and is no longer being used.
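
In a long log the relevant line is easy to miss. In the RAID5 conf printout above, the member that dropped out is the one marked o:0, so a sketch like the following can pull it out. It assumes the printout format shown in my log; find_offline_members is a hypothetical helper name:

```shell
#!/bin/sh
# Print the array members that a "RAID5 conf printout" marks with o:0.
# Assumes log lines of the form "... disk 1, o:0, dev:sdc5".
# find_offline_members is a hypothetical helper name.
find_offline_members() {
    sed -n 's/.* disk [0-9]*, o:0, dev:\([a-z0-9]*\).*/\1/p' | sort -u
}

# Three lines taken from the log above; prints sdc5.
find_offline_members <<'EOF'
Dec 24 18:35:54 jtdesktop kernel: [80306.767751]  disk 0, o:1, dev:sdb5
Dec 24 18:35:54 jtdesktop kernel: [80306.767754]  disk 1, o:0, dev:sdc5
Dec 24 18:35:54 jtdesktop kernel: [80306.767755]  disk 2, o:1, dev:sdd5
EOF
```

On a real system: find_offline_members < /var/log/messages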

Also you can check with mdadm to see what the problem is:

  • sudo mdadm --misc --detail /dev/md0 and look for the State line (State : clean, degraded) and the faulty drive, for example: faulty /dev/sdc5
  • cat /proc/mdstat will show (F) next to the failed disk: md0 : active raid5 sdb5[0] sdc5[1](F) sdd5[2]
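
The (F) marker can also be picked out by a script, for example from a cron job. A minimal sketch, assuming the /proc/mdstat line format shown in the bullet above; failed_members is a hypothetical helper name:

```shell
#!/bin/sh
# Print the names of array members flagged (F) in /proc/mdstat-style input.
# A failed member looks like "sdc5[1](F)"; everything from "[" on is dropped.
# failed_members is a hypothetical helper name.
failed_members() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /\(F\)$/) {
                name = $i
                sub(/\[.*/, "", name)
                print name
            }
    }'
}

# The example line from above; prints sdc5.
echo 'md0 : active raid5 sdb5[0] sdc5[1](F) sdd5[2]' | failed_members
```

On a real system: failed_members < /proc/mdstat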

  • Next you need to remove the old disk drive and insert a new disk drive:
    • Delete the failed drive from the array: mdadm /dev/md0 -r /dev/sdc5
    • Check that it has been removed: mdadm --misc --detail /dev/md0
    • Swap in the new disk and partition it to match the others (as in step 1), so that /dev/sdc5 exists again
    • Add the new drive into the array: mdadm /dev/md0 -a /dev/sdc5
    • Check that it has been added: mdadm --misc --detail /dev/md0. You should see spare rebuilding /dev/sdc5 at the bottom
    • Monitor the progress of the rebuild with: watch cat /proc/mdstat. I love the little graph [==========>...............] showing the rebuild status. It grows ==> as the rebuild progresses.
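
If you would rather have a number than watch the graph, the percentage can be extracted from the same status line. This is a sketch under an assumption: that the rebuild line in /proc/mdstat contains "recovery = NN.N%" next to the graph, which is the usual format but is not shown in my output above. The sample line is illustrative, with made-up numbers, and rebuild_percent is a hypothetical helper name:

```shell
#!/bin/sh
# Print the rebuild percentage from /proc/mdstat-style input.
# Assumes a status line containing "recovery = 41.2%" (format assumption).
# rebuild_percent is a hypothetical helper name.
rebuild_percent() {
    sed -n 's/.*recovery *= *\([0-9.]*\)%.*/\1/p'
}

# Illustrative sample line with made-up numbers; prints 41.2.
echo '      [==========>...............]  recovery = 41.2% (201326592/488383936) finish=85.3min speed=56000K/sec' | rebuild_percent
```

On a real system: rebuild_percent < /proc/mdstat (it prints nothing when no rebuild is running).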

Set Up Monitoring

Have mdadm monitor your RAID and have it email you when there's a problem:

  • Set up an email program; postfix is a good one: apt-get install postfix
  • Test your email program: echo test 123 | mail myEmail@gmail.com
  • If your test email worked, have mdadm send you a test message. I set the polling time to every ten minutes (600 seconds) instead of the default of once a minute: sudo mdadm --monitor --mail myEmail@gmail.com --syslog --delay 600 --test /dev/md0
  • You can check the email log file: tail -f /var/log/mail.log
  • If the test notification succeeds, run the same command daemonised; it forks a monitor process that checks every ten minutes: sudo mdadm --monitor --mail myEmail@gmail.com --syslog --delay 600 --daemonise /dev/md0. You will probably want to put this command (without the sudo) in your /etc/rc.local to start the monitoring when you boot your computer.
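
For the /etc/rc.local entry, something along these lines should do. It is a sketch, not a tested boot script: the guard that checks for mdadm and for /dev/md0 is my own addition so that a missing array does not produce an error at boot, and the address and delay are the example values from above:

```shell
#!/bin/sh
# Sketch of an /etc/rc.local fragment: start mdadm monitoring at boot.
# rc.local runs as root, so no sudo is needed.
# The guard below is an assumption added for safety; it skips the command
# when mdadm is not installed or /dev/md0 does not exist.
if command -v mdadm >/dev/null 2>&1 && [ -e /dev/md0 ]; then
    mdadm --monitor --mail myEmail@gmail.com --syslog --delay 600 --daemonise /dev/md0
fi
```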

I hope you have as much success with this as I have had.

-- JeffreyThompson - 25 Dec 2007
