If disk errors are reported, the disk may have hardware problems. Check dmesg for errors of the following type:

`[737961.360080] raid5_end_read_request: 64 callbacks suppressed`
`[737961.360087] md/raid:md125: read error corrected (8 sectors at 2722701256 on sdc1)`
`[737961.360093] md/raid:md125: read error corrected (8 sectors at 2722701264 on sdc1)`
`[737961.360095] md/raid:md125: read error corrected (8 sectors at 2722701272 on sdc1)`
`[737961.360098] md/raid:md125: read error corrected (8 sectors at 2722701280 on sdc1)`
`[737961.360100] md/raid:md125: read error corrected (8 sectors at 2722701288 on sdc1)`
`[737961.360102] md/raid:md125: read error corrected (8 sectors at 2722701296 on sdc1)`
`[737961.360105] md/raid:md125: read error corrected (8 sectors at 2722701304 on sdc1)`
`[737961.360107] md/raid:md125: read error corrected (8 sectors at 2722701312 on sdc1)`
`[737961.360109] md/raid:md125: read error corrected (8 sectors at 2722701320 on sdc1)`
`[737961.360112] md/raid:md125: read error corrected (8 sectors at 2722701328 on sdc1)`
`[742462.760119] md: md125: data-check done.`

Use SMART to investigate the hard drive:

`$ smartctl -i /dev/sdc`

The drive can be tested with the following command:

`$ smartctl -t long /dev/sdc`

The long test takes a while; a short test (`smartctl -t short /dev/sdc`) is also available. The results can be viewed using:

`$ smartctl -l selftest /dev/sdc`

`smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.10.0-327.36.3.el7.x86_64] (local build)`
`Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org`

`=== START OF READ SMART DATA SECTION ===`
`SMART Self-test log structure revision number 1`
`Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error`
`# 1 Extended offline Completed: read failure 40% 21930 2722703304`

The self-test ended with a read failure, so the drive needs to be replaced. To identify the physical drive, use hdparm to get its serial number:
`$ hdparm -i /dev/sdc | grep SerialNo`
`Model=ST2000DM001-1ER164, FwRev=CC27, SerialNo=Z4Z5QAY5`

Before shutting down and replacing the drive, use mdadm to mark it as failed and remove it from the RAID:

`$ mdadm --manage /dev/md125 --fail /dev/sdc1`
`$ mdadm --manage /dev/md125 --remove /dev/sdc1`

Before the old drive is physically removed, dump its partition table:

`$ sfdisk -d /dev/sdc > sdc.out`

Once the new drive has been swapped in, apply the saved partition table to it:

`$ sfdisk /dev/sdc < sdc.out`

The new disk is now ready to be added to the RAID:

`$ mdadm --manage /dev/md125 --add /dev/sdc1`

Finally, monitor the progress of the rebuild using:

`$ cat /proc/mdstat`
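
The swap sequence above can be collected into a small helper that prints the commands in order for review before anything is run. This is only a sketch: `swap_plan` and its four arguments are hypothetical names, it prints the plan rather than executing it, and it assumes the replacement drive comes up under the same device name as the old one (worth confirming against the serial number before running the later steps).

```shell
#!/bin/sh
# Print the drive-swap commands for one RAID member, in order, without running them.
# swap_plan and its arguments are illustrative names, not part of mdadm or sfdisk.
swap_plan() {
    if [ $# -ne 4 ]; then
        echo "usage: swap_plan ARRAY DISK PARTITION TABLE_FILE" >&2
        return 1
    fi
    array=$1   # md device, e.g. /dev/md125
    disk=$2    # whole disk, e.g. /dev/sdc
    part=$3    # member partition on that disk, e.g. /dev/sdc1
    table=$4   # file used to hold the saved partition table

    # The here-document expands the variables; '>' and '<' are printed literally.
    cat <<EOF
# Before the swap: fail and remove the member, then save the partition layout.
mdadm --manage $array --fail $part
mdadm --manage $array --remove $part
sfdisk -d $disk > $table
# Power down, swap the physical drive, and boot again.
# After the swap: restore the layout, re-add the member, watch the rebuild.
sfdisk $disk < $table
mdadm --manage $array --add $part
cat /proc/mdstat
EOF
}

# Example invocation using the device names from this page.
swap_plan /dev/md125 /dev/sdc /dev/sdc1 sdc.out
```

Printing the plan first makes it easier to spot a mismatched array or partition name before the array is touched; the printed lines can then be run by hand as root, one phase at a time.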