cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md127 : active (auto-read-only) raid10 sdc1[0] sdb1[2]
1953382400 blocks super 1.2 512K chunks 2 far-copies [2/1] [U_]
resync=PENDING
bitmap: 15/15 pages [60KB], 65536KB chunk
Solution: force repai........
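One way to get an array out of this pending, auto-read-only state is to clear the read-only flag and, if needed, explicitly request a repair through sysfs. This is only a minimal sketch, assuming /dev/md127 is the affected array (adjust the device name to match your system):
mdadm --readwrite /dev/md127
echo repair > /sys/block/md127/md/sync_action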
In a RAID array I have, I've periodically lost a drive here and there over the past several months. I was always able to re-add and resync without losing data. However, at some point it looks like some minor corruption happened, and this makes DRBD unhappy.
Using fsck did not help either.
Dec 19 06:01:45 storageboxtest4 kernel: [19005.945890] EXT3-fs error (device drbd0): ext3_get_inode_loc: unable to read inode block - inode=22184379........
Remove the failed partition /dev/sde1
mdadm --manage /dev/md99 -r /dev/sde1
mdadm: hot removed /dev/sde1 from /dev/md99
Now add another drive back to replace it:
# mdadm --manage /dev/md99 -a /dev/sdf1
mdadm: added /dev/sdf1
A "cat /proc/mdstat" should show it resyncing if all is well.........
Tired of checking iotop and seeing that your DRBD partition is using 99.99% of I/O all the time, and finding that your DRBD device performs slowly in general?
This is especially an issue with DRBD versions in the 8.3 tree; one documented case is on "8.3.13", but it likely applies to other versions as well.
The symptoms are that resyncing is fine and normal, but any reasonable amount of activity is very slow and lagged, and creates a high server load and con........
The units used in the echo commands below are kB (kilobytes).
Setting a high sync speed
echo 120000 >/proc/sys/dev/raid/speed_limit_min
This will increase the speed. Note that sometimes a rebuild is slow due to current disk activity/iowait.
If that is not the cause then you may have a hardware issue (controller, cable or a bad drive).
Setting a lower sync speed
echo 1200 >/proc/sys/dev/raid/speed_limit_max........
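The same limits can also be read and set through sysctl; a quick sketch, assuming the standard dev.raid keys (values are in kB/sec), first showing the current values and then raising the minimum:
sysctl dev.raid.speed_limit_min dev.raid.speed_limit_max
sysctl -w dev.raid.speed_limit_min=120000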
Have you ever unplugged the wrong drive and then had to rebuild the entire array? It may not be a big deal in some ways but it does make your system vulnerable until the rebuild is done.
Many distros enable the "bitmap" feature by default; it basically keeps track of which parts of the array need to be resynced if a drive is temporarily removed, so only what has changed needs to be synced.
To enable bitmap to speed up rebuilds and sync........
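As a hedged example (assuming /dev/md2 is the array you want to change), an internal write-intent bitmap can normally be added to an existing array with mdadm --grow, and removed again later if you change your mind:
mdadm --grow --bitmap=internal /dev/md2
mdadm --grow --bitmap=none /dev/md2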
mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sda3 /dev/sdb3
cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sdb3[1] sda3[0]
1363020736 blocks super 1.2 [2/2] [UU]
[=>...................] resync = 8.3% (113597440/1363020736) finish=276.2min speed=75366K/sec
........
This happened during a RAID array check:
SMART says both drives pass the test, but I'm doing a long test on them and hopefully this is not a hardware error.
Apr 3 04:22:01 remote kernel: md: syncing RAID array md2
Apr 3 04:22:01 remote kernel: md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
Apr 3 04:22:01 remote kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
Apr........
I think this will be useful to others because I have a server that kept crashing mysteriously during intense disk usage/RAID checks. It would only crash during the weekly RAID integrity check.
Then I noticed during a reboot that not all CPUs were being brought up. As a result, temperatures ran much higher: based on the output I got from sensors, just booting the system produced higher than normal temperatures.
You can imagine that a full blown RAID check........
This made me nervous, but based on the messages log it's clearly a cron job that runs every Sunday at about 4:22.
I actually can't find any evidence of it in cron.d or cron.daily, but it is obviously there somewhere.
What I don't get is why this cron job doesn't do a data check like Ubuntu's cron script does. When you unnecessarily rebuild the array, you lose your redundancy for that period, which makes your data extremely vulnerable.
*Update: I did a grep of "........
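For comparison, a pure data check (rather than a rebuild) can be started by hand through sysfs; a minimal sketch, assuming md2 is the array you want to verify, with the mismatch count readable afterwards and "idle" written to the same file to stop a run early:
echo check > /sys/block/md2/md/sync_action
cat /sys/block/md2/md/mismatch_cnt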
This has stumped me a few times because I keep forgetting that CentOS 5.5 comes with a default iptables configuration that ends up blocking DRBD traffic. I tried all the normal things and couldn't understand why I couldn't make my normal DRBD config work. So if you have WFConnection problems and have tried the normal "mailing list" fixes, check your firewall status first!
Both Nodes Say the Following:
version: 8.3.8 (api:88/prot........
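If it does turn out to be the firewall, a quick hedged example of opening it up on CentOS, assuming your resource uses the commonly seen default port 7788 (check the actual port in your drbd.conf, it is set per resource):
iptables -I INPUT -p tcp --dport 7788 -j ACCEPT
service iptables save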
Create New RAID 1 Array:
First, set up your partitions (make sure they are exactly the same size).
In my example I have sda3 and sdb3 which are 500GB in size.
mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sda3 /dev/sdb3
mdadm: array /dev/md2 started.
Check Status Of The Array
*Note I already have other arrays md0 and md1.
You can see below that md2 is syn........
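Once the array is created you can put a filesystem on it and mount it right away; the initial sync carries on in the background. A minimal sketch, assuming ext3 (as used elsewhere on this system) and a /data mount point, both of which are only examples:
mkfs.ext3 /dev/md2
mkdir -p /data
mount /dev/md2 /data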
If you see "(auto-read-only)" beside an array, I have no idea why that happens, but it is easy to fix.
Just run "mdadm --readwrite /dev/md1" (replace md1 with the device that has the problem) and it will begin to resync.
md1 : active (auto-read-only) raid1 sdb2[0] sda2[1]
19534976 blocks [2/2] [UU]
resync=PENDING
........
mdadm --assemble --scan
mdadm: /dev/md/diaghost05102010:2 has been started with 2 drives.
mdadm: /dev/md/diaghost05102010:1 has been started with 2 drives.
mdadm: /dev/md/diaghost05102010:0 has been started with 2 drives.
-bash-3.1# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid6] [raid5] [raid4] [multipath]
md125 : active raid1 sda1[0] sdb1[1]
14658185 blocks super 1.2........
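If you want the assembled arrays to be recorded so they come up consistently on the next boot (rather than being picked up by scan with names like md125), one common approach is to append the scan output to mdadm.conf and adjust the ARRAY lines to taste; a hedged sketch, assuming the Red Hat style path /etc/mdadm.conf (Debian/Ubuntu uses /etc/mdadm/mdadm.conf):
mdadm --detail --scan >> /etc/mdadm.conf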