What happens when you unplug one or more devices from an mdadm RAID array to simulate a failure on Linux (Ubuntu/CentOS/Debian)?


In short, the two drives in the array were /dev/sdd and /dev/sde. The kernel saw both being unplugged and took them down, as the log below shows.
The md driver behind mdadm caught the first drive being unplugged (/dev/sde) and disabled its missing partitions. However, when the final drive in the array (/dev/sdd) was unplugged, it did not notice at all. Instead, it complained about I/O errors later, on a drive the kernel already knew no longer existed.

[45817.162728] ata4: exception Emask 0x10 SAct 0x0 SErr 0x1810000 action 0xe frozen
[45817.162744] ata4: SError: { PHYRdyChg LinkSeq TrStaTrns }
[45817.162757] ata4: hard resetting link
[45817.162763] ata4: nv: skipping hardreset on occupied port
[45817.875776] ata4: SATA link down (SStatus 0 SControl 300)
[45822.876730] ata4: hard resetting link
[45822.876743] ata4: nv: skipping hardreset on occupied port
[45823.188801] ata4: SATA link down (SStatus 0 SControl 300)
[45823.188825] ata4: limiting SATA link speed to 1.5 Gbps
[45828.189782] ata4: hard resetting link
[45828.189796] ata4: nv: skipping hardreset on occupied port
[45828.501840] ata4: SATA link down (SStatus 0 SControl 300)
[45828.501862] ata4.00: disabled
[45828.501889] ata4: EH complete
[45828.501917] ata4.00: detaching (SCSI 4:0:0:0)
[45828.507327] sd 4:0:0:0: [sde] Synchronizing SCSI cache
[45828.507413] sd 4:0:0:0: [sde]
[45828.507419] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[45828.507425] sd 4:0:0:0: [sde] Stopping disk
[45828.507443] sd 4:0:0:0: [sde] START_STOP FAILED
[45828.507448] sd 4:0:0:0: [sde]
[45828.507451] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[45828.513870] md/raid10:md2: Disk failure on sde3, disabling device.
[45828.513870] md/raid10:md2: Operation continuing on 1 devices.
[45828.515601] md/raid1:md0: Disk failure on sde1, disabling device.
[45828.515601] md/raid1:md0: Operation continuing on 1 devices.
[45828.515699] md/raid1:md1: Disk failure on sde2, disabling device.
[45828.515699] md/raid1:md1: Operation continuing on 1 devices.
[45828.550897] RAID1 conf printout:
[45828.550907]  --- wd:1 rd:2
[45828.550914]  disk 0, wo:1, o:0, dev:sde1
[45828.550919]  disk 1, wo:0, o:1, dev:sdd1
[45828.550922] RAID1 conf printout:
[45828.550926]  --- wd:1 rd:2
[45828.550929]  disk 0, wo:1, o:0, dev:sde2
[45828.550933]  disk 1, wo:0, o:1, dev:sdd2
[45828.557889] RAID1 conf printout:
[45828.557891] RAID1 conf printout:
[45828.557898]  --- wd:1 rd:2
[45828.557901]  disk 1, wo:0, o:1, dev:sdd2
[45828.557908]  --- wd:1 rd:2
[45828.557913]  disk 1, wo:0, o:1, dev:sdd1
[45828.564720] RAID10 conf printout:
[45828.564742]  --- wd:1 rd:2
[45828.564749]  disk 0, wo:1, o:0, dev:sde3
[45828.564751]  disk 1, wo:0, o:1, dev:sdd3
[45828.569892] RAID10 conf printout:
[45828.569895]  --- wd:1 rd:2
[45828.569898]  disk 1, wo:0, o:1, dev:sdd3
[45828.584569] md: unbind
[45828.584599] md: unbind
[45828.601887] md: export_rdev(sde2)
[45828.606689] md: unbind
[45828.609925] md: export_rdev(sde1)
[45828.625934] md: export_rdev(sde3)
[45853.787165] ata3: exception Emask 0x10 SAct 0x0 SErr 0x1910000 action 0xe frozen
[45853.787181] ata3: SError: { PHYRdyChg Dispar LinkSeq TrStaTrns }
[45853.787194] ata3: hard resetting link
[45853.787199] ata3: nv: skipping hardreset on occupied port
[45854.502904] ata3: SATA link down (SStatus 0 SControl 300)
[45859.503897] ata3: hard resetting link
[45859.503910] ata3: nv: skipping hardreset on occupied port
[45859.815936] ata3: SATA link down (SStatus 0 SControl 300)
[45859.815959] ata3: limiting SATA link speed to 1.5 Gbps
[45864.816953] ata3: hard resetting link
[45864.816970] ata3: nv: skipping hardreset on occupied port
[45865.128991] ata3: SATA link down (SStatus 0 SControl 300)
[45865.129016] ata3.00: disabled
[45865.129049] ata3: EH complete
[45865.129082] ata3.00: detaching (SCSI 3:0:0:0)
[45865.134492] sd 3:0:0:0: [sdd] Synchronizing SCSI cache
[45865.134606] sd 3:0:0:0: [sdd]
[45865.134612] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[45865.134618] sd 3:0:0:0: [sdd] Stopping disk
[45865.134638] sd 3:0:0:0: [sdd] START_STOP FAILED
[45865.134643] sd 3:0:0:0: [sdd]
[45865.134647] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK


[46361.713303] Buffer I/O error on device md2, logical block 0
[46361.713321] Buffer I/O error on device md2, logical block 1
[46361.713333] Buffer I/O error on device md2, logical block 2
[46361.713343] Buffer I/O error on device md2, logical block 3
[46361.713352] Buffer I/O error on device md2, logical block 0
[46361.713374] Buffer I/O error on device md2, logical block 177343231
[46361.713383] Buffer I/O error on device md2, logical block 177343231
[46361.713479] Buffer I/O error on device md2, logical block 0
[46361.713491] Buffer I/O error on device md2, logical block 1
[46361.713500] Buffer I/O error on device md2, logical block 2
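
One way to confirm which drives md actually reacted to is to scan the kernel log for its "Disk failure" lines. The Python below is only a minimal sketch: the function name md_failures and the regular expression are my own, written against the message format seen in this log, and other kernel versions may word the message differently.

```python
import re

# Matches kernel lines such as:
# [45828.513870] md/raid10:md2: Disk failure on sde3, disabling device.
# The pattern is an assumption based on the log in this article.
FAILURE_RE = re.compile(r"md/raid\d+:(md\d+): Disk failure on (\w+),")


def md_failures(dmesg_text):
    """Return (array, member) pairs for every md disk-failure line."""
    return FAILURE_RE.findall(dmesg_text)


# A few lines copied from the log above, including one sdd line.
sample = """\
[45828.513870] md/raid10:md2: Disk failure on sde3, disabling device.
[45828.515601] md/raid1:md0: Disk failure on sde1, disabling device.
[45828.515699] md/raid1:md1: Disk failure on sde2, disabling device.
[45865.134618] sd 3:0:0:0: [sdd] Stopping disk
"""

failures = md_failures(sample)
for array, member in failures:
    print(array, member)
```

Run against the full log (for example the saved output of dmesg), this should list only sde partitions. No sdd partition shows up, which matches what happened here: md never logged a failure for the second drive.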


As the /proc/mdstat output below shows, md still believes the last drive to be unplugged (sdd) is plugged in!


Personalities : [raid10] [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4]
md0 : active raid1 sdd1[2]
      20955008 blocks super 1.2 [2/1] [_U]
   
md1 : active raid1 sdd2[1]
      2097088 blocks [2/1] [_U]
   
md2 : active raid10 sdd3[2]
      709372928 blocks super 1.2 512K chunks 2 far-copies [2/1] [_U]
      bitmap: 0/6 pages [0KB], 65536KB chunk
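
Since /proc/mdstat cannot be trusted on its own here, a script can at least cross-check the members it lists against what still exists under /dev. The Python below is a rough sketch under stated assumptions: the function names (parse_mdstat, degraded, vanished_members) are hypothetical, the parser only handles the simple "mdX : active raidN members" layout shown above, and it assumes the /dev node of an unplugged drive really does disappear on your system, which is worth verifying before relying on it.

```python
import os
import re

# The /proc/mdstat output from this article, used as test input.
SAMPLE = """\
Personalities : [raid10] [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4]
md0 : active raid1 sdd1[2]
      20955008 blocks super 1.2 [2/1] [_U]

md1 : active raid1 sdd2[1]
      2097088 blocks [2/1] [_U]

md2 : active raid10 sdd3[2]
      709372928 blocks super 1.2 512K chunks 2 far-copies [2/1] [_U]
      bitmap: 0/6 pages [0KB], 65536KB chunk
"""


def parse_mdstat(text):
    """Map each array name to its members and its [configured/active] counts."""
    arrays = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"^(md\d+) : \w+ \w+ (.*)$", line)
        if header:
            current = header.group(1)
            # Members look like "sdd1[2]"; keep only the device name.
            members = re.findall(r"(\w+)\[\d+\]", header.group(2))
            arrays[current] = {"members": members, "configured": None, "active": None}
            continue
        status = re.search(r"\[(\d+)/(\d+)\]", line)
        if status and current:
            arrays[current]["configured"] = int(status.group(1))
            arrays[current]["active"] = int(status.group(2))
    return arrays


def degraded(arrays):
    """Names of arrays running with fewer active devices than configured."""
    return sorted(n for n, a in arrays.items()
                  if a["active"] is not None and a["active"] < a["configured"])


def vanished_members(arrays, exists=os.path.exists):
    """(array, member) pairs that mdstat lists but that have no /dev node."""
    return [(name, m) for name, a in sorted(arrays.items())
            for m in a["members"] if not exists("/dev/" + m)]


arrays = parse_mdstat(SAMPLE)
print(degraded(arrays))
# Simulate this article's situation: every sdd node is gone.
print(vanished_members(arrays, exists=lambda p: not p.startswith("/dev/sdd")))
```

On a live system you would pass the contents of /proc/mdstat instead of SAMPLE and leave exists at its default. The exists parameter is injectable only so the check can be tried without pulling a real drive.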

 

