What happens when you unplug one or more devices from an mdadm RAID array to simulate a failure in Linux (Ubuntu/CentOS/Debian)?


In short, the two drives in the array were /dev/sdd and /dev/sde.  The kernel saw both being unplugged and took them down, as you can see in the log below.
mdadm caught the first drive being unplugged (/dev/sde) and disabled the missing device.  However, when the final remaining drive in the array (/dev/sdd) was unplugged, it didn't notice at all.  Instead, I/O errors show up later on the md device, for a drive the kernel already knows no longer exists.

[45817.162728] ata4: exception Emask 0x10 SAct 0x0 SErr 0x1810000 action 0xe frozen
[45817.162744] ata4: SError: { PHYRdyChg LinkSeq TrStaTrns }
[45817.162757] ata4: hard resetting link
[45817.162763] ata4: nv: skipping hardreset on occupied port
[45817.875776] ata4: SATA link down (SStatus 0 SControl 300)
[45822.876730] ata4: hard resetting link
[45822.876743] ata4: nv: skipping hardreset on occupied port
[45823.188801] ata4: SATA link down (SStatus 0 SControl 300)
[45823.188825] ata4: limiting SATA link speed to 1.5 Gbps
[45828.189782] ata4: hard resetting link
[45828.189796] ata4: nv: skipping hardreset on occupied port
[45828.501840] ata4: SATA link down (SStatus 0 SControl 300)
[45828.501862] ata4.00: disabled
[45828.501889] ata4: EH complete
[45828.501917] ata4.00: detaching (SCSI 4:0:0:0)
[45828.507327] sd 4:0:0:0: [sde] Synchronizing SCSI cache
[45828.507413] sd 4:0:0:0: [sde]
[45828.507419] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[45828.507425] sd 4:0:0:0: [sde] Stopping disk
[45828.507443] sd 4:0:0:0: [sde] START_STOP FAILED
[45828.507448] sd 4:0:0:0: [sde]
[45828.507451] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[45828.513870] md/raid10:md2: Disk failure on sde3, disabling device.
[45828.513870] md/raid10:md2: Operation continuing on 1 devices.
[45828.515601] md/raid1:md0: Disk failure on sde1, disabling device.
[45828.515601] md/raid1:md0: Operation continuing on 1 devices.
[45828.515699] md/raid1:md1: Disk failure on sde2, disabling device.
[45828.515699] md/raid1:md1: Operation continuing on 1 devices.
[45828.550897] RAID1 conf printout:
[45828.550907]  --- wd:1 rd:2
[45828.550914]  disk 0, wo:1, o:0, dev:sde1
[45828.550919]  disk 1, wo:0, o:1, dev:sdd1
[45828.550922] RAID1 conf printout:
[45828.550926]  --- wd:1 rd:2
[45828.550929]  disk 0, wo:1, o:0, dev:sde2
[45828.550933]  disk 1, wo:0, o:1, dev:sdd2
[45828.557889] RAID1 conf printout:
[45828.557891] RAID1 conf printout:
[45828.557898]  --- wd:1 rd:2
[45828.557901]  disk 1, wo:0, o:1, dev:sdd2
[45828.557908]  --- wd:1 rd:2
[45828.557913]  disk 1, wo:0, o:1, dev:sdd1
[45828.564720] RAID10 conf printout:
[45828.564742]  --- wd:1 rd:2
[45828.564749]  disk 0, wo:1, o:0, dev:sde3
[45828.564751]  disk 1, wo:0, o:1, dev:sdd3
[45828.569892] RAID10 conf printout:
[45828.569895]  --- wd:1 rd:2
[45828.569898]  disk 1, wo:0, o:1, dev:sdd3
[45828.584569] md: unbind
[45828.584599] md: unbind
[45828.601887] md: export_rdev(sde2)
[45828.606689] md: unbind
[45828.609925] md: export_rdev(sde1)
[45828.625934] md: export_rdev(sde3)
[45853.787165] ata3: exception Emask 0x10 SAct 0x0 SErr 0x1910000 action 0xe frozen
[45853.787181] ata3: SError: { PHYRdyChg Dispar LinkSeq TrStaTrns }
[45853.787194] ata3: hard resetting link
[45853.787199] ata3: nv: skipping hardreset on occupied port
[45854.502904] ata3: SATA link down (SStatus 0 SControl 300)
[45859.503897] ata3: hard resetting link
[45859.503910] ata3: nv: skipping hardreset on occupied port
[45859.815936] ata3: SATA link down (SStatus 0 SControl 300)
[45859.815959] ata3: limiting SATA link speed to 1.5 Gbps
[45864.816953] ata3: hard resetting link
[45864.816970] ata3: nv: skipping hardreset on occupied port
[45865.128991] ata3: SATA link down (SStatus 0 SControl 300)
[45865.129016] ata3.00: disabled
[45865.129049] ata3: EH complete
[45865.129082] ata3.00: detaching (SCSI 3:0:0:0)
[45865.134492] sd 3:0:0:0: [sdd] Synchronizing SCSI cache
[45865.134606] sd 3:0:0:0: [sdd]
[45865.134612] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[45865.134618] sd 3:0:0:0: [sdd] Stopping disk
[45865.134638] sd 3:0:0:0: [sdd] START_STOP FAILED
[45865.134643] sd 3:0:0:0: [sdd]
[45865.134647] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK


[46361.713303] Buffer I/O error on device md2, logical block 0
[46361.713321] Buffer I/O error on device md2, logical block 1
[46361.713333] Buffer I/O error on device md2, logical block 2
[46361.713343] Buffer I/O error on device md2, logical block 3
[46361.713352] Buffer I/O error on device md2, logical block 0
[46361.713374] Buffer I/O error on device md2, logical block 177343231
[46361.713383] Buffer I/O error on device md2, logical block 177343231
[46361.713479] Buffer I/O error on device md2, logical block 0
[46361.713491] Buffer I/O error on device md2, logical block 1
[46361.713500] Buffer I/O error on device md2, logical block 2
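
The difference between the two unplug events is easier to see if you pull only the md failure messages out of the log. The following is a minimal Python sketch, not part of mdadm: the function name md_failures and the regular expression are my own, and the sample lines are copied from the log above. It only assumes the "Disk failure on ..., disabling device" wording that this kernel printed.

```python
import re

# Matches kernel lines such as:
# [45828.513870] md/raid10:md2: Disk failure on sde3, disabling device.
# The pattern is an assumption based on the log in this article.
FAILURE_RE = re.compile(
    r"md/(raid\d+):(md\d+): Disk failure on (\w+), disabling device"
)


def md_failures(log_text):
    """Return a list of (array, level, member) for each failure md reported."""
    return [
        (m.group(2), m.group(1), m.group(3))
        for m in FAILURE_RE.finditer(log_text)
    ]


# Lines copied from the log above: three md failures for sde,
# and only a SCSI-level message for sdd.
SAMPLE = """\
[45828.513870] md/raid10:md2: Disk failure on sde3, disabling device.
[45828.515601] md/raid1:md0: Disk failure on sde1, disabling device.
[45828.515699] md/raid1:md1: Disk failure on sde2, disabling device.
[45865.134618] sd 3:0:0:0: [sdd] Stopping disk
"""

if __name__ == "__main__":
    for array, level, member in md_failures(SAMPLE):
        print(f"{array} ({level}): {member} marked failed")
```

Run against the full dmesg output, this reports sde1, sde2 and sde3 and nothing for sdd, which matches what the log shows: md never logged a failure for the last drive.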


As the /proc/mdstat output below shows, md still believes the last drive to be unplugged (sdd) is plugged in!


Personalities : [raid10] [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4]
md0 : active raid1 sdd1[2]
      20955008 blocks super 1.2 [2/1] [_U]
   
md1 : active raid1 sdd2[1]
      2097088 blocks [2/1] [_U]
   
md2 : active raid10 sdd3[2]
      709372928 blocks super 1.2 512K chunks 2 far-copies [2/1] [_U]
      bitmap: 0/6 pages [0KB], 65536KB chunk
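
One way to catch this situation is to compare the members listed in /proc/mdstat with the device nodes that actually exist. The sketch below is a rough illustration only: the helper names mdstat_members and stale_members are hypothetical, and it assumes that udev removes the /dev node when a disk is detached, which may not hold on every system.

```python
import os
import re

# An array line looks like: "md2 : active raid10 sdd3[2]"
ARRAY_RE = re.compile(r"^(md\w+) : (.*)$")
# A member token looks like: "sdd3[2]" (optionally followed by flags such as "(F)")
MEMBER_RE = re.compile(r"(\w+)\[\d+\]")


def mdstat_members(mdstat_text):
    """Map each md array name to the member devices listed in mdstat text."""
    members = {}
    for line in mdstat_text.splitlines():
        match = ARRAY_RE.match(line)
        if match:
            members[match.group(1)] = MEMBER_RE.findall(match.group(2))
    return members


def stale_members(members, exists=lambda name: os.path.exists("/dev/" + name)):
    """Return only the arrays that list members with no device node.

    `exists` is injectable so the check can be tried without real hardware.
    Assumption: a detached disk loses its /dev node.
    """
    stale = {}
    for array, devices in members.items():
        missing = [dev for dev in devices if not exists(dev)]
        if missing:
            stale[array] = missing
    return stale


# The mdstat output from this article.
SAMPLE = """\
Personalities : [raid10] [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4]
md0 : active raid1 sdd1[2]
      20955008 blocks super 1.2 [2/1] [_U]

md1 : active raid1 sdd2[1]
      2097088 blocks [2/1] [_U]

md2 : active raid10 sdd3[2]
      709372928 blocks super 1.2 512K chunks 2 far-copies [2/1] [_U]
      bitmap: 0/6 pages [0KB], 65536KB chunk
"""

if __name__ == "__main__":
    # Read the live file when it is available, otherwise use the sample.
    if os.path.exists("/proc/mdstat"):
        with open("/proc/mdstat") as handle:
            text = handle.read()
    else:
        text = SAMPLE
    for array, missing in stale_members(mdstat_members(text)).items():
        print(f"{array}: listed but not present: {', '.join(missing)}")
```

On the machine described here, with sdd already detached by the kernel, this would be expected to flag sdd1, sdd2 and sdd3 even though mdstat reports all three arrays as active.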

 

