Why SMART is not smart at all and doesn't properly predict disk errors that cause a kernel panic or crash

Before getting into the output, here is my typical experience with SMART: there is what I call a "bad disk", with pending and uncorrectable sectors that cannot be reallocated.
It has repeatedly caused kernel panics and system crashes, as we can see from the logs.
But SMART says it has "PASSED" its self-assessment. SMART is still useful to me, but mostly for looking at Current_Pending_Sector.
Any time I have had anything but 0 for that attribute, the disk has turned out to be bad and unusable (e.g. it will cause kernel panics).
In this case even RAID doesn't help, because the bad disk taints the kernel.

First, let's check this disk and see what SMART thinks:

smartctl -a /dev/sda

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda ES
Device Model:     ST3750640NS
Serial Number:    ABCAEAAA
LU WWN Device Id: 5 000c50 0083422e5
Firmware Version: 3BKH
User Capacity:    750,156,374,016 bytes [750 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Thu Dec 13 12:43:37 2018 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED


ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   093   086   006    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0003   091   091   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       27
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   090   060   030    Pre-fail  Always       -       951683243
  9 Power_On_Hours          0x0032   052   052   000    Old_age   Always       -       42128
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       27
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   066   054   045    Old_age   Always       -       34 (Min/Max 28/36)
194 Temperature_Celsius     0x0022   034   046   000    Old_age   Always       -       34 (0 17 0 0 0)
195 Hardware_ECC_Recovered  0x001a   081   055   000    Old_age   Always       -       220199
197 Current_Pending_Sector  0x0012   096   096   000    Old_age   Always       -       93
198 Offline_Uncorrectable   0x0010   096   096   000    Old_age   Offline      -       93
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       971
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 Data_Address_Mark_Errs  0x0032   100   253   000    Old_age   Always       -       0
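
Since the overall "PASSED" result hides the problem, the check described above (anything but 0 in Current_Pending_Sector means trouble) can be automated. The following is only a minimal sketch: the function name `suspicious_attributes` and the choice of watched attributes are my assumptions for illustration, and it assumes the ten-column attribute table layout shown above.

```python
# Attributes to watch. This selection is an assumption based on the
# experience described in this article, not an official list.
WATCHED = (
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "Reallocated_Sector_Ct",
)


def suspicious_attributes(smartctl_output, watched=WATCHED):
    """Return {attribute_name: raw_value} for watched attributes whose
    RAW_VALUE is nonzero, given the text printed by smartctl."""
    found = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns:
        # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in watched and raw.isdigit() and int(raw) > 0:
            found[name] = int(raw)
    return found


# Three rows copied from the smartctl output above.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   096   096   000    Old_age   Always       -       93
198 Offline_Uncorrectable   0x0010   096   096   000    Old_age   Offline      -       93
"""

if __name__ == "__main__":
    for name, raw in suspicious_attributes(SAMPLE).items():
        print(f"WARNING: {name} raw value is {raw}")
```

In practice the text passed in would be the saved output of `smartctl -A /dev/sda` (or the full `-a` output), and any non-empty result would be treated as a reason to replace the disk regardless of the "PASSED" line.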

Now let's see /var/log/messages:

Dec 12 05:29:46 somepoorbox kernel: [30883839.026190] sd 0:0:0:0: [sda]  Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Dec 12 05:29:46 somepoorbox kernel: [30883839.026196] sd 0:0:0:0: [sda]  Sense Key : Medium Error [current] [descriptor]
Dec 12 05:29:46 somepoorbox kernel: [30883839.026203] Descriptor sense data with sense descriptors (in hex):
Dec 12 05:29:46 somepoorbox kernel: [30883839.026206]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Dec 12 05:29:46 somepoorbox kernel: [30883839.026215]         57 4f 86 7b
Dec 12 05:29:46 somepoorbox kernel: [30883839.026219] sd 0:0:0:0: [sda]  Add. Sense: Unrecovered read error - auto reallocate failed
Dec 12 05:29:46 somepoorbox kernel: [30883839.026225] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 57 4f 8a 43 00 03 38 00
Dec 12 05:29:46 somepoorbox kernel: [30883839.026236] end_request: I/O error, dev sda, sector 1464830531
Dec 12 05:29:46 somepoorbox kernel: [30883839.026331] block drbd0: disk( UpToDate -> Failed )
Dec 12 05:29:46 somepoorbox kernel: [30883839.026345] block drbd0: Local IO failed in __req_mod. Detaching...
Dec 12 05:29:46 somepoorbox kernel: [30883839.026365] block drbd0: helper command: /sbin/drbdadm pri-on-incon-degr minor-0
Dec 12 05:29:46 somepoorbox kernel: [30883839.026476] sd 0:0:0:0: [sda]  Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Dec 12 05:29:46 somepoorbox kernel: [30883839.026480] sd 0:0:0:0: [sda]  Sense Key : Medium Error [current] [descriptor]
Dec 12 05:29:46 somepoorbox kernel: [30883839.026485] Descriptor sense data with sense descriptors (in hex):
Dec 12 05:29:46 somepoorbox kernel: [30883839.026488]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Dec 12 05:29:46 somepoorbox kernel: [30883839.026497]         57 4f 86 7b
Dec 12 05:29:46 somepoorbox kernel: [30883839.026501] sd 0:0:0:0: [sda]  Add. Sense: Unrecovered read error - auto reallocate failed
Dec 12 05:29:46 somepoorbox kernel: [30883839.026506] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 57 4f 86 7b 00 03 c8 00
Dec 12 05:29:46 somepoorbox kernel: [30883839.026514] end_request: I/O error, dev sda, sector 1464829563
Dec 12 05:29:46 somepoorbox kernel: [30883839.026632] block drbd0: IO ERROR: neither local nor remote disk
Dec 12 05:29:46 somepoorbox kernel: [30883839.026636] ata1: EH complete
Dec 12 05:29:46 somepoorbox kernel: [30883839.026728] block drbd0: IO ERROR: neither local nor remote disk
Dec 12 05:29:46 somepoorbox kernel: [30883839.026811] block drbd0: IO ERROR: neither local nor remote disk
Dec 12 05:29:46 somepoorbox kernel: [30883839.162977] Buffer I/O error on device drbd0, logical block 53203520
Dec 12 05:29:46 somepoorbox kernel: [30883839.163110] lost page write due to I/O error on drbd0
Dec 12 05:29:46 somepoorbox kernel: [30883839.163117] Buffer I/O error on device drbd0, logical block 59744311
Dec 12 05:29:46 somepoorbox kernel: [30883839.163200] lost page write due to I/O error on drbd0
Dec 12 05:29:46 somepoorbox kernel: [30883839.163208] Buffer I/O error on device drbd0, logical block 59744312
Dec 12 05:29:46 somepoorbox kernel: [30883839.163289] lost page write due to I/O error on drbd0
Dec 12 05:29:46 somepoorbox kernel: [30883839.163299] Buffer I/O error on device drbd0, logical block 59746338
Dec 12 05:29:46 somepoorbox kernel: [30883839.163316] Buffer I/O error on device drbd0, logical block 59744312
Dec 12 05:29:46 somepoorbox kernel: [30883839.163320] lost page write due to I/O error on drbd0
Dec 12 05:29:46 somepoorbox kernel: [30883839.163328] EXT3-fs: ext3_journal_dirty_data: aborting transaction: IO failure in ext3_journal_dirty_data
Dec 12 05:29:46 somepoorbox kernel: [30883839.163336] EXT3-fs (drbd0): error in ext3_orphan_add: Readonly filesystem
Dec 12 05:29:46 somepoorbox kernel: [30883839.165257]  [] ? warn_slowpath_common+0x91/0xe0
Dec 12 05:29:46 somepoorbox kernel: [30883839.165260] EXT3-fs (drbd0): I/O error while writing superblock
Dec 12 05:29:46 somepoorbox kernel: [30883839.165280]  [] ? ext3_get_group_desc+0x51/0xa0 [ext3]
Dec 12 05:29:46 somepoorbox kernel: [30883839.165285] JBD: Spotted dirty metadata buffer (dev = drbd0, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
Dec 12 05:29:46 somepoorbox kernel: [30883839.165292]  [] ? warn_slowpath_null+0x1a/0x20
Dec 12 05:29:46 somepoorbox kernel: [30883839.165297]  [] ? mark_buffer_dirty+0x82/0xa0
Dec 12 05:29:46 somepoorbox kernel: [30883839.165316]  [] ? ext3_commit_super.clone.0+0x69/0x100 [ext3]
Dec 12 05:29:46 somepoorbox kernel: [30883839.165329]  [] ? ext3_handle_error+0x7f/0xe0 [ext3]
Dec 12 05:29:46 somepoorbox kernel: [30883839.165343]  [] ? __ext3_std_error+0x5e/0xb0 [ext3]
Dec 12 05:29:46 somepoorbox kernel: [30883839.165356]  [] ? ext3_orphan_add+0xbf/0x1a0 [ext3]
Dec 12 05:29:46 somepoorbox kernel: [30883839.165360] EXT3-fs: ext3_journal_dirty_data: aborting transaction: IO failure in ext3_journal_dirty_data
Dec 12 05:29:46 somepoorbox kernel: [30883839.165374]  [] ? journal_dirty_data_fn+0x0/0x30 [ext3]
Dec 12 05:29:46 somepoorbox kernel: [30883839.165378] EXT3-fs (drbd0): error in ext3_orphan_add: Readonly filesystem [] ? ext3_ordered_write_end+0x158/0x1c0 [ext3]
Dec 12 05:29:46 somepoorbox kernel: [30883839.165395]
Dec 12 05:29:46 somepoorbox kernel: [30883839.165400]  [] ? generic_file_buffered_write_iter+0x184/0x2b0
Dec 12 05:29:46 somepoorbox kernel: [30883839.165407]  [] ? __generic_file_write_iter+0x225/0x420
Dec 12 05:29:46 somepoorbox kernel: [30883839.165412]  [] ? __generic_file_aio_write+0x85/0xa0
Dec 12 05:29:46 somepoorbox kernel: [30883839.165417]  [] ? generic_file_aio_write+0x88/0x100
Dec 12 05:29:46 somepoorbox kernel: [30883839.165423]  [] ? do_sync_write+0xf2/0x140
Dec 12 05:29:46 somepoorbox kernel: [30883839.165432]  [] ? sys_getpeername+0xd4/0xf0
Dec 12 05:29:46 somepoorbox kernel: [30883839.165436]  [] ? vfs_write+0xb8/0x1a0
Dec 12 05:29:46 somepoorbox kernel: [30883839.165441]  [] ? fget_light_pos+0x16/0x50
Dec 12 05:29:46 somepoorbox kernel: [30883839.165445]  [] ? sys_write+0x51/0xb0
Dec 12 05:29:46 somepoorbox kernel: [30883839.165450]  [] ? __audit_syscall_exit+0x25e/0x290
Dec 12 05:29:46 somepoorbox kernel: [30883839.165455]  [] ? system_call_fastpath+0x16/0x1b
Dec 12 05:29:46 somepoorbox kernel: [30883839.165459] ---[ end trace 32aa3e2dc89d4c30 ]---
Dec 12 05:29:46 somepoorbox kernel: [30883839.165462] Tainting kernel with flag 0x9
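
The sector numbers in the `end_request` lines can be cross-checked against the `CDB: Read(10)` lines: in a Read(10) command, bytes 2-5 hold the starting LBA (big-endian) and bytes 7-8 the transfer length in blocks. The sketch below does that arithmetic and also pulls the failing sectors out of the log text. The helper names `read10_lba_and_length` and `failed_sectors` are hypothetical, and the regular expression assumes the exact `end_request` wording seen in this log.

```python
import re


def read10_lba_and_length(cdb_hex):
    """Decode a SCSI Read(10) CDB given as space-separated hex bytes.
    Returns (starting_lba, transfer_length_in_blocks)."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x28:  # 0x28 is the Read(10) opcode
        raise ValueError("not a Read(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: starting LBA
    length = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length
    return lba, length


# Matches lines such as:
#   end_request: I/O error, dev sda, sector 1464830531
SECTOR_RE = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")


def failed_sectors(log_text):
    """Return sorted, de-duplicated (device, sector) pairs from the log."""
    return sorted({(dev, int(sec)) for dev, sec in SECTOR_RE.findall(log_text)})


if __name__ == "__main__":
    # The two CDBs from the log above.
    for cdb in ("28 00 57 4f 8a 43 00 03 38 00", "28 00 57 4f 86 7b 00 03 c8 00"):
        lba, length = read10_lba_and_length(cdb)
        print(f"CDB {cdb}: LBA {lba}, {length} blocks")
```

For the first CDB, 0x574f8a43 is 1464830531, which is the sector reported in the first `end_request` line; the second, 0x574f867b, is 1464829563, matching the second one. Both reads fall in the same small region of the disk.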
