Why SMART is not smart at all and doesn't properly predict disk errors that cause a kernel panic or crash

Before getting into the output, here is my typical experience with SMART: I have what I call a "bad disk", with pending and uncorrectable sectors that cannot be reallocated.
It has repeatedly caused kernel panics and system crashes, as we can see from the logs.
Yet SMART says it has "PASSED" its overall-health self-assessment. SMART is still useful to me, but mostly for looking at Current_Pending_Sector.
Any time I have seen anything but 0 for that attribute, the disk has turned out to be bad and unusable (e.g. it will cause kernel panics).
In this case even RAID doesn't help once the bad disk's I/O errors end up tainting the kernel.

First, let's check this disk and see what SMART thinks:

smartctl -a /dev/sda

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda ES
Device Model:     ST3750640NS
Serial Number:    ABCAEAAA
LU WWN Device Id: 5 000c50 0083422e5
Firmware Version: 3BKH
User Capacity:    750,156,374,016 bytes [750 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Thu Dec 13 12:43:37 2018 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED


ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   093   086   006    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0003   091   091   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       27
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   090   060   030    Pre-fail  Always       -       951683243
  9 Power_On_Hours          0x0032   052   052   000    Old_age   Always       -       42128
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       27
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   066   054   045    Old_age   Always       -       34 (Min/Max 28/36)
194 Temperature_Celsius     0x0022   034   046   000    Old_age   Always       -       34 (0 17 0 0 0)
195 Hardware_ECC_Recovered  0x001a   081   055   000    Old_age   Always       -       220199
197 Current_Pending_Sector  0x0012   096   096   000    Old_age   Always       -       93
198 Offline_Uncorrectable   0x0010   096   096   000    Old_age   Offline      -       93
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       971
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 Data_Address_Mark_Errs  0x0032   100   253   000    Old_age   Always       -       0
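
Since the "PASSED" line cannot be trusted here, the check I actually care about can be scripted. Below is a minimal sketch, assuming Python 3.7+ and smartmontools are installed; the names nonzero_watched, check_device and WATCHED are made up for this example, and the list of watched attributes is my own choice rather than anything SMART defines. It ignores the overall result and reports any watched attribute with a non-zero raw value.

```python
import subprocess

# Attributes this article treats as the real warning signs.
# This list is an assumption for the example; adjust it to taste.
WATCHED = ("Current_Pending_Sector", "Offline_Uncorrectable")


def nonzero_watched(smartctl_output, watched=WATCHED):
    """Return {attribute_name: raw_value} for watched attributes whose raw value is not 0."""
    found = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows look like:
        # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE...
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in watched:
            raw = fields[9]
            if raw.isdigit() and int(raw) != 0:
                found[fields[1]] = int(raw)
    return found


def check_device(device):
    """Run 'smartctl -A' on a device (normally needs root) and parse the attribute table."""
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    return nonzero_watched(result.stdout)
```

Fed the attribute table above, nonzero_watched() would report both Current_Pending_Sector and Offline_Uncorrectable with a raw value of 93, even though the overall result says PASSED. Something like check_device("/dev/sda") could be run from cron to send an alert.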

Now let's look at /var/log/messages:

Dec 12 05:29:46 somepoorbox kernel: [30883839.026190] sd 0:0:0:0: [sda]  Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Dec 12 05:29:46 somepoorbox kernel: [30883839.026196] sd 0:0:0:0: [sda]  Sense Key : Medium Error [current] [descriptor]
Dec 12 05:29:46 somepoorbox kernel: [30883839.026203] Descriptor sense data with sense descriptors (in hex):
Dec 12 05:29:46 somepoorbox kernel: [30883839.026206]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Dec 12 05:29:46 somepoorbox kernel: [30883839.026215]         57 4f 86 7b
Dec 12 05:29:46 somepoorbox kernel: [30883839.026219] sd 0:0:0:0: [sda]  Add. Sense: Unrecovered read error - auto reallocate failed
Dec 12 05:29:46 somepoorbox kernel: [30883839.026225] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 57 4f 8a 43 00 03 38 00
Dec 12 05:29:46 somepoorbox kernel: [30883839.026236] end_request: I/O error, dev sda, sector 1464830531
Dec 12 05:29:46 somepoorbox kernel: [30883839.026331] block drbd0: disk( UpToDate -> Failed )
Dec 12 05:29:46 somepoorbox kernel: [30883839.026345] block drbd0: Local IO failed in __req_mod. Detaching...
Dec 12 05:29:46 somepoorbox kernel: [30883839.026365] block drbd0: helper command: /sbin/drbdadm pri-on-incon-degr minor-0
Dec 12 05:29:46 somepoorbox kernel: [30883839.026476] sd 0:0:0:0: [sda]  Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Dec 12 05:29:46 somepoorbox kernel: [30883839.026480] sd 0:0:0:0: [sda]  Sense Key : Medium Error [current] [descriptor]
Dec 12 05:29:46 somepoorbox kernel: [30883839.026485] Descriptor sense data with sense descriptors (in hex):
Dec 12 05:29:46 somepoorbox kernel: [30883839.026488]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Dec 12 05:29:46 somepoorbox kernel: [30883839.026497]         57 4f 86 7b
Dec 12 05:29:46 somepoorbox kernel: [30883839.026501] sd 0:0:0:0: [sda]  Add. Sense: Unrecovered read error - auto reallocate failed
Dec 12 05:29:46 somepoorbox kernel: [30883839.026506] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 57 4f 86 7b 00 03 c8 00
Dec 12 05:29:46 somepoorbox kernel: [30883839.026514] end_request: I/O error, dev sda, sector 1464829563
Dec 12 05:29:46 somepoorbox kernel: [30883839.026632] block drbd0: IO ERROR: neither local nor remote disk
Dec 12 05:29:46 somepoorbox kernel: [30883839.026636] ata1: EH complete
Dec 12 05:29:46 somepoorbox kernel: [30883839.026728] block drbd0: IO ERROR: neither local nor remote disk
Dec 12 05:29:46 somepoorbox kernel: [30883839.026811] block drbd0: IO ERROR: neither local nor remote disk
Dec 12 05:29:46 somepoorbox kernel: [30883839.162977] Buffer I/O error on device drbd0, logical block 53203520
Dec 12 05:29:46 somepoorbox kernel: [30883839.163110] lost page write due to I/O error on drbd0
Dec 12 05:29:46 somepoorbox kernel: [30883839.163117] Buffer I/O error on device drbd0, logical block 59744311
Dec 12 05:29:46 somepoorbox kernel: [30883839.163200] lost page write due to I/O error on drbd0
Dec 12 05:29:46 somepoorbox kernel: [30883839.163208] Buffer I/O error on device drbd0, logical block 59744312
Dec 12 05:29:46 somepoorbox kernel: [30883839.163289] lost page write due to I/O error on drbd0
Dec 12 05:29:46 somepoorbox kernel: [30883839.163299] Buffer I/O error on device drbd0, logical block 59746338
Dec 12 05:29:46 somepoorbox kernel: [30883839.163316] Buffer I/O error on device drbd0, logical block 59744312
Dec 12 05:29:46 somepoorbox kernel: [30883839.163320] lost page write due to I/O error on drbd0
Dec 12 05:29:46 somepoorbox kernel: [30883839.163328] EXT3-fs: ext3_journal_dirty_data: aborting transaction: IO failure in ext3_journal_dirty_data
Dec 12 05:29:46 somepoorbox kernel: [30883839.163336] EXT3-fs (drbd0): error in ext3_orphan_add: Readonly filesystem
Dec 12 05:29:46 somepoorbox kernel: [30883839.165257]  [] ? warn_slowpath_common+0x91/0xe0
Dec 12 05:29:46 somepoorbox kernel: [30883839.165260] EXT3-fs (drbd0): I/O error while writing superblock
Dec 12 05:29:46 somepoorbox kernel: [30883839.165280]  [] ? ext3_get_group_desc+0x51/0xa0 [ext3]
Dec 12 05:29:46 somepoorbox kernel: [30883839.165285] JBD: Spotted dirty metadata buffer (dev = drbd0, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
Dec 12 05:29:46 somepoorbox kernel: [30883839.165292]  [] ? warn_slowpath_null+0x1a/0x20
Dec 12 05:29:46 somepoorbox kernel: [30883839.165297]  [] ? mark_buffer_dirty+0x82/0xa0
Dec 12 05:29:46 somepoorbox kernel: [30883839.165316]  [] ? ext3_commit_super.clone.0+0x69/0x100 [ext3]
Dec 12 05:29:46 somepoorbox kernel: [30883839.165329]  [] ? ext3_handle_error+0x7f/0xe0 [ext3]
Dec 12 05:29:46 somepoorbox kernel: [30883839.165343]  [] ? __ext3_std_error+0x5e/0xb0 [ext3]
Dec 12 05:29:46 somepoorbox kernel: [30883839.165356]  [] ? ext3_orphan_add+0xbf/0x1a0 [ext3]
Dec 12 05:29:46 somepoorbox kernel: [30883839.165360] EXT3-fs: ext3_journal_dirty_data: aborting transaction: IO failure in ext3_journal_dirty_data
Dec 12 05:29:46 somepoorbox kernel: [30883839.165374]  [] ? journal_dirty_data_fn+0x0/0x30 [ext3]
Dec 12 05:29:46 somepoorbox kernel: [30883839.165378] EXT3-fs (drbd0): error in ext3_orphan_add: Readonly filesystem [] ? ext3_ordered_write_end+0x158/0x1c0 [ext3]
Dec 12 05:29:46 somepoorbox kernel: [30883839.165395]
Dec 12 05:29:46 somepoorbox kernel: [30883839.165400]  [] ? generic_file_buffered_write_iter+0x184/0x2b0
Dec 12 05:29:46 somepoorbox kernel: [30883839.165407]  [] ? __generic_file_write_iter+0x225/0x420
Dec 12 05:29:46 somepoorbox kernel: [30883839.165412]  [] ? __generic_file_aio_write+0x85/0xa0
Dec 12 05:29:46 somepoorbox kernel: [30883839.165417]  [] ? generic_file_aio_write+0x88/0x100
Dec 12 05:29:46 somepoorbox kernel: [30883839.165423]  [] ? do_sync_write+0xf2/0x140
Dec 12 05:29:46 somepoorbox kernel: [30883839.165432]  [] ? sys_getpeername+0xd4/0xf0
Dec 12 05:29:46 somepoorbox kernel: [30883839.165436]  [] ? vfs_write+0xb8/0x1a0
Dec 12 05:29:46 somepoorbox kernel: [30883839.165441]  [] ? fget_light_pos+0x16/0x50
Dec 12 05:29:46 somepoorbox kernel: [30883839.165445]  [] ? sys_write+0x51/0xb0
Dec 12 05:29:46 somepoorbox kernel: [30883839.165450]  [] ? __audit_syscall_exit+0x25e/0x290
Dec 12 05:29:46 somepoorbox kernel: [30883839.165455]  [] ? system_call_fastpath+0x16/0x1b
Dec 12 05:29:46 somepoorbox kernel: [30883839.165459] ---[ end trace 32aa3e2dc89d4c30 ]---
Dec 12 05:29:46 somepoorbox kernel: [30883839.165462] Tainting kernel with flag 0x9
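
The hex bytes in the "CDB: Read(10)" lines can be decoded to confirm which sectors the kernel was trying to read when the disk gave up. This is a small sketch in Python (decode_read10 is a name I made up for the example); it relies on the standard READ(10) layout, where byte 0 is the opcode 0x28, bytes 2-5 are the big-endian logical block address and bytes 7-8 are the transfer length in blocks.

```python
def decode_read10(cdb_hex):
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes.

    Returns (lba, transfer_length_in_blocks).
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")      # bytes 2-5: logical block address
    length = int.from_bytes(cdb[7:9], "big")   # bytes 7-8: number of blocks
    return lba, length


# The two CDBs from the log above.
print(decode_read10("28 00 57 4f 8a 43 00 03 38 00"))  # (1464830531, 824)
print(decode_read10("28 00 57 4f 86 7b 00 03 c8 00"))  # (1464829563, 968)
```

The decoded addresses match the sectors in the two "end_request: I/O error, dev sda, sector ..." lines, and the "57 4f 86 7b" in the sense data is the same value as the second one (0x574f867b = 1464829563).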
