high IO wait
424 root 39 19 1900 848 552 D 0.0 0.0 0:00.91 updatedb
root 424 0.0 0.0 1900 848 ? DN Mar11 0:00 /usr/bin/updatedb -f sysfs?rootfs?bdev?proc?cpuset?binfmt_misc?debugfs?sockfs?usbfs?pipefs?anon_inodefs?futexfs?tmpfs?inotifyfs?eventpollfs?devpts?ramfs?hugetlbfs?mqueue?rpc_pipefs?autofs
root 2907 0.0 0.0 1768 492 ? DN Mar15 0:00 /usr/bin/updatedb -f sysfs?rootfs?bdev?proc?cpuset?binfmt_misc?debugfs?sockfs?usbfs?pipefs?anon_inodefs?futexfs?tmpfs?inotifyfs?eventpollfs?devpts?ramfs?hugetlbfs?mqueue?rpc_pipefs?autofs
root 8691 0.0 0.0 1768 492 ? DN Mar12 0:00 /usr/bin/updatedb -f sysfs?rootfs?bdev?proc?cpuset?binfmt_misc?debugfs?sockfs?usbfs?pipefs?anon_inodefs?futexfs?tmpfs?inotifyfs?eventpollfs?devpts?ramfs?hugetlbfs?mqueue?rpc_pipefs?autofs
root 10642 0.0 0.0 4004 696 pts/0 S+ 00:38 0:00 grep updatedb
root 14937 0.0 0.0 1768 492 ? DN Mar16 0:00 /usr/bin/updatedb -f sysfs?rootfs?bdev?proc?cpuset?binfmt_misc?debugfs?sockfs?usbfs?pipefs?anon_inodefs?futexfs?tmpfs?inotifyfs?eventpollfs?devpts?ramfs?hugetlbfs?mqueue?rpc_pipefs?autofs
root 16697 0.0 0.0 1768 488 ? DN Mar13 0:00 /usr/bin/updatedb -f sysfs?rootfs?bdev?proc?cpuset?binfmt_misc?debugfs?sockfs?usbfs?pipefs?anon_inodefs?futexfs?tmpfs?inotifyfs?eventpollfs?devpts?ramfs?hugetlbfs?mqueue?rpc_pipefs?autofs
root 18790 0.0 0.0 1768 496 ? DN Mar18 0:00 /usr/bin/updatedb -f sysfs?rootfs?bdev?proc?cpuset?binfmt_misc?debugfs?sockfs?usbfs?pipefs?anon_inodefs?futexfs?tmpfs?inotifyfs?eventpollfs?devpts?ramfs?hugetlbfs?mqueue?rpc_pipefs?autofs
root 24783 0.0 0.0 1768 500 ? DN Mar14 0:00 /usr/bin/updatedb -f sysfs?rootfs?bdev?proc?cpuset?binfmt_misc?debugfs?sockfs?usbfs?pipefs?anon_inodefs?futexfs?tmpfs?inotifyfs?eventpollfs?devpts?ramfs?hugetlbfs?mqueue?rpc_pipefs?autofs
root 24922 0.0 0.0 1768 492 ? DN Mar17 0:00 /usr/bin/updatedb -f sysfs?rootfs?bdev?proc?cpuset?binfmt_misc?debugfs?sockfs?usbfs?pipefs?anon_inodefs?futexfs?tmpfs?inotifyfs?eventpollfs?devpts?ramfs?hugetlbfs?mqueue?rpc_pipefs?autofs
root 29655 0.0 0.0 1768 496 ? DN Mar19 0:00 /usr/bin/updatedb -f sysfs?rootfs?bdev?proc?cpuset?binfmt_misc?debugfs?sockfs?usbfs?pipefs?anon_inodefs?futexfs?tmpfs?inotifyfs?eventpollfs?devpts?ramfs?hugetlbfs?mqueue?rpc_pipefs?autofs
I noticed that updatedb was running for days and then this dmesg message:
(manually killing the process and also killall updatedb does not stop it or kill it)
INFO: task updatedb:424 blocked for more than 300 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
updatedb D F79C8CF0 6476 424 419 (NOTLB)
e740bdd4 00000086 c30045c0 000e2832 f7acea4c 00000001 f79c8cf0 c30f8800
000e2832 000f4240 556bd125 0024f991 f79c8e14 f7d14000 556bf321 0024f991
00000002 00000000 00000001 00000000 c0419058 f7d14000 c06ecd80 f7a0cdec
Call Trace:
[<c0419058>] __wake_up+0x2a/0x3d
[<c0625676>] io_schedule+0x3b/0x64
[<c0460ead>] sync_page+0x0/0x3b
[<c0460ee5>] sync_page+0x38/0x3b
[<c062578e>] __wait_on_bit_lock+0x2a/0x52
[<c0460e28>] __lock_page+0x52/0x59
[<c0436000>] wake_bit_function+0x0/0x3c
[<c04686c9>] truncate_inode_pages_range+0x20b/0x260
[<f8c698d3>] journal_stop+0x208/0x213 [jbd]
[<c0468727>] truncate_inode_pages+0x9/0xe
[<f8ca5a1e>] ext3_delete_inode+0x13/0xba [ext3]
[<f8ca5a0b>] ext3_delete_inode+0x0/0xba [ext3]
[<c0499c51>] generic_delete_inode+0x91/0xfe
[<c0499728>] iput+0x67/0x69
[<c04970dd>] d_kill+0x19/0x32
[<c049821d>] dput+0x19f/0x1ac
[<c04909c3>] sys_renameat+0x15f/0x1af
[<c04ebb8e>] _atomic_dec_and_lock+0x2a/0x44
[<c0453cf8>] audit_syscall_entry+0x160/0x192
[<c0490a24>] sys_rename+0x11/0x15
[<c06289e7>] syscall_call+0x7/0xb
=======================
kill -9 424
strace -p 424
Process 424 attached - interrupt to quit
attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted
==========================
Eventually the system crashed with references to ext3 and inodes etc..
After a reboot it did the following:
md: md0: raid array is not clean -- starting background reconstruction
raid1: raid set md0 active with 2 out of 2 mirrors
md: ... autorun DONE.
md: syncing RAID array md0
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
md: using 128k window, over a total of 511999424 blocks.
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting. Commit interval 5 seconds
EXT3-fs: md0: orphan cleanup on readonly fs
ext3_orphan_cleanup: deleting unreferenced inode 97467990
ext3_orphan_cleanup: deleting unreferenced inode 76153118
EXT3-fs: md0: 2 orphan inodes deleted
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
linux, iowait, updatedb, mdadmhigh, io, dn, mar, usr, bin, sysfs, rootfs, bdev, proc, cpuset, binfmt_misc, debugfs, sockfs, usbfs, pipefs, anon_inodefs, futexfs, tmpfs, inotifyfs, eventpollfs, devpts, ramfs, hugetlbfs, mqueue, rpc_pipefs, autofs, pts, grep, dmesg, manually, killall, info, task, blocked, quot, echo, sys, kernel, hung_task_timeout_secs, disables, cf, notlb, bdd, acea, bd, bf, ecd, cdec, __wake_up, io_schedule, ead, sync_page, ee, __wait_on_bit_lock, __lock_page, wake_bit_function, truncate_inode_pages_range, journal_stop, jbd, truncate_inode_pages, xe, ca, ext, _delete_inode, xba, generic_delete_inode, xfe, iput, dd, d_kill, dput, ac, sys_renameat, af, ebb, _atomic_dec_and_lock, audit_syscall_entry, sys_rename, syscall_call, xb, strace, attach, ptrace, ptrace_attach, permitted, references, inodes, etc, reboot, md, raid, array, reconstruction, active, mirrors, autorun, syncing, autodetecting, arrays, _guaranteed_, kb, disc, maximum, idle, bandwidth, fs, readonly, filesystem, enabled, kjournald, interval, orphan, cleanup, _orphan_cleanup, deleting, unreferenced, inode, deleted, mounted, mode,