ZFS and LUKS: How to Recover a Pool in Linux

Sometimes users unplug and replug their removable drives to test what happens when a disk fails. With LUKS this breaks things quite badly: the old /dev/mapper mapping was never closed, so the encrypted device does not come back online when the disk is reconnected.

In other words, with non-encrypted drives the process is generally smooth, but with encrypted drives you may want to follow a strategy like this:

We can see below that both disks are unavailable as they were physically removed from the server.

zpool status

  pool: rttpool
 state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://zfsonlinux.org/msg/ZFS-8000-HC
  scan: none requested
config:

    NAME            STATE     READ WRITE CKSUM
    rttpool         UNAVAIL      0     0     0  insufficient replicas
      mirror-0      UNAVAIL      0     0     0  insufficient replicas
        zpool-sdj1  FAULTED      0     0     0  corrupted data
        zpool-sdk1  FAULTED      0     0     0  corrupted data
errors: List of errors unavailable: pool I/O is currently suspended
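
When checking this from a script rather than by eye, the state line can be pulled out of the zpool status output. The following is a minimal sketch: the function name pool_state is hypothetical, and it assumes the output layout shown above.

```shell
#!/bin/sh
# pool_state (hypothetical helper): read `zpool status` output on stdin
# and print the value of the first "state:" line.
pool_state() {
    awk '$1 == "state:" { print $2; exit }'
}

# Example, assuming the pool name from this article:
#   zpool status rttpool | pool_state
```

For the output above, the state line reads UNAVAIL, so that is what the helper would print.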


Conventional wisdom says to clear the errors after replugging the disks, but does this work with LUKS?


root@rttbox:/home/rtt# zpool clear rttpool zpool-sdj1
cannot clear errors for zpool-sdj1: I/O error
root@rttbox:/home/rtt# zpool clear rttpool zpool-sdj
cannot clear errors for zpool-sdj: no such device in pool
root@rttbox:/home/rtt# zpool clear rttpool zpool-sdj1
cannot clear errors for zpool-sdj1: I/O error
root@rttbox:/home/rtt# zpool clear rttpool zpool-sdk1
cannot clear errors for zpool-sdk1: I/O error


As we can see, no, it doesn't work.

Sometimes we may need to remove zpool.cache.

# At your own risk: do not try this in production, and not as a first resort.

rm /etc/zfs/zpool.cache
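
A more cautious variant, offered here as an assumption rather than as part of the steps above, is to move the cache file aside instead of deleting it, so it can be put back if needed. The helper name and the .bak suffix are hypothetical.

```shell
#!/bin/sh
# stash_zpool_cache (hypothetical helper): rename the cache file instead of
# deleting it. The path defaults to the one used in this article.
stash_zpool_cache() {
    cache="${1:-/etc/zfs/zpool.cache}"
    if [ -f "$cache" ]; then
        mv -- "$cache" "$cache.bak" && echo "moved $cache to $cache.bak"
    else
        echo "no cache file at $cache" >&2
        return 1
    fi
}

# Example (run as root):
#   stash_zpool_cache
```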

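Now let's try to force clear the pool. Note that -n combined with -F only checks whether recovery would succeed; it does not actually change anything in the pool.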


 zpool clear -nF rttpool
root@rttbox:/home/rtt# zpool status
  pool: rttpool
 state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://zfsonlinux.org/msg/ZFS-8000-HC
  scan: none requested
config:

    NAME            STATE     READ WRITE CKSUM
    rttpool         UNAVAIL      0     0     0  insufficient replicas
      mirror-0      UNAVAIL      0     0     0  insufficient replicas
        zpool-sdj1  FAULTED      0     0     0  too many errors
        zpool-sdk1  FAULTED      0     0     0  too many errors
errors: List of errors unavailable: pool I/O is currently suspended

As we can see, it still doesn't work.

How about clearing the individual devices in the pool?


root@rttbox:/home/rtt# zpool clear -nF rttpool zpool-sdj1
root@rttbox:/home/rtt# zpool clear -nF rttpool zpool-sdk1


root@rttbox:/home/rtt# zpool online rttpool zpool-sdk1
cannot online zpool-sdk1: pool I/O is currently suspended
root@rttbox:/home/rtt# zpool online rttpool zpool-sdj1
cannot online zpool-sdj1: pool I/O is currently suspended

This still doesn't fix it.
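
At this point it can help to confirm that leftover LUKS mappings are still registered. The sketch below lists device-mapper nodes that follow this article's zpool-* naming convention. The helper name list_pool_mappings is hypothetical, and the directory argument exists only so the function can be pointed somewhere other than /dev/mapper.

```shell
#!/bin/sh
# list_pool_mappings (hypothetical helper): print the names of device-mapper
# nodes matching this article's zpool-* naming convention.
# The directory argument defaults to /dev/mapper.
list_pool_mappings() {
    dir="${1:-/dev/mapper}"
    for node in "$dir"/zpool-*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$node" ] || [ -L "$node" ] || continue
        basename "$node"
    done
}

# Example (run as root): show what each leftover mapping points at.
#   for name in $(list_pool_mappings); do cryptsetup status "$name"; done
```

If a mapping is still listed after its disk was pulled, it was never closed, which matches the cause described at the top of this article.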

The Actual Solution

Use cryptsetup to properly close the stale mappings for the pool devices.

cryptsetup close zpool-sdj1
cryptsetup close zpool-sdk1

Now reopen the devices:

cryptsetup open /dev/sdj1 zpool-sdj1
cryptsetup open /dev/sdk1 zpool-sdk1


# once both devices have been reopened, clear the pool
zpool clear -nFX rttpool


Now it works!


 zpool status
  pool: rttpool
 state: ONLINE
  scan: resilvered 160K in 0h0m with 0 errors on Thu Feb  1 23:01:59 2024
config:

    NAME            STATE     READ WRITE CKSUM
    rttpool         ONLINE       0     0     0
      mirror-0      ONLINE       0     0     0
        zpool-sdj1  ONLINE       0     0     0
        zpool-sdk1  ONLINE       0     0     0

errors: No known data errors
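
The close, reopen, and clear sequence above can be sketched as a small helper that only prints the commands, so they can be reviewed before anything is run. The helper name luks_recover_plan is hypothetical, and it assumes the naming convention used in this article, where /dev/sdj1 maps to zpool-sdj1.

```shell
#!/bin/sh
# luks_recover_plan (hypothetical helper): print, without running them, the
# commands from the steps above for a pool and its LUKS partitions.
# Assumes the article's naming convention: /dev/sdj1 maps to zpool-sdj1.
luks_recover_plan() {
    pool="$1"
    shift
    # Close every stale mapping first...
    for part in "$@"; do
        echo "cryptsetup close zpool-$(basename "$part")"
    done
    # ...then reopen each partition under the same mapping name.
    for part in "$@"; do
        echo "cryptsetup open $part zpool-$(basename "$part")"
    done
    # Finally, the same clear command used above.
    echo "zpool clear -nFX $pool"
}

luks_recover_plan rttpool /dev/sdj1 /dev/sdk1
```

The printed commands are meant to be run by hand as root, since cryptsetup open will ask for the passphrase. A replugged disk may also come back under a different device name, so it is worth checking the partition paths before reopening.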

 

How To Recover After Moving ZFS Disks to a New Computer/Server

Once your disks are connected (and, for encrypted disks, the LUKS devices have been opened), zpool import will scan the available devices looking for ZFS pools.

zpool import
   pool: rttpool
     id: 125324434212034535323
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:

    rttpool         ONLINE
      mirror-0      ONLINE
        zpool-sdc1  ONLINE
        zpool-sdd1  ONLINE

 

Now just import it by its numeric ID (125324434212034535323) or by the pool name (rttpool):

zpool import 125324434212034535323

 

