ZFS and LUKS: How to Recover in Linux

Sometimes users unplug and replug their removable drives to test what happens when a disk fails.  With LUKS this breaks things quite badly: the /dev/mapper device does not come back online, because the old mapping was never closed.

In other words, the process is generally smooth with non-encrypted drives, but with encrypted drives you may want to follow a strategy like the one below.

We can see below that both disks are unavailable as they were physically removed from the server.

zpool status

  pool: rttpool
 state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://zfsonlinux.org/msg/ZFS-8000-HC
  scan: none requested
config:

    NAME            STATE     READ WRITE CKSUM
    rttpool         UNAVAIL      0     0     0  insufficient replicas
      mirror-0      UNAVAIL      0     0     0  insufficient replicas
        zpool-sdj1  FAULTED      0     0     0  corrupted data
        zpool-sdk1  FAULTED      0     0     0  corrupted data
errors: List of errors unavailable: pool I/O is currently suspended
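
If we want to spot this state from a script rather than by reading the full status output, a minimal sketch is to filter the scripted output of zpool list. The function name unhealthy_pools and the second pool name tank are made up for illustration; only rttpool comes from this article, and the exact health string for a suspended pool may differ between ZFS versions, which is why the filter simply reports anything that is not ONLINE.

```shell
#!/bin/sh
# Print the name of every pool whose health is not ONLINE.
# Expects the tab-separated output of: zpool list -H -o name,health
unhealthy_pools() {
    awk -F '\t' '$2 != "ONLINE" { print $1 }'
}

# On a live system:
#   zpool list -H -o name,health | unhealthy_pools
# Sample input ("tank" is a hypothetical second pool), so the sketch
# runs without ZFS installed:
printf 'rttpool\tUNAVAIL\ntank\tONLINE\n' | unhealthy_pools
```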


Conventional wisdom says to clear the errors after replugging the disks, but does this work with LUKS?


root@rttbox:/home/rtt# zpool clear rttpool zpool-sdj1
cannot clear errors for zpool-sdj1: I/O error
root@rttbox:/home/rtt# zpool clear rttpool zpool-sdj
cannot clear errors for zpool-sdj: no such device in pool
root@rttbox:/home/rtt# zpool clear rttpool zpool-sdj1
cannot clear errors for zpool-sdj1: I/O error
root@rttbox:/home/rtt# zpool clear rttpool zpool-sdk1
cannot clear errors for zpool-sdk1: I/O error


As we can see, no, it doesn't work.
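
The likely reason is the one mentioned at the top: the LUKS mappings were never closed, so the names under /dev/mapper may still exist while pointing at devices that are gone. A rough way to check whether the mappings are still registered is sketched below. The MAPPER_DIR variable and the mapping_exists helper are assumptions added for illustration (the variable only exists so the check can be pointed at another directory); cryptsetup status is the real command to inspect a mapping further.

```shell
#!/bin/sh
# Directory where device-mapper nodes live; overridable for testing.
MAPPER_DIR="${MAPPER_DIR:-/dev/mapper}"

# Succeeds if a mapping with the given name is still registered.
mapping_exists() {
    [ -e "$MAPPER_DIR/$1" ] || [ -L "$MAPPER_DIR/$1" ]
}

# Mapping names as used in this article.
for name in zpool-sdj1 zpool-sdk1; do
    if mapping_exists "$name"; then
        echo "$name: mapping still open (may be stale; inspect with: cryptsetup status $name)"
    else
        echo "$name: no mapping"
    fi
done
```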

Sometimes we may need to remove zpool.cache.

# At your own risk: do not try this in production, and never as a first resort.

rm /etc/zfs/zpool.cache
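
A more cautious variant, assuming we would rather keep a copy than delete the file outright, is to move the cache file aside with a timestamp. The set_aside helper below is a hypothetical name, not part of ZFS.

```shell
#!/bin/sh
# Move a file aside with a timestamp suffix instead of deleting it.
# Prints the backup path on success.
set_aside() {
    src=$1
    [ -e "$src" ] || { echo "nothing to move: $src" >&2; return 1; }
    dst="$src.$(date +%Y%m%d-%H%M%S).bak"
    mv -- "$src" "$dst" && echo "$dst"
}

# Usage on a live system (as root):
#   set_aside /etc/zfs/zpool.cache
```

If removing the cache turns out not to help, the file can simply be moved back to its original name.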

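Now let's try a forced clear of the pool. Note that -n turns the -F recovery into a dry run: it only checks whether discarding transactions would help, without actually discarding any.

root@rttbox:/home/rtt# zpool clear -nF rttpool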
root@rttbox:/home/rtt# zpool status
  pool: rttpool
 state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://zfsonlinux.org/msg/ZFS-8000-HC
  scan: none requested
config:

    NAME            STATE     READ WRITE CKSUM
    rttpool         UNAVAIL      0     0     0  insufficient replicas
      mirror-0      UNAVAIL      0     0     0  insufficient replicas
        zpool-sdj1  FAULTED      0     0     0  too many errors
        zpool-sdk1  FAULTED      0     0     0  too many errors
errors: List of errors unavailable: pool I/O is currently suspended

As we can see, it still doesn't work.

How about clearing the individual devices in the pool?


root@rttbox:/home/rtt# zpool clear -nF rttpool zpool-sdj1
root@rttbox:/home/rtt# zpool clear -nF rttpool zpool-sdk1


root@rttbox:/home/rtt# zpool online rttpool zpool-sdk1
cannot online zpool-sdk1: pool I/O is currently suspended
root@rttbox:/home/rtt# zpool online rttpool zpool-sdj1
cannot online zpool-sdj1: pool I/O is currently suspended

That still doesn't fix it.

The Actual Solution

Use cryptsetup to properly close the stale LUKS mappings for the pool's devices:

cryptsetup close zpool-sdj1
cryptsetup close zpool-sdk1

Now reopen the devices:

cryptsetup open /dev/sdj1 zpool-sdj1
cryptsetup open /dev/sdk1 zpool-sdk1


# after cryptsetup open has succeeded for both devices, clear the pool
zpool clear -nFX rttpool


Now it works!


zpool status
  pool: rttpool
 state: ONLINE
  scan: resilvered 160K in 0h0m with 0 errors on Thu Feb  1 23:01:59 2024
config:

    NAME            STATE     READ WRITE CKSUM
    rttpool         ONLINE       0     0     0
      mirror-0      ONLINE       0     0     0
        zpool-sdj1  ONLINE       0     0     0
        zpool-sdk1  ONLINE       0     0     0

errors: No known data errors
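
The close, reopen and clear sequence above can be sketched as a small script. This is only a sketch under a few assumptions: the mapping names follow this article's zpool-<partition> pattern, the recover_pool and run helpers and the DRY_RUN switch are made up for illustration, and cryptsetup open will prompt for the passphrase unless a key file is configured. By default it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch of the close / reopen / clear sequence for a LUKS-backed pool.
# DRY_RUN=1 (the default here) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# recover_pool POOL DEVICE...
# Each DEVICE (e.g. /dev/sdj1) maps to /dev/mapper/zpool-<basename>.
recover_pool() {
    pool=$1
    shift
    # Close every stale mapping first.
    for dev in "$@"; do
        run cryptsetup close "zpool-$(basename "$dev")" || return 1
    done
    # Reopen them against the replugged partitions.
    for dev in "$@"; do
        run cryptsetup open "$dev" "zpool-$(basename "$dev")" || return 1
    done
    # Same clear command as used in this article.
    run zpool clear -nFX "$pool"
}

recover_pool rttpool /dev/sdj1 /dev/sdk1
```

Running it with DRY_RUN=0 as root would execute the commands for real; the device paths may differ after replugging, so check them with lsblk first.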

 

How to Recover After Moving ZFS Disks to a New Computer/Server

If your disks are ready to go (for LUKS-encrypted disks, that means already unlocked with cryptsetup open), zpool import will scan all disks looking for ZFS pools.

zpool import
   pool: rttpool
     id: 125324434212034535323
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:

    rttpool         ONLINE
      mirror-0      ONLINE
        zpool-sdc1  ONLINE
        zpool-sdd1  ONLINE

 

Now just import it by the numeric ID 125324434212034535323 or by the pool name rttpool:

zpool import 125324434212034535323
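
For a LUKS-backed pool the partitions have to be unlocked on the new machine before the import can find anything. A minimal sketch of that order of operations follows, assuming the partitions show up as /dev/sdc1 and /dev/sdd1 (matching the mapping names in the output above). The mapper_name and import_plan helpers are hypothetical; the script only prints the commands for review rather than running them.

```shell
#!/bin/sh
# Derive the mapper name used in this article from a partition path:
# /dev/sdc1 -> zpool-sdc1
mapper_name() {
    echo "zpool-$(basename "$1")"
}

# Print the commands to unlock each partition and then import the pool.
# Review the output before running any of it as root.
import_plan() {
    pool=$1
    shift
    for dev in "$@"; do
        echo "cryptsetup open $dev $(mapper_name "$dev")"
    done
    # -d limits the device search to the unlocked mappings.
    echo "zpool import -d /dev/mapper $pool"
}

import_plan rttpool /dev/sdc1 /dev/sdd1
```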

 

