WiP: Cryptsetup version bump reencryption cleanup (LUKS2 reencryption impossible otherwise on Q4.2 and others) #1541

Draft
wants to merge 58 commits into
base: master

tlaurion
Collaborator

@tlaurion tlaurion commented Nov 28, 2023

I finally got a grip on where the problem discussed under #1539 stems from:

  • cryptsetup requires async/sync ops from the kernel (removed directio ops)
    • cryptsetup uses direct-io when in offline mode without locking; these are new calls, changed in the luks helper script
  • libaio is a new hard dependency of lvm2 (hacking needed on the lvm2 side to remove the dependency)
  • kernel AIO is needed, otherwise a warning is emitted (let's see later whether that is verbose outside debug mode)
  • cryptsetup also requires a newer libdevmapper and dmsetup to deal with that
  • those are provided by newer lvm2 binaries, including blkid and dmsetup itself
    • which required a newer util-linux version to provide them

Todo:

  • Make sure kernel crypto backend requirements are as small as needed
  • Review patches and clean them
  • Remove initrd/test_reencrypt_ram.sh when ramfs raw disk reencryption reaches normal speed
  • test on real hardware
  • legacy boards are now deprecated; refactoring of /etc/ash_functions to depend only on bash can now occur
  • documentation for the deprecation of legacy boards
  • open other issues, including newer distros being unable to have their filesystems reencrypted when this is done through cryptsetup 2.6.1
  • Cloudflare optimizations passed down from cryptsetup calls (Choose stronger encryption by default and/or re-use encryption parameters of LUKS container #1539 (comment)): still needed?

  • Publicly disclose the needed firmware upgrade/kernel commit bump downstream. That this went unnoticed confirms that no one reencrypted a Q4.2/Q4.2.1 installation until I discovered this. I can only repeat this: an OEM not pushing users to reencrypt their OEM-encrypted drives means that a malicious worker could back up the LUKS header of every laptop shipped with a preinstalled OS and sell those header backups at high prices. After the theft of a daily-used laptop, the OEM-provisioned DRK passphrase could then be reused to access FDE content supposedly protected by encryption. The only way to completely transfer ownership of a laptop from the OEM to the end user is to rerun the Re-Ownership wizard and accept reencrypting the disk. DRK != DRK passphrase: changing the DRK passphrase does not prevent a restored LUKS header backup from letting the old DRK passphrase decrypt the FDE content.

@tlaurion
Collaborator Author

tlaurion commented Nov 28, 2023

@UndeadDevel: here is my log from the test_reencrypt_ram.sh output at bf2891c, automatically dropped onto the USB thumb drive:

ram_reencrypt.log

Now at least we have logs to share to follow the rabbit down into the rabbit hole. Will get back to this and analyse later.

Direct excerpt showing how this is suboptimal:

# /tmp/disk8gb.raw is not a block device. Can't open in exclusive mode.
# Reusing open ro fd on device /tmp/disk8gb.raw
# Reusing open ro fd on device /tmp/disk8gb.raw
# Could not initialize userspace block cipher and kernel fallback is disabled.
# Failed to initialize userspace block cipher.
# Using temporary dmcrypt to access data.
# Allocating a free loop device (block size: 4096).
# Trying to open device /dev/loop0 without direct-io.
# Attached loop device block size is 4096 bytes.
# Calculated device size is 16293896 sectors (RW), offset 32768.
# DM-UUID is CRYPT-TEMP-temporary-cryptsetup-1215-0
# dm create temporary-cryptsetup-1215-0 CRYPT-TEMP-temporary-cryptsetup-1215-0 [ opencount flush ]   [16384] (*1)
# dm reload   (253:0) [ opencount flush securedata ]   [16384] (*1)
# dm resume temporary-cryptsetup-1215-0  [ opencount flush securedata ]   [16384] (*1)
# temporary-cryptsetup-1215-0: Stacking NODE_ADD (253,0) 0:0 0600
# temporary-cryptsetup-1215-0: Stacking NODE_READ_AHEAD 256 (flags=1)
# temporary-cryptsetup-1215-0: Processing NODE_ADD (253,0) 0:0 0600
# Created /dev/mapper/temporary-cryptsetup-1215-0
# temporary-cryptsetup-1215-0: Processing NODE_READ_AHEAD 256 (flags=1)
# temporary-cryptsetup-1215-0 (253:0): read ahead is 256
# temporary-cryptsetup-1215-0: retaining kernel read ahead of 256 (requested 256)
# Old cipher storage wrapper type: 2.
# Reusing open rw fd on device /tmp/disk8gb.raw
# Could not initialize userspace block cipher and kernel fallback is disabled.
# Failed to initialize userspace block cipher.
# Using temporary dmcrypt to access data.
# Calculated device size is 16293896 sectors (RW), offset 32768.
# DM-UUID is CRYPT-TEMP-temporary-cryptsetup-1215-1
# dm create temporary-cryptsetup-1215-1 CRYPT-TEMP-temporary-cryptsetup-1215-1 [ opencount flush ]   [16384] (*1)
# dm reload   (253:1) [ opencount flush securedata ]   [16384] (*1)
# dm resume temporary-cryptsetup-1215-1  [ opencount flush securedata ]   [16384] (*1)
# temporary-cryptsetup-1215-1: Stacking NODE_ADD (253,1) 0:0 0600
# temporary-cryptsetup-1215-1: Stacking NODE_READ_AHEAD 256 (flags=1)
# temporary-cryptsetup-1215-1: Processing NODE_ADD (253,1) 0:0 0600
# Created /dev/mapper/temporary-cryptsetup-1215-1
# temporary-cryptsetup-1215-1: Processing NODE_READ_AHEAD 256 (flags=1)
# temporary-cryptsetup-1215-1 (253:1): read ahead is 256
# temporary-cryptsetup-1215-1: retaining kernel read ahead of 256 (requested 256)
# New cipher storage wrapper type: 2
Resuming LUKS reencryption in forced offline mode.
# Installing SIGINT/SIGTERM handler.
# Unblocking interruption on signal.
# LUKS2 requirements detected:
# online-reencrypt-v2 - known
# LUKS2 requirements detected:
# online-reencrypt-v2 - known
# Resuming LUKS2 reencryption.
# Progress 0, device_size 8342474752
# Calculating segments.
# Calculating hot segments (forward direction).
# Calculating post segments (forward direction).
# Setting 'hot' segments.
# Segment 0 assigned to digest 1.
# Segment 0 assigned to digest 0.
# Segment 1 assigned to digest 0.
# Segment 2 assigned to digest 0.
# Segment 3 assigned to digest 1.
# Reencrypting chunk starting at offset: 0, size :1073741824.
# LUKS2 requirements detected:
# online-reencrypt-v2 - known
# data_offset: 16777216
# Allocating buffer for storing resilience checksums.
# Checksums hotzone resilience.
# Going to store 8388608 bytes in reencrypt keyslot.
# Reencrypt keyslot 2 store.
# Reusing open rw fd on device /tmp/disk8gb.raw
# LUKS2 requirements detected:
# online-reencrypt-v2 - known
# LUKS2 requirements detected:
# online-reencrypt-v2 - known
# Device size 8359251968, offset 16777216.
# Trying to write LUKS2 header (16384 bytes) at offset 0.
# Reusing open rw fd on device /tmp/disk8gb.raw
# Checksum:fb73f8bc2a92da33ea5e4f73f51afdec9bbb49d60bc69adebd2766f08a89258d (in-memory)
# Trying to write LUKS2 header (16384 bytes) at offset 16384.
# Reusing open rw fd on device /tmp/disk8gb.raw
# Checksum:d19a0c0ee6472baa03b6b04eee2b5103d3b7b07e42b86f6682622d5dec2a8758 (in-memory)
# Setting 'post' segments.
# Segment 0 assigned to digest 1.
# Segment 1 assigned to digest 0.
# Segment 2 assigned to digest 0.
# Segment 3 assigned to digest 1.
# LUKS2 requirements detected:
# online-reencrypt-v2 - known
# LUKS2 requirements detected:
# online-reencrypt-v2 - known
# Device size 8359251968, offset 16777216.
# Trying to write LUKS2 header (16384 bytes) at offset 0.
# Reusing open rw fd on device /tmp/disk8gb.raw

@tlaurion tlaurion force-pushed the cryptsetup_version_bump-reencryption_cleanup branch 2 times, most recently from 8048f06 to 4cb56d4 Compare November 28, 2023 20:38
@tlaurion
Collaborator Author

Of course, the exploding kernel dependencies and newer non-optimized libs/binaries break compatibility with legacy boards, as seen in the CircleCI checks here.

Will tackle that later, but the result is the same as before on block devices, and the same as for the ram.raw disk in prior tests:

# Could not initialize userspace block cipher and kernel fallback is disabled.
# Failed to initialize userspace block cipher.

Even though

# Verifying key from keyslot 1, digest 1.
# Checking if cipher aes-xts-plain64 is usable (storage wrapper).
# Initializing reencryption (mode: reencrypt) in LUKS2 metadata.

Definitely missing something here. Will sleep on it.

@tlaurion
Collaborator Author

Not sure what to do with the output of the block device reencryption log from 4cb56d4
block_reencrypt.log

@tlaurion tlaurion force-pushed the cryptsetup_version_bump-reencryption_cleanup branch from e110960 to 302452d Compare November 29, 2023 06:55
@tlaurion
Collaborator Author

tlaurion commented Nov 29, 2023

@UndeadDevel had an insight about checking the resilience options more closely. Under 2.7.0RC0 (unreleased: tarballs don't contain configure or the other files we patch for cross-compiling, so we need to wait anyway, and I do not really want an RC0 of anything) resilience speed can be improved, but for 2.6.1 --resilience=none brings us to ~normal speeds. For me this is around 110 MiB/s. When reencryption was added to Heads, my lower tolerance bar was 100 MiB/s: I tweaked on multiple SSDs to sustain that on most of the SSDs I could find around, whereas low-end Chinese drives (shame on you, FANXIANG S101, provided by cheap refurbishers out there) reached 37 MiB/s. I restate: no. We cannot afford having users reencrypt Heads-flashed devices bought from third parties and get criticism of that kind (went through that already: no thanks).

Tests are satisfactory under 302452d

TODO next:

  • cleanup of kernel Kconfig options added vs master, to reduce them to the bare minimum. Still not convinced about adding suboptimal options, even for older hardware, which would benefit from aes-xts-plain64 anyway. I would veto this once again, but would be willing to allow Adiantum (or whatever it's called) if legacy boards can still build (yes, there are 3rd gen i3s out there. Once again, been there, done that on x230, so I do not want to go there again.)

The issue reported above is still a mystery though:

># Could not initialize userspace block cipher and kernel fallback is disabled.
># Failed to initialize userspace block cipher.

Edit:
With --resilience=none, the issue of locking simply vanishes altogether and the errors about ciphers as well.

@UndeadDevel
Contributor

Just after your post 2.7.0-rc0 has been announced as released; the changelog is mostly about OPAL support, but the following bits also seem interesting for the issue at hand:

Support OpenSSL 3.2 Argon2 implementation.

  Argon2 is now available directly in OpenSSL, so the code no longer
  needs to use libargon implementation.
  Configure script should detect this automatically.

* Add support for Argon2 from libgcrypt
  (requires yet unreleased gcrypt 1.11).

  Argon2 has been available since version 1.10, but we need version 1.11,
  which will allow empty passwords.

* Link only libcrypto from OpenSSL.

  This reduces dependencies as other OpenSSL libraries are not needed.

Regarding resilience: it seems that in offline mode interrupting a --resilience=none reencryption is not necessarily unsafe, so if it really gives such a huge performance boost (I'm surprised) then I guess it makes sense. This also further justifies modifying the warning message, as I talked about in the issue thread (PR coming in a few days, hopefully).

@tlaurion
Collaborator Author

tlaurion commented Nov 29, 2023

Regarding resilience: it seems that in offline mode interrupting a --resilience=none reencryption is not necessarily unsafe, so if it really gives such a huge performance boost (I'm surprised) then I guess it makes sense.

--resilience=none removes the hotzone concept from reencryption. This means there is no block checksumming, and no possibility of resuming reencryption after a power failure. Instead of locking/creating a hotzone block at the end of the LUKS container, checksumming it and moving the newly encrypted block, cryptsetup rewrites blocks inline, at the cost of an interrupted reencryption not being resumable. This roughly doubles the speed.

Other notes from last commit:

--resilience=none : no hotzone created at the end of the container to write checksum values, and no reopening of the LUKS device between block rewrites. No possibility of resuming an interrupted reencryption
--disable-locks : skips the additional lock + check, assumes exclusive mode
--force-offline-reencrypt : don't even try online mode
--perf-no_read_workqueue : bypass async AIO read pooling (thanks to the Cloudflare patchset)
--perf-no_write_workqueue : bypass async AIO write pooling (thanks to the Cloudflare patchset)
--perf-submit_from_crypt_cpus : use a single CPU for crypt ops to limit cache misses. Forces a dmsetup-compatible implementation
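
The flags listed above can be combined into a single call. What follows is a minimal sketch, not the code from this PR: `build_reencrypt_cmd` is a hypothetical helper name, the device and key file paths are placeholders, and the helper only prints the command line instead of running it. The flag names are the ones from the list above; that `cryptsetup reencrypt` accepts all of them together in a given version is assumed from this thread, not verified here. `--perf-submit_from_crypt_cpus` is left out on purpose to keep the example short.

```shell
#!/bin/sh
# Hypothetical helper, not part of Heads or cryptsetup.
# Prints (does not execute) a reencrypt command line using the flags above.
# Usage: build_reencrypt_cmd <device> <key-file>
build_reencrypt_cmd() {
    device="$1"      # placeholder, e.g. /dev/sda2
    key_file="$2"    # placeholder, e.g. /tmp/passphrase.txt
    # Reuse the positional parameters as an argument list, then join them.
    set -- cryptsetup reencrypt "$device" \
        --resilience=none \
        --disable-locks \
        --force-offline-reencrypt \
        --perf-no_read_workqueue \
        --perf-no_write_workqueue \
        --key-file "$key_file"
    echo "$*"
}

# Example: show what would be run for one container.
build_reencrypt_cmd /dev/sda2 /tmp/passphrase.txt
```

Printing instead of executing keeps the sketch safe to try on a machine that holds real LUKS containers.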

Weirdly enough (and that's fine under the Heads kernel), we do not have CONFIG_AIO: no async buffer support in the kernel. Still, newer cryptsetup versions require libaio. Even weirder, we now make sure not to use async IO at all by forcing --perf-no_write_workqueue --perf-no_read_workqueue.

@UndeadDevel :

Just after your post 2.7.0-rc0 has been announced as released; the changelog is mostly about OPAL support, but the following bits also seem interesting for the issue at hand:

Support OpenSSL 3.2 Argon2 implementation.

  Argon2 is now available directly in OpenSSL, so the code no longer
  needs to use libargon implementation.
  Configure script should detect this automatically.

* Add support for Argon2 from libgcrypt
  (requires yet unreleased gcrypt 1.11).

  Argon2 has been available since version 1.10, but we need version 1.11,
  which will allow empty passwords.

* Link only libcrypto from OpenSSL.

  This reduces dependencies as other OpenSSL libraries are not needed.

Heads is a limited beast in those regards.

  • modules/openssl is neither included nor compiled unless a board has TPM2, where the TPM2 toolstack requires it. So we cannot assume libcrypto is available
  • Once again, we configure and compile cryptsetup to use the kernel crypto backend.
  • We have libgcrypt, but as said in a prior comment, we do not use it: the kernel crypto backend is better at selecting faster implementations that take advantage of CPU enhancements (SSE2, AVX, etc.) for the crypto algorithms we use.
  • We use internal Argon2 support.
  • OPAL support will be nice in the future, but I tend to stay away from pushing RC0 versions unless required. Also, libgcrypt 1.11 is not released yet, and going in that direction would mean version bumping the whole gnupg toolstack.

I will post test results soon.

@tlaurion
Collaborator Author

tlaurion commented Nov 29, 2023

And yet again, reencrypting a real installation vs reencrypting a block device formatted with cryptsetup 2.6.1 luksFormat gives a lot of variation, and requires more testing. Even with --resilience=none the log shows "Setting 'hot' segments", so I am not sure what is happening here. The doc says no checksumming, but it still seems that the region is written at the end of the LUKS container and then moved back into the currently reencrypted block instead of being rewritten in place? The behavior differs from #1541 (comment), in that there is no

# LUKS2 requirements detected:
# online-reencrypt-v2 - known
# data_offset: 16777216
# Allocating buffer for storing resilience checksums.
# Checksums hotzone resilience.
# Going to store 8388608 bytes in reencrypt keyslot.
# Reencrypt keyslot 2 store.
# Reusing open rw fd on device /dev/sda2
# LUKS2 requirements detected:
# online-reencrypt-v2 - known
# LUKS2 requirements detected:
# online-reencrypt-v2 - known
# Device size 248984371200, offset 16777216.

anymore.
That excerpt was obtained from cryptsetup reencrypt /dev/sda2 --disable-locks --force-offline-reencrypt --debug --key-file /tmp/passphrase.txt. Adding --resilience=none changed the speed from 50 MiB/s there to 105 MiB/s.
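
To put those two speeds in perspective, the duration for the device size reported in the debug log can be estimated as size divided by speed. This is a rough sketch only: `estimate_seconds` and `format_duration` are hypothetical helper names, the arithmetic is integer-only (so it takes whole MiB/s values), and it ignores the 16 MiB data offset.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Heads or cryptsetup.

# Whole seconds needed to rewrite <bytes> at <MiB/s> (integer arithmetic).
estimate_seconds() {
    bytes="$1"
    mib_per_s="$2"
    echo $(( bytes / (mib_per_s * 1048576) ))
}

# Render a number of seconds as "<minutes>m<seconds>s".
format_duration() {
    secs="$1"
    echo "$(( secs / 60 ))m$(( secs % 60 ))s"
}

# Device size from the debug log above; speeds are the two reported values.
size=248984371200
for speed in 50 105; do
    echo "$speed MiB/s: $(format_duration "$(estimate_seconds "$size" "$speed")")"
done
```

For this device, that works out to roughly 79 minutes at 50 MiB/s against roughly 38 minutes at 105 MiB/s.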

For example, in my tests right now, if I recreate container with luksFormat and then reencrypt it:

  • reencrypt with --debug --resilience=none alone: Stable 116 MiB/s

If I reencrypt LUKS container created by stage 1 Q 4.2 RC5 installer:

  • reencrypt with --debug --resilience=none alone: becoming stable at around ~105 MiB/s

Take-home message: how the LUKS container was created changes the reencryption speed.
And we need to make sure that the options we select optimize for speed vs resilience. Heads is a single-user recovery environment and should let users reencrypt their drive as part of OEM Re-Ownership as fast as possible within current constraints.

Also note that the kernel config has not been reviewed here. I just injected Kconfig "best" practices, but those are not meant for embedded environments.

@tlaurion
Collaborator Author

tlaurion commented Dec 8, 2023

Todo:

  • add --perf* tweaks passed down to dmsetup to bypass kernel queues and write in a compliant way.

  • Then retest without the dmsetup compliant way but still with perf tweaks for read/write buffer bypass.

  • remove most kernel drivers added that aren't related.

Keep in mind that reencryption tests need to be run on top of an OS-created and filled LUKS container, otherwise reencryption speeds vary and results cannot be meaningfully compared. Stay on the same SSD drive and the same platform to compare results.

@tlaurion
Collaborator Author

tlaurion commented Dec 9, 2023

Poor SSD drive, having to exhaust all its cells through wear leveling. This drive will have its lifespan reduced by a lot; I will try to get a health report later on... I will post logs related to previous commits here; the test script was made to output results for specific commits. Also, tests were made against a Q4.2 reinstall done before commit f08ea9a.

As of now, it seems that all --perf* options are irrelevant, as is RAM consumption: switching from tmpfs to ramfs (tmpfs prevents RAM from being filled above 8 GB) has no impact at all. Unless having already reencrypted the Q4.2 install once changes alignment or something, speeds are steady at 109.x MiB/s with the latest 92dcd67, which was not significantly better than 3847158:

block_reencrypt_f08ea9ad7af49dbdd218362555634018649e7349.log
block_reencrypt_fd4ac5c8a2accd5158429360b5f4da621ca4f45c.log
block_reencrypt_50b18a5e25c96caaac317e74ab98cce0bb727a5b.log
block_reencrypt_384715841bee04ea683a98ee4b54b36591f5ce9b.log
block_reencrypt_92dcd67e331194c0081c304a1dd382f54523360e.log

find ./ -name "block_reencrypt_*.log" | while IFS= read -r filename; do echo "$filename"; grep Finished "$filename"; done
./block_reencrypt_50b18a5e25c96caaac317e74ab98cce0bb727a5b.log
Finished, time 36m25s,  231 GiB written, speed 108.6 MiB/s
./block_reencrypt_fd4ac5c8a2accd5158429360b5f4da621ca4f45c.log
./block_reencrypt_all_optimiz.log
Finished, time 36m14s,  231 GiB written, speed 109.2 MiB/s
./block_reencrypt_.log
./block_reencrypt_no_lock-offline-no_resilience.log
Finished, time 34m04s,  231 GiB written, speed 116.1 MiB/s
./block_reencrypt_384715841bee04ea683a98ee4b54b36591f5ce9b.log
Finished, time 36m14s,  231 GiB written, speed 109.2 MiB/s
./block_reencrypt_f08ea9ad7af49dbdd218362555634018649e7349.log
./block_reencrypt_92dcd67e331194c0081c304a1dd382f54523360e.log
Finished, time 36m10s,  231 GiB written, speed 109.4 MiB/s
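
The per-log output above can also be reduced to one speed per log and sorted. A small sketch, assuming the `Finished, ... speed N MiB/s` summary line format shown in that output; `summarize_speeds` is a hypothetical name, and logs without a summary line (interrupted runs, for instance) are simply skipped.

```shell
#!/bin/sh
# Hypothetical helper: list "<speed> <log file>" for every log that has a
# "Finished" summary line, fastest first. Logs without one are skipped.
summarize_speeds() {
    dir="$1"
    for f in "$dir"/block_reencrypt_*.log; do
        [ -f "$f" ] || continue
        # Expected line format, as in the output above:
        # Finished, time 36m25s,  231 GiB written, speed 108.6 MiB/s
        speed=$(sed -n 's/^Finished,.* speed \([0-9.]*\) MiB\/s.*$/\1/p' "$f" | tail -n 1)
        if [ -n "$speed" ]; then
            printf '%s %s\n' "$speed" "$(basename "$f")"
        fi
    done | LC_ALL=C sort -rn
}

# Example: summarize the logs in the current directory.
summarize_speeds .
```

Sorting under `LC_ALL=C` keeps the decimal point handling independent of the local locale.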

@tlaurion
Collaborator Author

tlaurion commented Dec 9, 2023

OK... so dmsetup is used in online mode (when --force-offline-reencrypt is not specified). Will inspect the logs more...
Reinstalling q4.2 rc5 to use --force-offline-reencrypt as base

@tlaurion
Collaborator Author

tlaurion commented Dec 9, 2023

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
cryptsetup2 2.6.1 is a new release that supports reencryption of Q4.2 release
LUKS2 volumes. This is a critical feature for the Qubes OS 4.2 release.

cryptsetup 2.6.1 requires lvm2 2.03.23, which is also included in this PR.
lvm2 in turn requires libaio, which is also included in this PR.
util-linux 2.39 is also included in this PR and a dependency of lvm2.
patches for reproducible builds are included for all packages.
luks-functions is updated to support the new cryptsetup2 version calls:
 reencryption happens in direct-io, offline mode and without locking.
  From tests, this is best for performance and reliability in single-user mode

TODO:
- async (AIO) calls are not used. direct-io is used instead. libaio could be hacked out
  - this could be subject to future work
- time to deprecate legacy boards that do not have enough space for the new space requirements
  - x230-legacy, x230-legacy-flash, x230-hotp-legacy
  - t430-legacy, t430-legacy-flash, t430-hotp-legacy already deprecated

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
The x230-hotp-legacy, x230-legacy-flash, and x230-legacy boards are
officially deprecated.  They have been moved to the unmaintained_boards
directory.

CircleCI has been updated to reflect this change.

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
@tlaurion tlaurion force-pushed the cryptsetup_version_bump-reencryption_cleanup branch from 20e884d to 2ea3195 Compare April 7, 2024 16:56
@tlaurion tlaurion changed the title WiP: Cryptsetup version bump reencryption cleanup (LUKS2 reencryption speed disastrous otherwise) WiP: Cryptsetup version bump reencryption cleanup (LUKS2 reencryption impossible otherwise on Q4.2 and others) Apr 7, 2024
@tlaurion tlaurion marked this pull request as ready for review April 7, 2024 17:07
@tlaurion
Collaborator Author

tlaurion commented Apr 7, 2024

@UndeadDevel it took me time to return to this, but retesting oem-factory-reset options on top of the released Q4.2 (and now 4.2.1, still to be done) made me aware that Heads was not able to reencrypt with the previous options: it is totally unaware of the newer Argon2 used there and simply fails :/. luksChangeKey succeeds, though.
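
One way to see up front whether a container uses an Argon2 PBKDF is to look at the `PBKDF:` field of the `cryptsetup luksDump` output. The sketch below assumes that a LUKS2 dump prints key slot lines of the form `PBKDF:      argon2id`; `uses_argon2` is a hypothetical helper that reads the dump text on stdin, so it can be tried without a real device.

```shell
#!/bin/sh
# Hypothetical helper, not part of Heads or cryptsetup.
# Succeeds if the luksDump text on stdin has a key slot using argon2i/argon2id.
# Assumes dump lines of the form "PBKDF:      argon2id".
uses_argon2() {
    grep -Eq '^[[:space:]]*PBKDF:[[:space:]]*argon2(i|id)[[:space:]]*$'
}

# Intended use against a real container (placeholder device path):
#   cryptsetup luksDump /dev/sda2 | uses_argon2 && echo "Argon2 key slot found"

# Self-contained demonstration on illustrative text:
if printf '  0: luks2\n\tPBKDF:      argon2id\n' | uses_argon2; then
    echo "Argon2 key slot found"
fi
```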

So I prioritized this over the weekend, and revisited and cleaned the commit trail and the OP. I had lost track of Linux 5.10.10+ containing the encryption speed fixes from Cloudflare merged upstream. I am retesting on a Q4.2 install now, and will reapply reencryption from oem-factory-reset, where it is now the default option.

Can you review the commit message trail and the current state of this PR to see if it matches the content of past discussions? I think it does, but lending your brain wouldn't hurt here. Please ping me if you do.

@tlaurion
Collaborator Author

tlaurion commented Apr 8, 2024

Testing on real hardware is postponed: I only have my hands on a crappy Chinese SSD that writes the first 4 GB at full speed (filling the SSD buffer) and then slows down to 32Mb/s.... Will back up stuff and reuse a good SSD to test this.

If anyone beats me to it, it should all be fine, except that Talos II cannot benefit from it because we use a branched, older 5.5-power kernel with patches not upstreamed to Linux. Again, we have no case of Talos II being sold by OEMs.

The important part is that all known resellers' laptops/workstations should permit reencryption of a received shipment through re-ownership, for the reasons documented in the oem-factory-reset script itself.

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
@tlaurion tlaurion self-assigned this May 7, 2024
… card

QEMU TCG is not so good at getting exclusive access, so assigning the USB device to the testing qube needs to be done AFTER kernel modules are loaded; otherwise there is a race condition between host and QEMU.

Otherwise: error -32, which requires killing sys-usb, restarting the testing qube, and letting the first attempt (the one that loads drivers) fail before assigning the USB Security dongle, so that drivers are already loaded.

This makes testing through QEMU TCG (not KVM, which is better at getting exclusive USB device access) a little bit more usable (helps me keep my sanity in development cycles)

---

@JonathonHall-Purism I could do a separate PR for this against master if you agree.

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
Signed-off-by: Thierry Laurion <insurgo@riseup.net>
@tlaurion
Collaborator Author

tlaurion commented May 17, 2024

@UndeadDevel really sorry about the delay in fixing this. I am back on it and think I'm nearly done fixing things and proposing changes. Will push changes here soon, once tests are completed against QEMU under nix with LUKSv1/LUKSv2 containers. I was nearly done but had to context switch, and am now getting back from it (which unfortunately takes a while).

I saw #1678 and reviewed it quickly (thanks, and sorry you had to do this; we duplicated brain power). I think everything is covered by the current PR as of now, but I might have missed something. Will push code changes soon; staging is under https://github.com/linuxboot/heads/compare/master...tlaurion:cryptsetup_version_bump-reencryption_cleanup-staging2?expand=1

…h from linuxboot#1661 (less and less required but still some). Cannot remove 5.10.5 because kgpe-d16 uses it.

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
…included for builds (affects tpm2 boards builds)

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
…onsole_for_internal_hack' into cryptsetup_version_bump-reencryption_cleanup-staging2
…onsole_for_internal_hack' into cryptsetup_version_bump-reencryption_cleanup-staging2

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
Signed-off-by: Thierry Laurion <insurgo@riseup.net>
@tlaurion
Collaborator Author

tlaurion commented May 17, 2024

@UndeadDevel works on luks v1 and v2 but commit trail needs cleanup and the whole code needs my own review

Edit: works only on single luks drive setup. Not yet done.

@tlaurion tlaurion marked this pull request as draft May 17, 2024 21:38
@UndeadDevel
Contributor

UndeadDevel commented May 18, 2024

Thanks, @tlaurion; I had a quick glance (not a review) at the changes, and one thing I noticed is that the code asserts "7" to be the last LUKS1 slot, when it's actually "8" (as per the LUKS1 spec... LUKS1 key slots start at 1 and are 8 in total, while LUKS2 key slots start at 0 and are 32 in total, so 31 is the last LUKS2 key slot, as correctly noted in your updated code). Edit: never mind, I just tested this with a LUKS1 file volume created by cryptsetup 2.6.1, and at least in that configuration LUKS1 key slots start at 0, not 1. So the spec is confusing in this regard, and your code correctly identifies key slots 7 and 31 as the last slots for LUKS1 and LUKS2, respectively.
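
The key slot ranges settled on above (0 to 7 for LUKS1, 0 to 31 for LUKS2, as numbered by cryptsetup) can be captured in a tiny helper. This is an illustrative sketch, not the code under review; `last_key_slot` is a hypothetical name and the version is passed in explicitly rather than detected.

```shell
#!/bin/sh
# Hypothetical helper, not the code from this PR.
# Prints the index of the last key slot for a given LUKS version,
# using cryptsetup's zero-based slot numbering.
last_key_slot() {
    case "$1" in
        1) echo 7 ;;     # LUKS1: 8 slots, numbered 0..7
        2) echo 31 ;;    # LUKS2: 32 slots, numbered 0..31
        *) echo "unsupported LUKS version: $1" >&2; return 1 ;;
    esac
}

# Example: print the last slot index for both formats.
for version in 1 2; do
    echo "LUKS$version last key slot: $(last_key_slot "$version")"
done
```

Rejecting unknown versions explicitly avoids silently picking a slot number for a format the caller did not expect.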

Another issue is that you seem to have included one of my commits from #1646 twice, thus in effect fixing the wrong PIN (Admin PIN, not User PIN)...I did make a bit of a mess with that PR, so mea culpa. I think if you revert this commit it should be fixed properly.

@tlaurion
Collaborator Author

tlaurion commented May 18, 2024

Not yet done. It doesn't play nicely with Qubes 4.2.1 installed with BTRFS (two LUKS containers), which I quickly tested this morning: it causes regressions in handling multiple LUKS containers, prompts twice for the DRK where it should prompt once, and fails if unlocking the other containers with the DRK fails. Key slot wiping also needs to cover all LUKS devices, not just the first one encountered.

Another round of commits will happen before cleaning up.

@UndeadDevel thanks for the heads up.

Caching of DUK should happen but doesn't, so: two prompts for DRK
wiping only occurs on first LUKS

TODO fix and revert changes unneeded in this commit, context switching

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
Signed-off-by: Thierry Laurion <insurgo@riseup.net>
…e LUKS containers scenario), cleanup keyslot-> key slot everywhere

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
@tlaurion
Collaborator Author

tlaurion commented May 24, 2024

Everything functional. Cleanup tomorrow. Damn things are big sometimes.

Just simulated Q4.2.1 2xLUKSv2 containers (BTRFS deployment) under Qemu: works.

Will test on real hardware but should be the same exact thing: LUKS passphrase expected to unlock multiple LUKS devices configured for setup; if one fails user is guided to change LUKS container passphrase (which exists independently through Options menu).

All seems good.

…ify_and_save_oldconfig_in_place helper

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
@tlaurion
Collaborator Author

tlaurion commented May 24, 2024

Ok. Will try to clean up the commit trail now, since boards build (this PR bumps the Linux kernel version, which will break other UNMAINTAINED boards a little bit more; that is what happens when things are not tested promptly as needed).

@tlaurion
Collaborator Author

tlaurion commented May 24, 2024

Nope. Not today. Priorities changed once again

@tlaurion
Collaborator Author

LUKS reencryption still requires a loop over all detected LUKS containers (otherwise the user needs to do the other container from the menu) on the Q4.2.1 BTRFS alternative disk partition scheme.

Weird that cryptsetup accepts /dev/sda2 /dev/sda3 on command line but only does the first, normal I guess.

@tlaurion
Collaborator Author

tlaurion commented May 26, 2024

Also missing: when renewing the DUK, the host needs to reboot, otherwise typing the TPM DUK results in PCR unsealing errors (normal, since we just loaded USB drivers on non-HOTP boards, and those aren't part of the PCR measurements expected by the TPM DUK unseal op).

Todo:

  • add a loop in oem-factory-reset or in the LUKS reencryption function to make sure that setups with a two-LUKS-partition scheme are reencrypted/have their passphrase changed with the same passphrase
  • force a reboot after DUK renewal in the TOTP/HOTP reset/resealing path, to avoid throwing scary errors at the user on non-hotp/usb keyboard enabled configurations, which don't expect USB drivers to be loaded when unsealing the TPM DUK.
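
The first Todo item could look roughly like the loop below. It is a sketch under assumptions, not the implementation planned for this PR: `reencrypt_all` is a hypothetical name, `REENCRYPT_CMD` is an illustrative override used only to allow dry runs, and the device paths and key file are placeholders. The cryptsetup flags are the ones discussed earlier in this thread.

```shell
#!/bin/sh
# Hypothetical sketch, not the code from this PR.
# REENCRYPT_CMD is an illustrative override (e.g. "echo") to allow dry runs.

# Usage: reencrypt_all <key-file> <device>...
reencrypt_all() {
    key_file="$1"
    shift
    for dev in "$@"; do
        # One cryptsetup call per container: as noted above, passing several
        # devices to a single call only processes the first one.
        if ! ${REENCRYPT_CMD:-cryptsetup} reencrypt "$dev" \
            --resilience=none --disable-locks --force-offline-reencrypt \
            --key-file "$key_file"; then
            echo "reencryption failed for $dev, stopping" >&2
            return 1
        fi
    done
}

# Dry run: print the commands instead of touching any device.
REENCRYPT_CMD=echo
reencrypt_all /tmp/passphrase.txt /dev/sda2 /dev/sda3
```

Stopping at the first failure is a deliberate choice in this sketch: with `--resilience=none` an interrupted run is not resumable, so continuing on to the next container after an error seems riskier than reporting it.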

Development

Successfully merging this pull request may close these issues.

Choose stronger encryption by default and/or re-use encryption parameters of LUKS container
4 participants