WiP: Cryptsetup version bump reencryption cleanup (LUKS2 reencryption impossible otherwise on Q4.2 and others) #1541
base: master
@UndeadDevel: here is my log from bf2891c. Now at least we have logs to share and can follow the rabbit down the hole. Will get back to this and analyse later. Direct excerpt showing how this is suboptimal:
|
8048f06
to
4cb56d4
Compare
Of course, exploding kernel dependencies and newer non-optimized libs/binaries break compatibility with legacy boards, as seen in the CircleCI checks here. Will attack that later, but the result is the same as before on block devices, and the same as for the ram.raw disk in prior tests:
Even though
Definitely missing something here. Will sleep on it. |
Not sure what to do with the output of the block device reencryption log from 4cb56d4 |
e110960
to
302452d
Compare
@UndeadDevel had an insight about checking resilience options more closely. Under 2.7.0RC0 (unreleased: tarballs don't contain configure or other files we patch for cross-compiling, so we will need to wait anyway, and I do not really want an RC0 of something) resilience speed can be improved, but for 2.6.1 Test satisfying under 302452d TODO next:
Edit: |
Just after your post, 2.7.0-rc0 was announced as released; the changelog is mostly about OPAL support, but the following bits also seem interesting for the issue at hand:
Regarding resilience: it seems that in offline mode interrupting a |
Other notes from last commit:
Weirdly enough, and that's ok again under the Heads kernel, we do not have CONFIG_LIBAIO. No async buffer support in the kernel. Still, cryptsetup requires libaio in newer versions. Even weirder, we now make sure not to use async IO at all by forcing
Heads is a limited beast in those regards.
I will post test results soon. |
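For anyone wanting to double-check the kernel side of the AIO question, a quick grep of the board's kernel config is enough. This is only a minimal sketch: it assumes the mainline symbol name `CONFIG_AIO` (the symbol named above may differ), and `kconfig_enabled` and `report_aio` are hypothetical helper names, not something from this PR.

```shell
# kconfig_enabled FILE SYMBOL
# Hypothetical helper: succeeds if SYMBOL is set to y or m in a kernel config file.
kconfig_enabled() {
    grep -Eq "^$2=(y|m)\$" "$1"
}

# report_aio FILE
# Prints whether async I/O support is enabled in the given kernel config.
# CONFIG_AIO is assumed to be the relevant symbol (mainline name).
report_aio() {
    if kconfig_enabled "$1" CONFIG_AIO; then
        echo "AIO enabled"
    else
        echo "AIO disabled"
    fi
}
```

Usage would be `report_aio path/to/kernel.config`, with the path being whatever config file the board build points at.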
And yet again, reencrypting a real installation vs. reencrypting a block device formatted with cryptsetup 2.6.1 luksFormat gives a lot of variation and requires more testing. Just stating --resilience=none still shows "Setting hot segment", so I am not sure what is happening here. The doc says no checksumming, but it still seems that the region is written at the end of the LUKS container and then moved back into the currently reencrypted block instead of being written in place? This behavior differs from #1541 (comment), where there is no
anymore. For example, in my tests right now, if I recreate the container with luksFormat and then reencrypt it:
If I reencrypt LUKS container created by stage 1 Q 4.2 RC5 installer:
Take-home message: how the LUKS container was created changes the reencryption speed. Also note that the kernel config has not been reviewed here. I just injected Kconfig "best" practices, but not ones for embedded environments. |
Todo:
Keep in mind that reencryption tests need to be applied on top of an already created and filled LUKS container, otherwise reencryption speed tests vary and are not meaningful to compare. Stay on the same SSD drive and the same platform to compare results. |
Poor SSD drive, having to deal with this and exhaust all its cells through wear prevention. This drive will have its lifespan reduced by a lot; will try to get a health report later on. Will post logs related to previous commits here; the test script was made to output results for specific commits. Also, tests were made against a Q4.2 reinstall made before commit f08ea9a. As of now, it seems that all --perf* options are irrelevant, as is RAM consumption, since switching from tmpfs to ramfs (tmpfs prevents RAM from being filled above 8 GB) has no impact at all. Unless having reencrypted the Q4.2 install once changes alignment or something, speeds are steady at 109.x MiB/s with latest 92dcd67, which was not significantly better than 3847158: block_reencrypt_f08ea9ad7af49dbdd218362555634018649e7349.log
|
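To compare runs like these on equal terms, throughput can be derived from bytes processed and elapsed seconds instead of being read off the progress line. A minimal sketch with a hypothetical helper name (`mib_per_sec`); this is not what the test script in this PR does, just one way to normalize numbers across runs:

```shell
# mib_per_sec BYTES SECONDS
# Hypothetical helper: prints throughput in MiB/s with one decimal place,
# or "n/a" when the elapsed time is zero or negative.
mib_per_sec() {
    awk -v b="$1" -v s="$2" 'BEGIN {
        if (s <= 0) { print "n/a"; exit }
        printf "%.1f\n", b / 1048576 / s
    }'
}
```

For example, `mib_per_sec 1073741824 10` reports the rate for 1 GiB processed in 10 seconds.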
OK, so dmsetup is used in online mode (that is, when --force-offline-reencrypt is not specified). Will inspect the logs more... |
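For reference, the offline variant discussed in this thread can be sketched as a command builder that only prints the invocation, so it is safe to try anywhere. Assumptions: the option set below (`--force-offline-reencrypt`, `--resilience=none`, `--disable-locks`, `--key-file`) is taken from options mentioned in this conversation and the cryptsetup man page, not copied from the PR's luks-functions; `reencrypt_cmd` is a hypothetical name and the paths in the usage note are placeholders.

```shell
# reencrypt_cmd DEVICE KEYFILE
# Hypothetical helper: prints (does not run) a cryptsetup call that forces
# offline reencryption, skipping the active-device (dmsetup) code path.
# The device must not be unlocked or mounted when such a command is run.
reencrypt_cmd() {
    printf 'cryptsetup reencrypt --force-offline-reencrypt --resilience=none --disable-locks --key-file %s %s\n' "$2" "$1"
}
```

Calling `reencrypt_cmd /dev/sda2 /tmp/key` prints the full command line for inspection before anything touches the disk.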
9d0458b
to
63ad6f9
Compare
b982f79
to
20e884d
Compare
Signed-off-by: Thierry Laurion <insurgo@riseup.net>
cryptsetup2 2.6.1 is a new release that supports reencryption of Q4.2 release LUKS2 volumes.
This is a critical feature for the Qubes OS 4.2 release.

cryptsetup 2.6.1 requires lvm2 2.03.23, which is also included in this PR.
lvm2 in turn requires libaio, which is also included in this PR.
util-linux 2.39 is also included in this PR and is a dependency of lvm2.

Patches for reproducible builds are included for all packages.

luks-functions is updated to support the new cryptsetup2 version calls.
Reencryption happens in direct-io, offline mode and without locking.
From tests, this is best for performance and reliability in single-user mode.

TODO:
- async (AIO) calls are not used. direct-io is used instead. libaio could be hacked out
  - this could be subject to future work
- time to deprecate legacy boards that do not have enough space for the new space requirements
  - x230-legacy, x230-legacy-flash, x230-hotp-legacy
  - t430-legacy, t430-legacy-flash, t430-hotp-legacy already deprecated

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
The x230-hotp-legacy, x230-legacy-flash, and x230-legacy boards are officially deprecated. They have been moved to the unmaintained_boards directory. CircleCI has been updated to reflect this change. Signed-off-by: Thierry Laurion <insurgo@riseup.net>
20e884d
to
2ea3195
Compare
@UndeadDevel it took me time to return to this, but retesting oem-factory-reset options on top of the released Q4.2 (and now the 4.2.1 release, to be done) made me aware that Heads was not able to reencrypt with the previous options: it was totally unaware of the newer argon2 and just failed there :/. luksChangeKey succeeded, though. So I prioritized this over the weekend, and revisited and cleaned the commit trail and OP. I lost track of Linux 5.10.10+ containing the encryption speed fixes from Cloudflare merged upstream; I am retesting the Q4.2 install now and will reapply reencryption from oem-factory-reset, now the default option. Can you review the commit message trail and the current state of this PR to see if it matches the content of past discussions? I think it does, but lending your brain wouldn't hurt here. Please ping me if you do. |
Testing on real hardware is postponed: I could only get my hands on a crappy Chinese SSD that writes the first 4 GB at full speed (filling the SSD buffer) and then slows down to 32Mb/s.... Will back up stuff and reuse a good SSD to test this one. If anyone beats me to it, it should all be fine, except that Talos II cannot benefit from it because we use a branched older 5.5-power kernel with patches not upstreamed to Linux. Again, we have no case of Talos II being sold by OEMs. The important part is that all known resellers' laptops/workstations should permit reencryption on received shipment through re-ownership, for the reasons documented in the oem-factory-reset script itself. |
Signed-off-by: Thierry Laurion <insurgo@riseup.net>
… card QEMU TCG is not so good at getting exclusive access, so assigning the USB device to the testing qube needs to be done AFTER kernel modules are loaded, otherwise there is a race condition between host and QEMU. The result is error -32, requiring to kill sys-usb and restart the testing qube, and to let the first attempt (which loads the drivers) fail prior to assigning the USB Security dongle, so that drivers are loaded. Makes testing through QEMU TCG (not KVM, which is better at getting exclusive USB device access) a little bit more usable (helps me keep my sanity in development cycles) --- @JonathonHall-Purism I could do a separate PR for this against master if you agree. Signed-off-by: Thierry Laurion <insurgo@riseup.net>
@UndeadDevel really sorry about the delay in fixing this. I am back on it and I think I'm nearly done fixing things and proposing changes. Will push changes here soon, once tests are completed against QEMU under nix against LUKSv1/LUKSv2 containers. I was nearly done but had to context switch, and getting back from a context switch takes a while. I saw #1678 and reviewed it quickly (thanks, and sorry you had to do this; we duplicated brain power). I think everything is covered under the current PR as of now, but I might have missed something. Will push code changes soon, staging under https://github.com/linuxboot/heads/compare/master...tlaurion:cryptsetup_version_bump-reencryption_cleanup-staging2?expand=1 |
…h from linuxboot#1661 (less and less required but still some). Cannot remove 5.10.5 because kgpe-d16 uses it. Signed-off-by: Thierry Laurion <insurgo@riseup.net>
…included for builds (affects tpm2 boards builds) Signed-off-by: Thierry Laurion <insurgo@riseup.net>
…onsole_for_internal_hack' into cryptsetup_version_bump-reencryption_cleanup-staging2
…ion_bump-reencryption_cleanup-staging2
…ree/cryptsetup_version_bump-reencryption_cleanup-staging - LUKSv1 works Signed-off-by: Thierry Laurion <insurgo@riseup.net>
…onsole_for_internal_hack' into cryptsetup_version_bump-reencryption_cleanup-staging2 Signed-off-by: Thierry Laurion <insurgo@riseup.net>
Signed-off-by: Thierry Laurion <insurgo@riseup.net>
@UndeadDevel works on LUKS v1 and v2, but the commit trail needs cleanup and the whole code needs my own review. Edit: works only on a single LUKS drive setup. Not yet done. |
Thanks, @tlaurion ; Another issue is that you seem to have included one of my commits from #1646 twice, thus in effect fixing the wrong PIN (Admin PIN, not User PIN)...I did make a bit of a mess with that PR, so mea culpa. I think if you revert this commit it should be fixed properly. |
Not yet done. It doesn't play nicely with Qubes 4.2.1 installed with BTRFS (two LUKS containers), on which I quickly tested this morning: it causes a regression in handling multiple LUKS containers, prompting twice for the DRK when it should prompt once, and failing if unlocking the other containers with the DRK fails. Wiping the key slots also needs to cover all LUKS devices, not just the first encountered one. Another round of commits will happen before cleaning up. @UndeadDevel thanks for the heads up. |
Caching of DUK should happen but doesn't, so:
- two prompts for DRK
- wiping only occurs on first LUKS

TODO: fix and revert changes unneeded in this commit, context switching

Signed-off-by: Thierry Laurion <insurgo@riseup.net>
…ion_bump-reencryption_cleanup
Signed-off-by: Thierry Laurion <insurgo@riseup.net>
…e LUKS containers scenario), cleanup keyslot-> key slot everywhere Signed-off-by: Thierry Laurion <insurgo@riseup.net>
Everything functional. Cleanup tomorrow. Damn things are big sometimes. Just simulated Q4.2.1 2xLUKSv2 containers (BTRFS deployment) under Qemu: works. Will test on real hardware but should be the same exact thing: LUKS passphrase expected to unlock multiple LUKS devices configured for setup; if one fails user is guided to change LUKS container passphrase (which exists independently through Options menu). All seems good. |
…ify_and_save_oldconfig_in_place helper Signed-off-by: Thierry Laurion <insurgo@riseup.net>
Ok. Will try to clean up the commit trail now, since boards build (this PR bumps the Linux kernel version, which will break other UNMAINTAINED boards a little bit more; that is what happens when things are not tested promptly as needed). |
Nope. Not today. Priorities changed once again |
LUKS reencryption still requires a loop to process all detected LUKS containers (otherwise the user needs to do the other container from the menu) on the BTRFS Q4.2.1 alternative disk partition scheme. Weird that cryptsetup accepts /dev/sda2 /dev/sda3 on the command line but only does the first; normal, I guess. |
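The missing loop could look something like the following. This is a sketch only: `reencrypt_all` and the `RUN` variable are hypothetical names, the device list is assumed to come from whatever detection the caller already does, and the cryptsetup options are reduced to `--key-file` for brevity. By default it is a dry run that echoes each command.

```shell
# reencrypt_all KEYFILE DEVICE...
# Hypothetical helper: issues one "cryptsetup reencrypt" per device, since a
# single call only processes the first device given on the command line.
# RUN is unset by default, so commands are echoed (dry run); set RUN="" to
# really execute them. Stops at the first failing device.
reencrypt_all() {
    keyfile=$1
    shift
    for dev in "$@"; do
        ${RUN-echo} cryptsetup reencrypt --key-file "$keyfile" "$dev" || return 1
    done
}
```

With the two-container BTRFS layout mentioned above, `reencrypt_all /tmp/key /dev/sda2 /dev/sda3` prints one command per container; the key file path is a placeholder.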
Also missing: when renewing the DUK, the host needs to reboot, otherwise typing the TPM DUK results in PCR unsealing errors (normal, since we just loaded USB drivers on non-HOTP boards, which aren't part of the expected TPM DUK unseal op). Todo:
|
I finally got a grip on where the problem discussed under #1539 stems from:
- cryptsetup requires async/sync ops from the kernel (removed directio ops)
- kernel AIO is needed, otherwise a warning is shown (let's see later if that is verbose without being under debug mode)

Todo:
initrd/test_reencrypt_ram.sh
when ramfs raw disk reencryption meets normal speed