
Bug: lvcreate error 5 when sending #184

Closed
tasket opened this issue Apr 13, 2024 · 46 comments


tasket commented Apr 13, 2024

Wyng version 0.8beta 20240411 on Qubes 4.x

When preparing lvm snapshots for a full scan, the lvm lvcreate command exits with rc 5.

See report in forum message: https://forum.qubes-os.org/t/re-ann-wyng-incremental-backup-new-version/25801

Troubleshooting

Try adding -w debug to the wyng-util-qubes command line and then posting the error log contents: sudo less /tmp/wyng-debug/err.log. It should be possible to copy/paste the text by clicking on the clipboard widget ("Copy dom0 clipboard").
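For reference, a debug-mode run might look like the following; the VM name and destination URL here are placeholders, not part of the original report:

# Illustrative only: run a backup with Wyng in debug mode, then view the log
sudo wyng-util-qubes backup my-vm -w debug --dest=qubes-ssh://backup-vm:user@host/path/to/archive
sudo less /tmp/wyng-debug/err.log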


kennethrrosen commented Apr 13, 2024

 --+--
[['/usr/bin/qvm-run', '--no-color-stderr', '--no-color-output', '-p', 'dvm-crypt', 'set -e && export LC_ALL=C\nmkdir -p /tmp/wyngrpc']]
 --+--
[['/usr/bin/qvm-run', '--no-color-stderr', '--no-color-output', '-p', 'dvm-crypt', '/bin/mkdir -p /tmp/wyngrpc/; /bin/cat >/tmp/wyngrpc/tmpsonrgto5']]
 --+--
[['/usr/bin/qvm-run', '--no-color-stderr', '--no-color-output', '-p', 'dvm-crypt', '/usr/bin/ssh -x -o ControlPath=~/.ssh/controlsocket-%r@%h-%p -o ControlMaster=auto -o ControlPersist=60 -o ServerAliveInterval=30 -o ConnectTimeout=30 -o Compression=no MY@SERVER "$(/bin/cat /tmp/wyngrpc/tmpsonrgto5)"'], ['/bin/cat', '-v'], ['/usr/bin/tail', '--bytes=2000']]
 --+--
[['/usr/bin/qvm-run', '--no-color-stderr', '--no-color-output', '-p', 'dvm-crypt', '/bin/cat >/tmp/wyngrpc/tmpbhx0ewy_']]
 --+--
[['/usr/bin/chattr', '+c', 'Vol_d260cd']]
 --+--
[['/usr/bin/chattr', '+c', 'Vol_d260cd']]
 --+--
[['/usr/sbin/lvm', 'vgdisplay', 'qubes_dom0/vm-pool']]
  Invalid volume group name qubes_dom0/vm-pool.
  Run `vgdisplay --help' for more information.
[0, 3]
 --+--
[['/usr/sbin/lvm', 'lvs', '--units=b', '--noheadings', '--separator=:::', '--options=vg_name,lv_name,lv_attr,lv_size,lv_time,pool_lv,thin_id,tags']]
 --+--
[['/sbin/dmsetup', 'message', 'qubes_dom0-root--pool-tpool', '0', 'release_metadata_snap']]
device-mapper: message ioctl on qubes_dom0-root--pool-tpool  failed: Invalid argument
Command failed.
 --+--
[['/sbin/dmsetup', 'message', 'qubes_dom0-vm--pool-tpool', '0', 'release_metadata_snap']]
device-mapper: message ioctl on qubes_dom0-vm--pool-tpool  failed: Invalid argument
Command failed.
 --+--
[['/usr/sbin/lvm', 'lvcreate', '-kn', '-ay', '-pr', '--addtag=wyng', '--addtag=arch-bc474c57-e258-4b31-8628-ddcfbc657c47', '-s', '/dev/qubes_dom0//vm-debian-12-minimal-private', '-n', 'vm-debian-12-minimal-private.tock']]
  Logical Volume "vm-debian-12-minimal-private.tock" already exists in volume group "qubes_dom0"
[0, 5]

Re the last line: the .tock volume is neither in the new archive nor in /dev/qubes_dom0.


kennethrrosen commented Apr 13, 2024

Investigating further:

[user@dom0 ~]$ sudo vgscan --mknodes
  Found volume group "qubes_dom0" using metadata type lvm2
  The link /dev/qubes_dom0/vm-debian-12-minimal-private.tick should have been created by udev but it was not found. Falling back to direct link creation.
  The link /dev/qubes_dom0/vm-debian-12-minimal-root.tick should have been created by udev but it was not found. Falling back to direct link creation.
  The link /dev/qubes_dom0/vm-fedora-39-xfce-private.tick should have been created by udev but it was not found. Falling back to direct link creation.
  The link /dev/qubes_dom0/vm-fedora-39-xfce-root.tick should have been created by udev but it was not found. Falling back to direct link creation.
  The link /dev/qubes_dom0/vm-tpl-browser-private.tick should have been created by udev but it was not found. Falling back to direct link creation.
  The link /dev/qubes_dom0/vm-tpl-browser-root.tick should have been created by udev but it was not found. Falling back to direct link creation.
  The link /dev/qubes_dom0/vm-tpl-crypt-private.tick should have been created by udev but it was not found. Falling back to direct link creation.
  The link /dev/qubes_dom0/vm-tpl-crypt-root.tick should have been created by udev but it was not found. Falling back to direct link creation.
  The link /dev/qubes_dom0/vm-debian-12-minimal-private.tock should have been created by udev but it was not found. Falling back to direct link creation.
  The link /dev/qubes_dom0/vm-debian-12-minimal-root.tock should have been created by udev but it was not found. Falling back to direct link creation.
  Command failed with status code 5.

After pvscan, vgscan, and lvscan, sudo vgchange -ay yields 45 logical volume(s) in volume group "qubes_dom0" now active

sudo systemctl restart systemd-udevd.service
sudo udevadm control --reload
sudo udevadm trigger

Then:

ls /dev/qubes_dom0 shows the .tick and .tock files that weren't shown previously in the pool.

So I run sudo lvcreate -kn -ay -pr --addtag=wyng --addtag=arch-bc474c57-e258-4b31-8628-ddcfbc657c47 -s /dev/qubes_dom0/vm-debian-12-minimal-private -n vm-debian-12-minimal-private.tock

  Making thin LV vm-debian-12-minimal-private.tock in pool vm-pool in VG qubes_dom0 using segtype thin.
  Logical Volume "vm-debian-12-minimal-private.tock" already exists in volume group "qubes_dom0"

Then I try:

[user@dom0 ~]$ sudo lvs /dev/qubes_dom0/vm-debian-12-minima-private.tock
  Failed to find logical volume "qubes_dom0/vm-debian-12-minima-private.tock"


kennethrrosen commented Apr 13, 2024

I had to sudo lvremove /dev/qubes_dom0/vm-debian-12-minimal-private.tock. But that then leads me back to using wyng directly, which accepts --remap, whereas wyng-util-qubes doesn't recognize the flag.

Encrypted archive 'qubes-ssh://dvm-crypt:MY@SERVER/home/qubes/wyng.baks' 
Last updated 2024-04-13 17:07:49.166869 (+02:00)

Preparing snapshots in '/dev/qubes_dom0/'...
  Preparing full scan of 'vm-debian-12-minimal-private'
  Skipping vm-debian-12-minimal-root; snapshot is from a different archive. Use --remap to clear it.
  Queuing full scan of import from '/tmp/wyng-util-qubes/qmeta.tgz' as 'wyng-qubes-metadata'

Sending backup session 20240413-215058:
————————————————————————————————————————————————
Traceback (most recent call last):
  File "/usr/bin/wyng", line 4716, in <module>
    monitor_send(storage, aset, selected_vols, monitor_only=False)
  File "/usr/bin/wyng", line 3504, in monitor_send
    send_volume(storage, vol, curtime, ses_tags, send_all=datavol in send_alls)
  File "/usr/bin/wyng", line 3272, in send_volume
    if exists((fpath := sdir+"-tmp/"+f)+".tmp"):   tarf_add(fpath+".tmp", arcname=fpath)
                                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/tarfile.py", line 2180, in add
    self.addfile(tarinfo, f)
  File "/usr/lib64/python3.11/tarfile.py", line 2203, in addfile
    self.fileobj.write(buf)
  File "/usr/lib64/python3.11/tarfile.py", line 441, in write
    self.__write(s)
  File "/usr/lib64/python3.11/tarfile.py", line 449, in __write
    self.fileobj.write(self.buf[:self.bufsize])
BrokenPipeError: [Errno 32] Broken pipe
Error on volume(s): vm-debian-12-minimal-root
Exception ignored in atexit callback: <function cleanup at 0x7c4566ca0900>
Traceback (most recent call last):
  File "/usr/bin/wyng", line 4504, in cleanup
    sys.exit(2)
SystemExit: 2
Exception ignored in: <function _Stream.__del__ at 0x7c45675df880>
Traceback (most recent call last):
  File "/usr/lib64/python3.11/tarfile.py", line 415, in __del__
    self.close()
  File "/usr/lib64/python3.11/tarfile.py", line 465, in close
    self.fileobj.write(self.buf)
BrokenPipeError: [Errno 32] Broken pipe
None
None

So once a qube is backed up into an archive through wyng-util-qubes, it can't be backed up to another archive? And what if that archive is deleted?


kennethrrosen commented Apr 13, 2024

After removing metadata files from /var/lib/wyng and re-initiating the remote archive:

wyng-util-qubes v0.8beta rel 20240411
Enter passphrase: 
Wyng 0.8beta3 release 20240403
Enter passphrase: 
Encrypted archive 'qubes-ssh://dvm-crypt:MY@SERVER:/home/qubes/wyng.baks' 
Last updated 2024-04-13 22:03:54.077835 (+02:00)

Preparing snapshots in '/dev/qubes_dom0/'...
  Preparing full scan of 'vm-debian-12-minimal-private'
  Skipping vm-debian-12-minimal-root; snapshot is from a different archive. Use --remap to clear it.
  Queuing full scan of import from '/tmp/wyng-util-qubes/qmeta.tgz' as 'wyng-qubes-metadata'

Sending backup session 20240413-220427:
————————————————————————————————————————————————
    0.0MB | 11s | vm-debian-12-minimal-private
    0.1MB |  3s | wyng-qubes-metadata
————————————————————————————————————————————————
 2 volumes, 2048——>0 MB in 19.6 seconds.
Error on volume(s): vm-debian-12-minimal-root
Exception ignored in atexit callback: <function cleanup at 0x77a380dfc900>
Traceback (most recent call last):
  File "/usr/bin/wyng", line 4504, in cleanup
    sys.exit(2)
SystemExit: 2

As a very separate side note: using wyng-util-qubes, I cannot pipe the passphrase or supply it via passcmd=, as there is still a second prompt from wyng (seen above).


kennethrrosen commented Apr 13, 2024

Last post, apologies; I wanted to investigate before calling it a day. I duplicated debian-12-minimal and created a backup of it into the same archive.

Last updated 2024-04-13 22:10:54.735805 (+02:00)

Preparing snapshots in '/dev/qubes_dom0/'...
  Preparing full scan of 'vm-debian-12-test-private'
  Preparing full scan of 'vm-debian-12-test-root'
  Queuing full scan of import from '/tmp/wyng-util-qubes/qmeta.tgz' as 'wyng-qubes-metadata'

Sending backup session 20240413-222623:
—————————————————————————————————————————————
    0.0MB | 13s | vm-debian-12-test-private
  355.8MB |1m53s| vm-debian-12-test-root
    0.1MB |  7s | wyng-qubes-metadata
—————————————————————————————————————————————
 3 volumes, 22528——>355 MB in 137.0 seconds.

Perhaps this is user error, but some concerns I have that hopefully you might address:

  • Will new archives not welcome backups from older archives?
  • Will the restore function, in the absence of its namesake qube, restore the entirety of the qube? Say I have a fresh install and would like to restore a qube from a Wyng archive: will it restore entirely, or should one create a dummy qube with the same name and type (appvm, standalone) for it to inherit everything necessary for the qube to start in the way it was backed up?
  • Is there metadata that needs to be saved from dom0 for this backup/restore process to work on a new, reinstalled system?
  • And to reiterate: were one wanting to automate the backup or restore process, piping and passcmd= aren't working in wyng-util-qubes, meaning there will need to be user input at the second password prompt.

As mentioned in our other thread, I really appreciate all your work on this and look forward to the full release, but will keep working with it in the meantime and will help improve it however I can.


tasket commented Apr 14, 2024

@kennethrrosen OK, that's a lot of good feedback and investigation. Thanks!

For starters, I think Thin LVM had an internal hiccup and lost track of which volumes were available. Sometimes bringing the volgroup offline then online, or rebooting, can resolve this. If there is an avoidable cause of the LVM problem, apart from Wyng snapshots on top of Qubes snapshots adding a bit more stress to Thin LVM, the answer is almost always to increase the qubes_dom0/vm-pool metadata size with sudo lvextend --poolmetadatasize (in the past, I had to shrink the swap lv a bit to make room in the vg). 3X as large as the original default is a good choice in my experience. It's also worth noting the Metadata check and repair section in man lvmthin. To move beyond these issues, I've been installing Qubes on Btrfs, which has been a lot more stable in everyday use.
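For illustration, the metadata resize might look like the following; the +48M increment is only a placeholder and should be sized against the pool's current metadata LV (roughly 3x the original, per the above):

# Check current thin-pool metadata usage first (Meta% column)
sudo lvs -o lv_name,lv_size,metadata_percent qubes_dom0
# Grow the pool's metadata LV; the increment here is an example, not a recommendation
sudo lvextend --poolmetadatasize +48M qubes_dom0/vm-pool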

On remap: That can also be passed to Wyng as an option with -w. However, in general Wyng + LVM doesn't support multiple archives well, since LVM has such a limited namespace and no subdirs. So yes, Wyng will prefer not to back up (LVM) volumes that are tracked by another archive unless you specify --remap; it does this partly because of LVM and partly because most of Wyng's speed advantages disappear when remapping occurs. (As an aside, deleting an archive or its metadata doesn't affect this as a 'foreign' snapshot will still exist; the easy way to remove it is with --remap.) Here also, Btrfs allows for more carefree use since a volume's snapshots can be tracked in multiple archives automatically.
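For example, based on the -w forms shown elsewhere in this thread (-w debug, -w authmin=10), passing remap through the util would presumably look like the following; the VM list and destination are placeholders, and the exact -w spelling is an assumption:

sudo wyng-util-qubes backup $vms -w remap --dest=$backup_dest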

On passphrase prompts: You can pass the --passcmd to Wyng with -w as well. A second prompt can be avoided with --authmin, although as noted in the forum future versions will have a minimal default to avoid repeated prompts for a single invocation of the util.

The 'atexit callback' has been fixed in the 08wip branch.

More in next comment...


tasket commented Apr 14, 2024

@kennethrrosen

Will new archives not welcome backups from older archives?

Should be addressed above. Wyng is a snapshot manager, and a snapshot must be associated with an archive (actually a point in time in an archive). But LVM is ill-suited to juggling multiple snapshots per volume. I find that most users never encounter remapping issues because they tend to use one archive per system.

Will the restore function, in the absence of its namesake qube, restore the entirety of the qube? Say I have a fresh install and would like to restore a qube from a Wyng archive: will it restore entirely, or should one create a dummy qube with the same name and type (appvm, standalone) for it to inherit everything necessary for the qube to start in the way it was backed up?

Yes, wyng-util-qubes will create qubes (I hate saying that; confusion with the OS name) AKA VMs as necessary and restore their settings. It stores the Qubes XML metadata in the archive and retrieves it much the same way Qubes' own qvm-backup-restore does. The util passes the XML to the Qubes API, which then provides a sort of backup-object-model (description) of the VMs to the util.

In rare cases, conflicts may arise because you currently have a special Standalone VM named 'sys-net' where the archive contains a regular appVM of that same name (the classification of the existing VM can't be changed). This would require manual intervention, such as renaming the Standalone VM to something else and re-doing the restore.

Is there metadata that needs to be saved from dom0 for this backup/restore process to work on a new, reinstalled system?

For non-dom0 VMs, nothing more than a bit of care is needed. However, you need to supply/restore the templates and netVMs that those VMs rely on. IOW, it's best to restore templates and netVMs (like sys-net) first if that's necessary (OTOH, the default installed templates and/or netVMs may be all that your other VMs need to function). The util will restore templates first to help avoid issues with this. But if your ducks aren't lined up at restore time then you may need to try more than once (this is another reason to be cautious with multiple archives... you may put necessary templates or netVMs in an archive that you forgot about).

I sometimes get asked about directly restoring dom0 itself. That is a can of worms (and not directly supported), but it doesn't mean you can't back up the dom0 root with Wyng (not the util) if it's sitting in a thin LVM volume.

And to reiterate: were one wanting to automate the backup or restore process, piping and passcmd= aren't working in wyng-util-qubes, meaning there will need to be user input at the second password prompt.

FWIW, I just tried successfully with the following pipe using Wyng rel 20240411:

echo abc123 | sudo ./wyng-util-qubes --dest=qubes://storage/home/user/test.wbak list

You were using Wyng rel 20240403 and probably didn't specify --authmin, but that Wyng version defaults authmin to '0'. The util needs to run Wyng multiple times for certain ops, so it uses the passphrase at first and then the next run sees no key agent so it prompts you instead. (So... use the newer Wyng release ;) )

This also worked:

export wpass=abc123
sudo ./wyng-util-qubes --dest=qubes://storage/home/user/test.wbak -w "passcmd=echo $wpass" list


kennethrrosen commented Apr 14, 2024

Thank you, @tasket!

I've been testing backups with throwaway VMs over the last few hours and am much less afraid than I was during that initial period when the lvm pool wigged out over a dearth of metadata space. For the script I've established to run wyng-util-qubes intermittently, I'm looking at the wyng.ini file for authmin and have a var for the passphrase now.

dom0 is out of the scope of what I was hoping to do. And I haven't yet run into issues with dependencies re templates or netvms; my use case is a very minimal system that's transportable and quickly restored in case of total loss. This has all been very helpful. I think we can consider this closed, but if anything further arises I'll post it here or in the forum.


kennethrrosen commented Apr 15, 2024

On remap: That can also be passed to Wyng as an option with -w.

Considering the QubesOS default is LVM, these are the parameters I'm working under. @tasket, having now been using the tool properly for a few days, what I wondered about remapping, understanding that it threatens the inherent speed of the backups, was whether using remap often would be damaging to the snapshot or the archive itself.

In the documentation you note that it's possible to back up the backup: duplicate the archive in another location, remote or local, for redundancy's sake. Is that faster and more efficient than having wyng back up to multiple archives (one local, one remote)?

I apologize if all of this would simply be negated were I to reinstall Qubes under Btrfs; perhaps when I have a day to set aside in a month or two. Edit: also keeping an eye on this: QubesOS/qubes-issues#6476 (comment)


kennethrrosen commented Apr 22, 2024

With versions wyng-util-qubes v0.9wip rel 20240415 and Wyng 0.8wip release 20240415:

wyng-util-qubes v0.9wip rel 20240415
Wyng 0.8wip release 20240415

Traceback (most recent call last):
  File "/usr/bin/wyng", line 4702, in <module>
    aset        = get_configs(options)    ; dest = aset.dest
                  ^^^^^^^^^^^^^^^^^^^^
  File "/usr/bin/wyng", line 1994, in get_configs
    aset = get_configs_remote(dest, cachedir, opts)    ; os.utime(aset.path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/bin/wyng", line 2066, in get_configs_remote
    return ArchiveSet(arch_dir, dest, opts, children=2, allvols=True, prior_auth=aset)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/bin/wyng", line 221, in __init__
    ff = fetch_file_blobs([(sp+"/manifest.z", self.vols[dv].sessions[s].manifest_max())
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/bin/wyng", line 2606, in fetch_file_blobs
    raise BufferError("Bad file size "+str(untrusted_size))
BufferError: Bad file size 104


tasket commented Apr 23, 2024

@kennethrrosen I've never encountered this error being triggered before. It's likely that something is mangling the data.

Just to be sure, try using the updated Wyng 08wip version I just posted. It will provide more explicit details.

@kennethrrosen

wyng-util-qubes v0.8beta rel 20240411
Wyng 0.8wip release 20240421

Traceback (most recent call last):
  File "/usr/bin/wyng", line 4703, in <module>
    aset        = get_configs(options)    ; dest = aset.dest
                  ^^^^^^^^^^^^^^^^^^^^
  File "/usr/bin/wyng", line 1994, in get_configs
    aset = get_configs_remote(dest, cachedir, opts)    ; os.utime(aset.path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/bin/wyng", line 2066, in get_configs_remote
    return ArchiveSet(arch_dir, dest, opts, children=2, allvols=True, prior_auth=aset)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/bin/wyng", line 221, in __init__
    ff = fetch_file_blobs([(sp+"/manifest.z", self.vols[dv].sessions[s].manifest_max())
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/bin/wyng", line 2606, in fetch_file_blobs
    raise BufferError(f"Bad file size {untrusted_size}, expected {fsz} max.\n"
BufferError: Bad file size 104, expected 72 max.
/var/lib/wyng/a_7d25dffa63f41e4e8799f7ebe45c0479b38a83aa/Vol_475bad/S_20240413-220427/manifest.z - False


tasket commented Apr 23, 2024

@kennethrrosen

  1. Did you do any pruning of that archive recently?
  2. What is the output for 'ls -l /var/lib/wyng/a_7d25dffa63f41e4e8799f7ebe45c0479b38a83aa/Vol_475bad/S_20240413-220427'
  3. If a 'manifest' (without '.z') exists in that dir, could you post the content?


tasket commented Apr 23, 2024

@kennethrrosen No need to answer the questions in the last comment. I can see a code path that would produce the error when a metadata file is especially small and also uncompressible. The size check doesn't have enough margin to allow for that.

I've added the necessary margin to avoid that error condition; see today's update to 08wip branch.


tasket commented Apr 23, 2024

Also, I was just able to reproduce the error (without the modification) by adding a very small 'volume' to the archive, then deleting the local metadata and then doing a verify.

@kennethrrosen

Thank you. wip seems to work now, and I am getting these alerts before the snapshots are sent:

Removed mis-matched snapshot for 'vm-fedora-39-xfce-private'
  Queuing full scan of 'vm-fedora-39-xfce-private'
  Removed mis-matched snapshot for 'vm-fedora-39-xfce-root'
  Queuing full scan of 'vm-fedora-39-xfce-root'

Was it something I may have done in the remote directory? Though, really, I've done nothing since the initial issue was resolved and have had the backups running via a bi-weekly script.


tasket commented Apr 23, 2024

This could be triggered by deleting /var/lib/wyng metadata (although it shouldn't persist if you stop deleting it). Is it happening for all volumes?

I was experiencing a similar issue with LVM volumes recently (without deleting metadata); IIRC I applied a fix but I'll have to revisit that code to see what is going on. I would like to reach a point where full scans aren't usually necessary even when /var/lib/wyng is deleted.

@kennethrrosen

I have not touched the metadata since it began backing up without issue; though it seems like it has selectively skipped the -root but not the -private volume for one of the VMs.


tasket commented Apr 23, 2024

It will always skip -root for appVMs, since those only borrow their root volumes. You would expect to see -root vols for template VMs.

tasket added a commit that referenced this issue Apr 25, 2024
Test volname uniqueness

Add prog version to json output

Update Readme

kennethrrosen commented Apr 26, 2024

The following occurs at the end of a backup. One Standalone in particular is now not backing up; it seems the process just hangs at the private volume backup.

Volume 'vm-XXX-private': Process Process-1:
Traceback (most recent call last):
  File "/usr/bin/wyng", line 4772, in <module>
    sid = monitor_send(storage, aset, vols or selected_vols, monitor_only=False, use_sesid=sid)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/bin/wyng", line 3554, in monitor_send
    send_volume(storage, vol, curtime, ses_tags, send_all=datavol in send_alls)
  File "/usr/bin/wyng", line 3300, in send_volume
    send_chunk(tar_info, etag, buf)
  File "/usr/bin/wyng", line 3109, in send_chunk_remote
    tarf_addfile(tarinfo=tar_info, fileobj=fileobj)
  File "/usr/lib64/python3.11/tarfile.py", line 2208, in addfile
    copyfileobj(fileobj, self.fileobj, tarinfo.size, bufsize=bufsize)
  File "/usr/lib64/python3.11/tarfile.py", line 255, in copyfileobj
    dst.write(buf)
  File "/usr/lib64/python3.11/tarfile.py", line 441, in write
    self.__write(s)
  File "/usr/lib64/python3.11/tarfile.py", line 449, in __write
    self.fileobj.write(self.buf[:self.bufsize])
BrokenPipeError: [Errno 32] Broken pipe
Exception ignored in: <function _Stream.__del__ at 0x7cb843897880>
Traceback (most recent call last):
  File "/usr/lib64/python3.11/tarfile.py", line 415, in __del__
    self.close()
  File "/usr/lib64/python3.11/tarfile.py", line 465, in close
    self.fileobj.write(self.buf)
BrokenPipeError: [Errno 32] Broken pipe
None
None
wyng-util-qubes v0.8beta rel 20240411


tasket commented Apr 26, 2024

@kennethrrosen Probably due to a communication error with remote? Unless the URL is of the file: or qubes: type.

Is the standalone backup attempt initial/full, or incremental?

You can run the util with -w debug to put Wyng into debug mode, then retrieve /tmp/wyng-debug/err.log in dom0; in the helper VM there should also be a /tmp/wyng-rpc directory with a subdir that contains a log file.


tasket commented Apr 26, 2024

PS - If the dest URL is one of the 'ssh' types, then the wyng-rpc log will be on the remote system, not the VM.

@kennethrrosen

@tasket it does seem to have been a network-related issue, as when I switched to a LAN connection it eventually finished; this was a full dedupe and the private volume had changed by 20010.5MB, so it was rather significant.

Separately, but something unaddressed from earlier: is it not advisable to --remap for every sync? For instance, when I'm traveling, with poor internet, it doesn't make sense to backup remotely, so I'll backup to an offline VM. When I reconnect to a stable connection, that VM then syncs to the remote server. But if I have two separate archives (one on the VM and one on the remote client) I'd have to remap each time. Is there a way to navigate this as yet?


tasket commented Apr 26, 2024

FWIW, there is no risk to using --remap, despite defeating Wyng's original reason for being. Archive integrity will stay the same with or without remap.

As for retaining a high degree of efficiency...

I'm assuming your two archives have somewhat different VM selections because of local space or other issues? If not, and the two have essentially the same VMs, you could do an rsync -aH --delete if the backup VM has access. If the VM has no access and it must stay that way, then by definition it requires some custom solution or a special mode in Wyng (which it currently doesn't have).

One thing I've done to keep my dom0 home & config backed up is a script that sends /home and /etc to an offline VM, using tar --keep-newer-files --index-file in the VM. Tar only overwrites what has changed and it logs it in the index file; I then compare the index with local find output and delete the difference. This could be modified to your purpose by starting the process with another find listing that gets fed into tar --files-from so that only changes are sent; it may be the best short-term option if you don't want to wait for remapping.

Here is the current script that would need adaptation:

#!/bin/sh
set -e

# Keep a copy of the Qubes VM settings alongside the other dom0 files
sudo mkdir -p /etc/backup-misc
sudo cp -u /var/lib/qubes/qubes.xml /etc/backup-misc

# Stream /etc and /home into the backup VM; tar there only overwrites older files
# and logs the paths it processes to /tmp/tar_index
sudo tar -cf - /etc /home | qvm-run -u root -p root-backup 'mkdir -p /home/user/backup && cd /home/user/backup && tar --warning=no-ignore-newer --keep-newer-files --exclude=*/.cache/* --index-file=/tmp/tar_index -xvf -'
# List every file currently in the backup tree, then delete the ones
# that no longer appear in the tar index (i.e. removed locally)
qvm-run -u root -p root-backup "find /home/user/backup -type f -printf '%P\n' | sort >/tmp/l-find"
qvm-run -p root-backup "grep -v '/\$' /tmp/tar_index |sort |comm -23 /tmp/l-find - >l-delete"
qvm-run -u root -p root-backup 'cd /home/user/backup && cat ../l-delete | xargs -r rm'

qvm-shutdown root-backup
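For illustration only, the --files-from adaptation mentioned above might look roughly like this; the timestamp file /etc/backup-misc/last-run and the reuse of the root-backup VM are assumptions, not part of the original script:

# Hypothetical sketch: send only files changed since the previous run
# (touch /etc/backup-misc/last-run once before the first run)
sudo find /etc /home -type f -newer /etc/backup-misc/last-run > /tmp/changed-files
sudo tar -cf - --files-from=/tmp/changed-files | qvm-run -u root -p root-backup 'cd /home/user/backup && tar --index-file=/tmp/tar_index -xvf -'
sudo touch /etc/backup-misc/last-run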

Another idea is to simply do qvm-clone on VMs you wish to backup, since you are using the local storage anyway. There would be 'work', 'work01', 'work02', etc. VMs. Later backing them up with wyng-util-qubes would differ based on whether or not you want to preserve the history in the clones.

A related idea is to use Qubes' revisions_to_keep setting for VM volumes. I don't know how high this setting can go ('2' is the highest I've seen) but it might form the basis for easily replaying a VM through a history of changes while backing up each progression.
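For reference, that per-volume setting is normally adjusted with Qubes' qvm-volume tool; an assumed minimal example, with the VM and value as placeholders:

qvm-volume config work:private revisions_to_keep 2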

The latter two ideas of course put a strain on LVM metadata resources, and they don't provide compression. (I should probably repeat that this conundrum essentially goes away on Btrfs, where Wyng is simply keeping separate snapshots for each archive.)


kennethrrosen commented May 9, 2024

@tasket When I recently attempted to restore a single VM (no session) I received the prompt "VMs selected [VMNAME]" and then wyng exited. (I'll respond to your previous comment once I've sorted the restore.) As a separate piece of feedback: whenever I stop wyng I can't run it again without restarting dom0, because it says wyng is already running; rm -rf /var/lock/wyng is the only (inelegant) way I've found to clear it.

Here is the error log output:

[user@dom0 ~]$ sudo cat /tmp/wyng-debug/err.log
 --+--
[['/usr/bin/qvm-run', '--no-color-stderr', '--no-color-output', '-p', 'dev', 'set -e && export LC_ALL=C\nmkdir -p /tmp/wyngrpc']]
 --+--
[['/usr/bin/qvm-run', '--no-color-stderr', '--no-color-output', '-p', 'dev', '/usr/bin/mkdir -p /tmp/wyngrpc/; /usr/bin/cat >/tmp/wyngrpc/tmp_n8xu_2f']]
 --+--
[['/usr/bin/qvm-run', '--no-color-stderr', '--no-color-output', '-p', 'dev', '/usr/bin/ssh -x -o ControlPath=~/.ssh/controlsocket-%r@%h-%p -o ControlMaster=auto -o ControlPersist=60 -o ServerAliveInterval=30 -o ConnectTimeout=30 -o Compression=no root@XXXXXXXXXXX "$(/usr/bin/cat /tmp/wyngrpc/tmp_n8xu_2f)"'], ['/usr/bin/cat', '-v'], ['/usr/bin/tail', '--bytes=2000']]
 --+--
[['/usr/bin/qvm-run', '--no-color-stderr', '--no-color-output', '-p', 'dev', '/usr/bin/cat >/tmp/wyngrpc/tmpm9h5lp2k']]
 --+--
[['/usr/bin/chattr', '+c', 'Vol_7d805f', 'Vol_09506c', 'Vol_475bad', 'Vol_125b8f', 'Vol_d0fc3b', 'Vol_0af321', 'Vol_f22aea', 'Vol_128fc6', 'Vol_48f4af', 'Vol_26e350', 'Vol_7c8658', 'Vol_dfd944', 'Vol_31788a', 'Vol_996fda', 'Vol_289eec', 'Vol_78f57f', 'Vol_016952', 'Vol_633c73', 'Vol_355fc6', 'Vol_dc7862', 'Vol_f7a861', 'Vol_2e67de', 'Vol_5c7d94', 'Vol_eb4baf', 'Vol_f9e2a0']]
 --+--
[['/usr/bin/chattr', '+c', 'Vol_7d805f', 'Vol_09506c', 'Vol_475bad', 'Vol_125b8f', 'Vol_d0fc3b', 'Vol_0af321', 'Vol_f22aea', 'Vol_128fc6', 'Vol_48f4af', 'Vol_26e350', 'Vol_7c8658', 'Vol_dfd944', 'Vol_31788a', 'Vol_996fda', 'Vol_289eec', 'Vol_78f57f', 'Vol_016952', 'Vol_633c73', 'Vol_355fc6', 'Vol_dc7862', 'Vol_f7a861', 'Vol_2e67de', 'Vol_5c7d94', 'Vol_eb4baf', 'Vol_f9e2a0']]


tasket commented May 9, 2024

When I recently attempted to restore a single VM (no session) I receive the prompt "VMs selected [VMNAME]" and then wyng exits.

@kennethrrosen What does list [VMNAME] show?


tasket commented May 9, 2024

@kennethrrosen There may be a bug in the way the util is filtering root vs private volumes for each VM. What is the VM type, and if it's a template or standalone, what is the session overlap between the two lists (assuming you're using LVM storage):

wyng list vm-[VMNAME]-private vm-[VMNAME]-root

@kennethrrosen

@tasket

@kennethrrosen What does list [VMNAME] show?

240415-025508  240426-180256  240430-102422  240506-124953
240415-074324  240427-120102  240501-120047  240507-120121
240415-121813  240428-120116  240505-120059  240507-145213
240416-173017  240429-120113  240506-120109  240508-055547

@kennethrrosen There may be a bug in the way the util is filtering root vs private volumes for each VM. What is the VM type, and if it's a template or standalone, what is the session overlap between the two lists (assuming you're using LVM storage):

This VM (I have not tried in this session to restore any others) is a disposable template AppVM.

Last updated 2024-05-08 09:09:36.231483 (+02:00)
Volume 'vm-crypt-root' not configured; Skipping.

Sessions for volume 'vm-crypt-private':

20240415-025508  20240417-120032  20240426-120309  20240429-120113
20240415-074324  20240423-190330  20240426-180256  20240430-102422
20240415-121813  20240425-163737  20240427-120102
20240416-173017  20240426-081940  20240428-120116

20240501-120047  20240505-120059  20240507-120121
20240503-151729  20240506-120109  20240507-145213
20240503-193235  20240506-124953  20240508-055547


Sessions for volume 'vm-crypt-root':

Traceback (most recent call last):
  File "/usr/bin/wyng", line 4834, in <module>
    show_list(aset, options.volumes)
  File "/usr/bin/wyng", line 4445, in show_list
    if not aset.vols[dv].sessions:   print("None.")    ; continue
           ~~~~~~~~~^^^^
KeyError: 'vm-crypt-root'


kennethrrosen commented May 10, 2024

@tasket if I do a fresh install (the reason for my testing the backups beforehand) and repartition with btrfs, will the wyng archive, or directories in dom0, need modification? I read in the forum that one would need to create a subvolume, like so:

qvm-shutdown --all --wait --force
sudo mv /var/lib/qubes /var/lib/qubes-old
sudo btrfs subvolume create /var/lib/qubes
shopt -s dotglob
sudo mv /var/lib/qubes-old/* /var/lib/qubes
sudo rmdir /var/lib/qubes-old


tasket commented May 10, 2024

@kennethrrosen I need to do more testing with disp templates; I've only ever backed these up once or twice. You can force the util to include them with the --include-disposable option.

I read in the forum that one would need to create a subvolume, like so

Yes, I need to put this in the Readme. The --local spec has to point directly to a subvolume (even if it's the primary/default one). The reason is that subvol transaction activity must be checked just before and after CoW metadata is acquired.

Also, once you have restored to Btrfs, the volume names will no longer match the archive vol names. The util does not yet offer to rename the volumes (something I'm working on), so your best approach for resuming backup procedures in the same archive is to turn on deduplication; this will avoid re-sending all the current data to the archive. Of course, you could manually rename the volumes instead.

@kennethrrosen

@tasket below are the commands I use in a script on my lvm machine, for backup and restore respectively:

sudo wyng-util-qubes backup $vms -u -d -w passcmd="echo $pass" -w authmin=10 --dest=$backup_dest
sudo wyng-util-qubes restore $vms -u -w passcmd="echo $pass" -w authmin=10 --session=$session_id --dest=$backup_dest

How would I modify these, assuming I'm first backing up from the lvm machine, and then restoring to a btrfs machine (assuming also that I've already completed the subvolume change)?


tasket commented May 10, 2024

@kennethrrosen The thing I'd recommend for restore is not to use --session along with a list of vm names, as --session will currently use a list of vol names based on the session's contents (specifying both will yield an error); leave out --session and let it choose the latest for each VM in your list. This is the only change I'd make.
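Concretely, that would mean running the restore line from the previous comment without --session, i.e. something like:

sudo wyng-util-qubes restore $vms -u -w passcmd="echo $pass" -w authmin=10 --dest=$backup_dest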

Your backup routine already uses deduplication, so that's good. In a couple of days I should have the re-naming code done and tested, in which case dedup won't be necessary.

Also, I didn't mean to imply you should be using --local option yourself... this is now automatically handled by the util. But the reason for requiring a subvol relates to --local.

IIRC, when you use the Qubes Btrfs installer default it makes everything one big filesystem and the Qubes pool it creates will be the system default where newly created (restored) VMs will go. After you finish installing, look at the disk widget in the systray: you should see one entry for the kernels and one called 'varlibqubes' with most of the disk space you allocated. The util restores to the right pool automatically (i.e. the system default pool) unless you have a custom partition scheme, in which case you might have more than one Qubes pool (in my case I had to use qvm-pool to setup a pool for my VMs and then make it the system default; this is because I installed with multiple partitions/filesystems).


kennethrrosen commented May 10, 2024

@tasket I will make that change to the restore command. Thanks! Is --include-disposable strictly necessary if I am already explicitly backing up that disposable template VM? And would --include-disposable be necessary for restoration? (Edit: Later use of --include-disposable showed no change.)

The util restores to the right pool automatically

Then creating a subvolume isn't necessary if the util will manage it independently?

If not, and the two have essentially the same VMs, you could do an rsync -aH --delete if the backup VM has access.

This might be the best solution, as my script already rsyncs from other vms to the local offline backup, and I likewise tar dom0's /home /etc and /srv directories. I very much appreciate you providing your script and also the continued assistance.

I'll await the re-naming code update before once more testing the restore of the disposable template VM and migrating to btrfs.


tasket commented May 12, 2024

@kennethrrosen That --include-disposable was a stopgap. I'm creating a separate issue for supporting this VM type and the option can be deprecated.

Also – I think I just re-created the restore bug you were experiencing: The restore simply stops without reporting any details and exits silently with return code 2. This is a bug in Wyng not handling the --save-to option correctly (it thinks local storage is offline when it should be ignored in this case). This should be fixed tomorrow.

Then creating a subvolume isn't necessary if the util will manage it independently?

That would be sort of like trying to re-create an LVM volgroup from a backup. So the answer is 'No'. However, I could see including a manually-invoked option for this somewhere. The Btrfs subvol is not a part of Qubes' concept of 'reflink storage pool' (although it might be someday) so having a Qubes pool defined doesn't bestow any subvols on us. I might in future allow operation without a specific subvol, if some people want the option of flying without it... however there are many situations where that would cause Wyng to stop and say "I can't process this fs metadata".

I'll await the re-naming code update before once more testing the restore of the disposable template VM and migrating to btrfs.

I'll try to have it all better-tested tomorrow and hopefully ready for a new beta release mid-week.


tasket commented May 17, 2024

These issues all seem to be addressed so I'm closing this one out. Thanks for the feedback!

@kennethrrosen

@tasket I've just got around to testing, and I get the following error before wyng exits. zstd is installed in dom0.

Wyng 0.8wip release 20240516

Traceback (most recent call last):
  File "/usr/bin/wyng", line 4812, in <module>
    aset        = get_configs(options)    ; dest = aset.dest
                  ^^^^^^^^^^^^^^^^^^^^
  File "/usr/bin/wyng", line 2031, in get_configs
    aset = get_configs_remote(dest, cachedir, opts)    ; os.utime(aset.path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/bin/wyng", line 2050, in get_configs_remote
    aset = ArchiveSet(tmpdir, dest, opts, children=0, pass_agent=int(opts.authmin))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/bin/wyng", line 179, in __init__
    self.compress    = compressors[self.compression][2]
                       ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
KeyError: 'zstd'

wyng-util-qubes v0.9beta rel 20240515

@kennethrrosen

The same error appears when using the main branches, too:

Wyng 0.8beta release 20240515

Traceback (most recent call last):
  File "/usr/bin/wyng", line 4812, in <module>
    aset        = get_configs(options)    ; dest = aset.dest
                  ^^^^^^^^^^^^^^^^^^^^
  File "/usr/bin/wyng", line 2031, in get_configs
    aset = get_configs_remote(dest, cachedir, opts)    ; os.utime(aset.path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/bin/wyng", line 2050, in get_configs_remote
    aset = ArchiveSet(tmpdir, dest, opts, children=0, pass_agent=int(opts.authmin))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/bin/wyng", line 179, in __init__
    self.compress    = compressors[self.compression][2]
                       ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
KeyError: 'zstd'

wyng-util-qubes v0.9beta rel 20240515
[user@dom0 bin]$ 


tasket commented May 20, 2024

@kennethrrosen The package to install in dom0 is 'python3-zstd' like so:

sudo qubes-dom0-update python3-zstd

@tasket tasket added bug Something isn't working question Further information is requested labels May 20, 2024
@tasket tasket added this to the v0.8 milestone May 20, 2024
@tasket tasket closed this as completed May 21, 2024
@kennethrrosen

@tasket, encountered on restore:

wyng-util-qubes v0.9beta rel 20240515

VMs matched in session 20240521-101854:
 comms

Restoring VM data volumes:
Wyng 0.8beta release 20240515
Encrypted archive 'qubes-ssh://dvm-crypt:root@XXXXXXXXX:/home/qubes/wyng.baks' 
Last updated 2024-05-21 10:27:45.833467 (-04:00)

Receiving volume 'vm-comms-private' 20240521-101854
Saving to tlvm pool '/dev/qubes_dom0/vm-comms-private'
UqWn0JFDJrlNjCmffyAj2n4t4gwk072xiMac5Qi3rSo= x0000000089de0000 S_20240415-025508

Traceback (most recent call last):
  File "/usr/bin/wyng", line 4916, in <module>
    count = receive_volume(storage, aset.vols[dv], select_ses=options.session or "",
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/bin/wyng", line 4260, in receive_volume
    raise BufferError("Got %d bytes, expected %d" % (len(untrusted_buf), size))
BufferError: Got 65529 bytes, expected 131108
Traceback (most recent call last):
  File "/usr/bin/wyng-util-qubes", line 385, in <module>
    raise subprocess.CalledProcessError(p.returncode, p.stderr)
subprocess.CalledProcessError: Command 'None' returned non-zero exit status 1.


tasket commented May 22, 2024

@kennethrrosen This is the immediate error:

BufferError: Got 65529 bytes, expected 131108

You could check to see if this data chunk really is 131108 bytes on disk. In order to find the path, just wyng list --verbose to see the volume's ID; that will be the volume's directory in the archive. Then do a find wyng.baks/Vol_xxxxx -name x0000000089de0000 -printf '%p %s\n' to see the size of that chunk.
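Collected as a sketch (the destination URL is a placeholder, and the find is run on the destination system holding the archive):

sudo wyng list --verbose --dest=qubes-ssh://dvm-crypt:user@host/home/qubes/wyng.baks
find wyng.baks/Vol_xxxxx -name x0000000089de0000 -printf '%p %s\n'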

If you run the restore with -w debug it will leave the error logs in place. Check /tmp/wyng-debug/err.log in dom0 and on the dest system /tmp/wyng-rpc/xxxxx/receive.log.


tasket commented May 22, 2024

@kennethrrosen I've updated the guidance for safely making duplicate archives. See https://github.com/tasket/wyng-backup#tips--caveats

@kennethrrosen

@tasket I haven't made a duplicate archive in this case. It is still the lvm system pulling from the archive I initialized at the outset. Great to see the duplicate archive tips section; I will post the error log shortly.

@kennethrrosen

@tasket

You could check to see if this data chunk really is 131108 bytes on disk.

The data chunk is indeed this size. Here is the output from sudo less /tmp/wyng-debug/err.log:

 --+--
[['/usr/bin/qvm-run', '--no-color-stderr', '--no-color-output', '-p', 'dvm-crypt', 'set -e\nmkdir -p /tmp/wyngrpc']]
 --+--
[['/usr/bin/qvm-run', '--no-color-stderr', '--no-color-output', '-p', 'dvm-crypt', '/usr/bin/mkdir -p /tmp/wyngrpc/; /usr/bin/cat >/tmp/wyngrpc/tmp8kzp3wrq']]
 --+--
[['/usr/bin/qvm-run', '--no-color-stderr', '--no-color-output', '-p', 'dvm-crypt', '/usr/bin/ssh -x -o ControlPath=~/.ssh/ctrl-%C -o ControlMaster=auto -o ControlPersist=120 -o ServerAliveInterval=60 -o ConnectTimeout=60 -o Compression=no ssh://root@XXXXXXXXXX: "$(/usr/bin/cat /tmp/wyngrpc/tmp8kzp3wrq)"'], ['/usr/bin/cat', '-v'], ['/usr/bin/tail', '--bytes=2000']]
 --+--
[['/usr/bin/qvm-run', '--no-color-stderr', '--no-color-output', '-p', 'dvm-crypt', '/usr/bin/cat >/tmp/wyngrpc/tmp1c_9r2nk']]
 --+--
[['/usr/bin/chattr', '+c', 'Vol_7d805f', 'Vol_09506c', 'Vol_475bad', 'Vol_125b8f', 'Vol_d0fc3b', 'Vol_0af321', 'Vol_f22aea', 'Vol_128fc6', 'Vol_48f4af', 'Vol_26e350', 'Vol_7c8658', 'Vol_dfd944', 'Vol_31788a', 'Vol_996fda', 'Vol_289eec', 'Vol_78f57f', 'Vol_016952', 'Vol_633c73', 'Vol_355fc6', 'Vol_dc7862', 'Vol_f7a861', 'Vol_2e67de', 'Vol_5c7d94', 'Vol_eb4baf', 'Vol_f9e2a0']]
 --+--
[['/usr/bin/chattr', '+c', 'Vol_7d805f', 'Vol_09506c', 'Vol_475bad', 'Vol_125b8f', 'Vol_d0fc3b', 'Vol_0af321', 'Vol_f22aea', 'Vol_128fc6', 'Vol_48f4af', 'Vol_26e350', 'Vol_7c8658', 'Vol_dfd944', 'Vol_31788a', 'Vol_996fda', 'Vol_289eec', 'Vol_78f57f', 'Vol_016952', 'Vol_633c73', 'Vol_355fc6', 'Vol_dc7862', 'Vol_f7a861', 'Vol_2e67de', 'Vol_5c7d94', 'Vol_eb4baf', 'Vol_f9e2a0']]
 --+--
[['/usr/sbin/lvm', 'vgdisplay', 'qubes_dom0/vm-pool']]
  Invalid volume group name qubes_dom0/vm-pool.
  Run `vgdisplay --help' for more information.
[0, 3]
 --+--
[['/usr/sbin/lvm', 'lvs', '--units=b', '--noheadings', '--separator=:::', '--options=vg_name,lv_name,lv_attr,lv_size,lv_time,pool_lv,thin_id,tags']]
 --+--
[['/sbin/dmsetup', 'message', 'qubes_dom0-vm--pool-tpool', '0', 'release_metadata_snap']]
device-mapper: message ioctl on qubes_dom0-vm--pool-tpool  failed: Invalid argument
Command failed.
 --+--
[['/sbin/dmsetup', 'message', 'qubes_dom0-root--pool-tpool', '0', 'release_metadata_snap']]
device-mapper: message ioctl on qubes_dom0-root--pool-tpool  failed: Invalid argument
Command failed.
 --+--
[['/usr/bin/xargs', '/usr/bin/sha256sum']]
 --+--
[['/usr/bin/qvm-run', '--no-color-stderr', '--no-color-output', '-p', 'dvm-crypt', '/usr/bin/cat >/tmp/wyngrpc/tmp624fuaqz']]
 --+--
[['/usr/bin/qvm-run', '--no-color-stderr', '--no-color-output', '-p', 'dvm-crypt', '/usr/bin/ssh -x -o ControlPath=~/.ssh/ctrl-%C -o ControlMaster=auto -o ControlPersist=120 -o ServerAliveInterval=60 -o ConnectTimeout=60 -o Compression=no ssh://root@XXXXXXXXXXX: "$(/usr/bin/cat /tmp/wyngrpc/tmp624fuaqz)"']]
 --+--
[['/usr/bin/cmp', '/tmp/wyng9kx05h4e/compare-hashes.local', '/tmp/wyng9kx05h4e/compare-hashes.dest']]
 --+--
[['/usr/bin/qvm-run', '--no-color-stderr', '--no-color-output', '-p', 'dvm-crypt', '/usr/bin/cat >/tmp/wyngrpc/tmp99ehxkgw']]
 --+--
[['/usr/bin/chattr', '+c', '/var/lib/wyng/a_7d25dffa63f41e4e8799f7ebe45c0479b38a83aa/Vol_48f4af/S_20240415-025508']]
 --+--
:

tasket added a commit that referenced this issue May 23, 2024

tasket commented May 23, 2024

@kennethrrosen Nothing to go on in that log. There could be something in the receive.log that's located on the remote system under /tmp/wyng-rpc/xxxx where xxxx is the latest tmp subdir (random). Running with -w debug should leave it in place.

The circumstances, receiving a partial message before terminating, suggest that something in the VM or remote environment could be putting ssh into a 'pty' mode which reacts to escape sequences (thus interpreting part of your data as an escape and closing the pipe). I pushed a test version in the debug branch with ssh options -T -e none added that should prevent pty mode, if that is the case. (I'm going to assume for now your dom0 env isn't altered so that qvm-run is filtering escape chars / using pty mode for pipes, as I've never seen this in 10+ years of using Qubes).

An avenue for further troubleshooting would be to make a copy of that chunk, then replace it with a uniform (say, all \x01 or all 'A's) chunk of exactly the same size. In that case, the size error probably wouldn't be triggered and you would see an invalid hash error instead.
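A minimal sketch of that test, assuming it is run inside the volume's session directory on the destination, using the chunk name from the traceback above:

# Keep the original chunk, then substitute a same-size run of 'A' bytes
cp x0000000089de0000 x0000000089de0000.orig
size=$(stat -c%s x0000000089de0000.orig)
head -c "$size" /dev/zero | tr '\0' 'A' > x0000000089de0000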


tasket commented May 23, 2024

@kennethrrosen I forgot to mention there is another way to test and confirm/eliminate a qvm-run or ssh connection issue, which is to run Wyng on the remote system itself (pointing to a 'file:/' URL) with a command like sudo wyng verify volume-name --dest=file:/home/qubes/wyng.baks, which has no requirement to target --local or save any data, so it's easy to do.


tasket commented May 23, 2024

@kennethrrosen I sent you an email from my protonmail account.

tasket added a commit that referenced this issue May 23, 2024
Fix arch_check call with vols