Suspend broken on Gazelle 16 because of Nvidia GPU #358

Open
JRDetwiler opened this issue Aug 27, 2022 · 4 comments

@JRDetwiler

Distribution (run cat /etc/os-release): Arch Linux (kernel 5.19.4 / Nvidia driver 515.65.01)

Related Application and/or Package Version (run apt policy $PACKAGE NAME): Related to using xfce4-session-logout --suspend with the Whisker Menu.

Issue/Bug Description: The system wakes up immediately after suspending. journalctl -p 3 -b reported logs that looked like these.

Steps to reproduce (if you know): You should be able to trigger it by running the command above.

Expected behavior: The machine doesn't immediately wake up after screen tears.

Other Notes: I ended up finding my fix here.

Known workaround:

I went into /etc/modprobe.d/system76-power.conf and changed this setting from on (1) to off (0):

options nvidia NVreg_PreserveVideoMemoryAllocations=0
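
To confirm which value is actually configured before and after the change, a small helper can read it back from the modprobe file. This is only a minimal sketch: the function name `nvreg_value` is made up for illustration, and it only looks at `options nvidia` lines in the file you pass it.

```shell
# Print the value assigned to an NVreg_* option on "options nvidia ..." lines.
# nvreg_value is a hypothetical helper name, not part of system76-power.
#   $1: option name, $2: modprobe config file
nvreg_value() {
    awk -v key="$1" '
        $1 == "options" && $2 == "nvidia" {
            for (i = 3; i <= NF; i++) {
                n = index($i, "=")
                if (n > 0 && substr($i, 1, n - 1) == key)
                    print substr($i, n + 1)
            }
        }
    ' "$2"
}
```

For example, `nvreg_value NVreg_PreserveVideoMemoryAllocations /etc/modprobe.d/system76-power.conf` should print `0` once the workaround is in place. The value the loaded driver is really using can also be checked in /proc/driver/nvidia/params, since the modprobe file only takes effect when the module is loaded.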

Proposed fix:

I think something in src/graphics.rs needs to be updated, possibly the S3 exception where this setting is being applied. That's the extent of the debugging I'm ready to put into this, though. I hope it helps someone else.

@agherzan

agherzan commented Sep 28, 2022

Here are the relevant errors (oryx6 on my side):

[   83.615220] NVRM: GPU 0000:01:00.0: PreserveVideoMemoryAllocations module parameter is set. System Power Management attempted without driver procfs suspend interface. Please refer to the 'Configuring Power Management Support' section in the driver README.
[   86.105347] nvidia 0000:01:00.0: PM: pci_pm_suspend(): nv_pmops_suspend+0x0/0x30 [nvidia] returns -5
[   86.106085] nvidia 0000:01:00.0: PM: dpm_run_callback(): pci_pm_suspend+0x0/0x160 returns -5
[   86.106102] nvidia 0000:01:00.0: PM: failed to suspend async: error -5
[   86.690666] PM: Some devices failed to suspend, or early wake event detected
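
To check whether a machine is hitting the same failure, the kernel log can be filtered for lines shaped like the ones above. This is a rough sketch: the helper name `suspend_errors` and the pattern are assumptions based only on the messages quoted here, so other driver versions may word things differently.

```shell
# Keep only kernel log lines that look like the NVIDIA suspend failure above.
# Reads log text on stdin; suspend_errors is a hypothetical helper name.
suspend_errors() {
    grep -E 'NVRM:|PM: .*(returns -[0-9]+|failed to suspend)'
}

# Example: scan the current boot's kernel messages.
# journalctl -k -b | suspend_errors
```

The function reads stdin so it works equally well on `dmesg` output or on a saved log file.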

@agherzan

There is also a related known issue on the Nvidia side: https://download.nvidia.com/XFree86/Linux-x86_64/470.86/README/powermanagement.html#KnownIssuesAndWf438e

@regulator-g

Just want to say I have the same issue, and I think quite a few people do. Thanks for the workaround; it's not perfect, but the laptop is no longer frozen on resume.

@JRDetwiler
Author

JRDetwiler commented Jan 27, 2023

Okay, here's the actual correct fix. Keep GPU memory preservation enabled (/etc/modprobe.d/system76-power.conf):

options nvidia NVreg_PreserveVideoMemoryAllocations=1

Instead of turning that off, enable these NVIDIA services, which are disabled by default. I rebooted and it's finally working as expected with no side effects.

sudo systemctl enable nvidia-suspend.service
sudo systemctl enable nvidia-hibernate.service
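
To verify afterwards that both units really are enabled, something like the following could be used. It is a sketch, not part of system76-power: `report_units` is a hypothetical helper name, and it relies on `systemctl is-enabled` returning success, which covers "enabled" and a few related states such as "static".

```shell
# Print, for each unit given, whether `systemctl is-enabled` succeeds for it.
# report_units is a hypothetical helper name.
report_units() {
    for unit in "$@"; do
        if systemctl is-enabled --quiet "$unit"; then
            echo "$unit: enabled"
        else
            echo "$unit: not enabled"
        fi
    done
}

# Example:
# report_units nvidia-suspend.service nvidia-hibernate.service
```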

As noted in the Arch wiki and NVIDIA's documentation, this saves GPU memory to a tmpfs, in /tmp by default. If your /tmp is small, that might have been the real issue in the first place. The config can be updated to dump to a larger, faster filesystem instead, which would likely resolve it:

options nvidia NVreg_PreserveVideoMemoryAllocations=1 NVreg_TemporaryFilePath=/path/to/tmp-nvidia
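
A quick way to sanity-check whether a target directory has enough room is to compare its free space against the GPU's memory size. The sketch below assumes total VRAM is a reasonable upper bound for what gets saved, which is a guess rather than something stated in this thread; `has_free_mib` is a hypothetical helper name.

```shell
# Succeed if the filesystem holding a directory has at least N MiB available.
# has_free_mib is a hypothetical helper name.
#   $1: directory, $2: required space in MiB
has_free_mib() {
    # df -P gives one portable-format line per filesystem; with -k, column 4
    # is the available space in 1024-byte blocks.
    avail_kib=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kib" -ge $(( $2 * 1024 )) ]
}

# Example, assuming nvidia-smi is installed and total VRAM is a fair upper bound:
# vram_mib=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n 1)
# has_free_mib /tmp "$vram_mib" || echo "/tmp may be too small for a VRAM dump"
```

The same check can be pointed at whatever directory is used for NVreg_TemporaryFilePath instead of /tmp.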

If someone can turn this into a pull request, go ahead. Maybe the system76-power graphics nvidia command isn't enabling these necessary services, or possibly isn't checking that /tmp is big enough.
