
WIP linux-drm-syncobj-v1 #411 (Draft)

ids1024 wants to merge 1 commit into master
Conversation

ids1024 (Member) commented Apr 9, 2024

Based on Smithay/smithay#1356.

I think the blocker logic should be correct for handling acquire points (if I properly understand the transaction system in Smithay). Though I don't see rendering issues with Mesa git when the blocker is removed... maybe it needs to be tested with something heavier than vkcube. (Or is there something still forcing implicit sync?).

The logic I added in Smithay for signaling releases may be a little less correct. Though maybe it's no more incorrect than how buffer releases are currently handled? (If I understand DRM correctly, with direct scanout we should make sure not to release until we've committed a new buffer and are sure the display controller won't want to read the old one.)

We'll be able to test more when the next Nvidia driver is released. This at least gives us a way to test the explicit sync support they're adding.

Presumably we should test if drmSyncobjEventfd is supported... maybe just creating a syncobj and calling that to see if it works? I'm also still a little unsure how this ends up working with multiple GPUs... particularly if one is Nvidia.

Drakulix (Member):
> I think the blocker logic should be correct for handling acquire points (if I properly understand the transaction system in Smithay). Though I don't see rendering issues with Mesa git when the blocker is removed... maybe it needs to be tested with something heavier than vkcube. (Or is there something still forcing implicit sync?)

For that you probably have to remove the old dmabuf.generate_blocker logic, which pulls a fence out of the dmabuf to do essentially the same thing. We should basically check if the client uses explicit sync, use the acquire fence if so, and otherwise fall back to polling the dmabuf directly.
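Roughly, the selection could look like this (a sketch with illustrative signatures: `DrmSyncPointBlocker` comes from this PR and `generate_blocker` is the existing Smithay helper, but the surrounding types are placeholders):

```rust
// Sketch of the intended blocker selection; `SyncPoint`, `Dmabuf` and
// `Blocker` stand in for the real Smithay types.
fn blocker_for_commit(
    acquire_point: Option<SyncPoint>,
    dmabuf: Option<Dmabuf>,
) -> Option<Box<dyn Blocker>> {
    if let Some(acquire) = acquire_point {
        // Explicit sync: block the transaction on the acquire timeline point.
        Some(Box::new(DrmSyncPointBlocker::new(acquire)))
    } else if let Some(buf) = dmabuf {
        // Implicit sync fallback: poll the dmabuf's implicit fence directly.
        buf.generate_blocker().ok().map(|b| Box::new(b) as Box<dyn Blocker>)
    } else {
        // No buffer attached; nothing to wait on.
        None
    }
}
```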

> The logic I added in Smithay for signaling releases may be a little less correct. Though maybe it's no more incorrect than how buffer releases are currently handled? (If I understand DRM correctly, with direct scanout we should make sure not to release until we've committed a new buffer and are sure the display controller won't want to read the old one.)

Yeah, I am pretty sure Smithay's code isn't correct in what it does today, but given that direct scanout mostly works, it is probably relying on implicit sync in the background.

What needs to happen is to store the fence generated by compositing, or, in the case of direct scanout, to get the OUT_FENCE_PTR property of the respective plane. Once the buffer is replaced and won't be used for new rendering/scanout operations, we can wait for that fence and signal the release point once it is done. (Maybe there is even a kernel API to "signal once this other fence is signaled"?)
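For the direct-scanout side, fetching the out-fence could look roughly like this (an FFI sketch against libdrm; `req`, `crtc_id`, and the "OUT_FENCE_PTR" property id are assumed to come from the surrounding atomic-commit code):

```rust
use std::os::raw::{c_int, c_void};

// FFI declarations matching libdrm's xf86drmMode.h; the atomic request is
// treated as an opaque pointer here.
extern "C" {
    fn drmModeAtomicAddProperty(req: *mut c_void, object_id: u32, property_id: u32, value: u64) -> c_int;
    fn drmModeAtomicCommit(fd: c_int, req: *mut c_void, flags: u32, user_data: *mut c_void) -> c_int;
}

const DRM_MODE_ATOMIC_NONBLOCK: u32 = 0x0200;

unsafe fn commit_with_out_fence(
    fd: c_int,
    req: *mut c_void,
    crtc_id: u32,
    out_fence_ptr_prop: u32,
) -> (c_int, i32) {
    // The kernel writes a sync_file fd into this i32 during the commit.
    let mut out_fence: i32 = -1;
    drmModeAtomicAddProperty(req, crtc_id, out_fence_ptr_prop, &mut out_fence as *mut i32 as u64);
    let ret = drmModeAtomicCommit(fd, req, DRM_MODE_ATOMIC_NONBLOCK, std::ptr::null_mut());
    // `out_fence` signals once this frame is on screen, i.e. once the
    // previously scanned-out buffer is no longer read by the display.
    (ret, out_fence)
}
```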

> Presumably we should test if drmSyncobjEventfd is supported... maybe just creating a syncobj and calling that to see if it works? I'm also still a little unsure how this ends up working with multiple GPUs... particularly if one is Nvidia.

Yeah, IMO this needs a compile test similar to what we do for gbm in Smithay: https://github.com/Smithay/smithay/blob/master/build.rs#L99-L125

So if the system's local kernel/DRM headers support it, we enable the feature and assume the kernel does as well. I don't know how well runtime detection would work, but we have to make sure not to advertise the global if this function isn't supported.
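A minimal build.rs probe in that spirit might look like this (a sketch; assumes a `cc` build-dependency, and the libdrm include path may need pkg-config instead of a hardcoded directory):

```rust
// build.rs sketch: try to compile a snippet that references
// drmSyncobjEventfd and enable a cfg flag if the headers declare it.
fn main() {
    let probe = r#"
        #include <xf86drm.h>
        int main(void) { return (int)(long)drmSyncobjEventfd; }
    "#;
    let out = std::path::PathBuf::from(std::env::var("OUT_DIR").unwrap());
    let src = out.join("syncobj_eventfd_probe.c");
    std::fs::write(&src, probe).unwrap();

    let supported = cc::Build::new()
        .file(&src)
        .include("/usr/include/libdrm") // assumption: headers live here
        .warnings(false)
        .try_compile("syncobj_eventfd_probe")
        .is_ok();

    if supported {
        // Gates the syncobj code path at compile time.
        println!("cargo:rustc-cfg=syncobj_eventfd");
    }
}
```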

ids1024 (Member, Author) commented Apr 10, 2024

> For that you probably have to remove the old dmabuf.generate_blocker logic, which pulls a fence out of the dmabuf to do essentially the same thing. We should basically check if the client uses explicit sync, use the acquire fence if so, and otherwise fall back to polling the dmabuf directly.

This should already be doing that. If there's an acquire point, it adds a DrmSyncPointBlocker and skips adding the DmabufBlocker.

> Yeah, I am pretty sure Smithay's code isn't correct in what it does today, but given that direct scanout mostly works, it is probably relying on implicit sync in the background.

I'm not sure if implicit sync does something to help with releases (blocking client writes to the buffer until the display controller is done with it)... but yeah, it does seem to mostly work. If implicit sync isn't involved there, explicit sync won't be any more problematic; it just has the same limitation.

> (Maybe there is even a kernel API to "signal once this other fence is signaled"?)

It should be possible to do something with drmSyncobjTransfer and such. Probably the other implementations of the protocol do something like that, so we can look at those.
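For illustration, the transfer could look roughly like this (an FFI sketch against libdrm's known syncobj calls; error handling omitted, and the handles/points are assumed to come from the protocol implementation):

```rust
use std::os::unix::io::RawFd;

// FFI declarations matching libdrm's xf86drm.h (assumes linking against libdrm).
extern "C" {
    fn drmSyncobjCreate(fd: RawFd, flags: u32, handle: *mut u32) -> i32;
    fn drmSyncobjImportSyncFile(fd: RawFd, handle: u32, sync_file_fd: RawFd) -> i32;
    fn drmSyncobjDestroy(fd: RawFd, handle: u32) -> i32;
    fn drmSyncobjTransfer(
        fd: RawFd,
        dst_handle: u32,
        dst_point: u64,
        src_handle: u32,
        src_point: u64,
        flags: u32,
    ) -> i32;
}

// Sketch: materialize a sync_file fd (e.g. the KMS out-fence) as a temporary
// binary syncobj, then copy it onto `point` of the client's release timeline,
// so the release point signals when the fence does.
unsafe fn fence_to_release_point(drm: RawFd, fence_fd: RawFd, release_handle: u32, point: u64) -> i32 {
    let mut tmp: u32 = 0;
    drmSyncobjCreate(drm, 0, &mut tmp);
    drmSyncobjImportSyncFile(drm, tmp, fence_fd); // binary syncobj <- sync_file
    let ret = drmSyncobjTransfer(drm, release_handle, point, tmp, 0, 0);
    drmSyncobjDestroy(drm, tmp);
    ret
}
```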

We just have to make sure to only do that once the buffer is no longer used elsewhere in the compositor.

Drakulix (Member):

> I'm not sure if implicit sync does something to help with releases (blocking client writes to the buffer until the display controller is done with it)... but yeah, it does seem to mostly work. If implicit sync isn't involved there, explicit sync won't be any more problematic; it just has the same limitation.

It definitely does, but the NVIDIA driver at least isn't doing that correctly when you send dmabufs directly to KMS without going through egl-gbm and allocating an EGLSurface, which is why we have the needs_sync workaround in Smithay. But this should work for Mesa.

>> (Maybe there is even a kernel API to "signal once this other fence is signaled"?)

> It should be possible to do something with drmSyncobjTransfer and such. Probably the other implementations of the protocol do something like that, so we can look at those.

👍

> We just have to make sure to only do that once the buffer is no longer used elsewhere in the compositor.

I am pretty sure that is what our current release logic does, with the exception of handling direct scanout, which would be covered by the fence anyway in this case, so that should be correct. We just need to adjust the DrmCompositor to extract the out-fence and expose it in the RenderResult.

ids1024 (Member, Author) commented Apr 11, 2024

> I am pretty sure that is what our current release logic does, with the exception of handling direct scanout, which would be covered by the fence anyway in this case, so that should be correct. We just need to adjust the DrmCompositor to extract the out-fence and expose it in the RenderResult.

I mean if we want to have OUT_FENCE_PTR directly signal the release point:

* We can't do that if we are using the same buffer for direct scanout or rendering elsewhere (on a different monitor; screencopy).
* We can't do that if we might use the buffer again, which could even be the case after another buffer has been committed, since the new buffer may still be blocked when we want to render the next frame.

So I'm not sure when we could actually do that? Maybe with commit-queue-v1, where we might know the next buffer is ready, but aren't using it until the next frame.

So at least for now I think we need to stick to signaling the release point from the CPU? But we should still track when OUT_FENCE_PTR has signaled that scanout is done with the buffer.
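Something like this calloop-based watcher could cover the CPU-side signaling (a sketch, assuming calloop as used by Smithay; `ReleasePoint::signal` is a hypothetical helper for the client's release timeline point):

```rust
use calloop::generic::Generic;
use calloop::{Interest, LoopHandle, Mode, PostAction};
use std::os::fd::OwnedFd;

// `fence_fd` is the sync_file fd obtained via OUT_FENCE_PTR. A sync_file fd
// becomes readable once its fence has signaled, so a readiness source suffices.
fn watch_scanout_fence<D>(handle: &LoopHandle<'static, D>, fence_fd: OwnedFd, release: ReleasePoint) {
    handle
        .insert_source(
            Generic::new(fence_fd, Interest::READ, Mode::Level),
            move |_, _, _state| {
                // The display controller is done with the old buffer.
                release.signal();
                Ok(PostAction::Remove)
            },
        )
        .expect("failed to insert fence source");
}
```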

Drakulix (Member):

> I mean if we want to have OUT_FENCE_PTR directly signal the release point:

> * We can't do that if we are using the same buffer for direct scanout or rendering elsewhere (on a different monitor; screencopy).

Right, so this rather needs to be a list of fences to wait for, meaning we probably have to wait and signal ourselves instead of relying on drmSyncobjTransfer.

> * We can't do that if we might use the buffer again, which could even be the case after another buffer has been committed, since the new buffer may still be blocked when we want to render the next frame.

But merging of the state should only happen once all blockers are resolved, and only then do we release, so I believe that issue is already handled correctly. Nothing would be able to use that buffer for rendering anymore at that point.

But we still need to track the buffer to be able to signal later, so we might as well unify the approach and handle the release event the same way. I feel like this could benefit from some infrastructure and refactoring in Smithay.

> So I'm not sure when we could actually do that? Maybe with commit-queue-v1, where we might know the next buffer is ready, but aren't using it until the next frame.

I think we can implement both fifo and commit-queue with blockers as well.

> So at least for now I think we need to stick to signaling the release point from the CPU? But we should still track when OUT_FENCE_PTR has signaled that scanout is done with the buffer.

Yeah, I am coming to the same conclusion, but that isn't too bad, as that is just another fd in the loop and a very small action.

```diff
@@ -450,6 +451,9 @@ pub fn init_backend(
     // Create relative pointer global
     RelativePointerManagerState::new::<State>(&dh);

+    // TODO check if main device supports syncobj eventfd?
```
Member (review comment on the diff):
I guess worst case we can still fail the import_timeline request, right?
Is Xwayland/Mesa able to handle this and fall back to implicit sync?

Member (Author):

I don't think there's any way to fall back once we expose the global. The protocol doesn't make import_timeline failable, except as a protocol error.

It looks like Mutter uses `drmSyncobjEventfd(drm_fd, 0, 0, -1, 0) != -1 || errno != ENOENT` to check if the call is supported. So we probably want to do the same, and only create the global if it is supported.
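Translated to our side, that probe could look like this (a sketch: the deliberately bogus arguments make the call fail either way, and only a kernel lacking the eventfd ioctl reports ENOENT):

```rust
use std::os::unix::io::RawFd;

// FFI declaration matching libdrm's xf86drm.h (assumes linking against libdrm).
extern "C" {
    fn drmSyncobjEventfd(fd: RawFd, handle: u32, point: u64, ev_fd: RawFd, flags: u32) -> i32;
}

const ENOENT: i32 = 2; // from errno.h

// Mutter-style runtime check, as quoted above.
fn supports_syncobj_eventfd(drm_fd: RawFd) -> bool {
    let ret = unsafe { drmSyncobjEventfd(drm_fd, 0, 0, -1, 0) };
    ret != -1 || std::io::Error::last_os_error().raw_os_error() != Some(ENOENT)
}
```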
