Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL2020] Implement sycl::atomic_fence #1031

Open
wants to merge 3 commits into
base: develop
Choose a base branch
from

Conversation

nmnobre
Copy link
Member

@nmnobre nmnobre commented May 1, 2023

Hi Aksel,

Less than a year later, I'm trying to bring it full circle here. 馃コ
This (properly) closes #767.

Let me know what you think so far.
I haven't implemented the SSCP path because I could only find existing implementations for barriers but not memory fences, and so I think I should wait for you to tell me where and how you want this. 馃檪

Cheers,
-N

@nmnobre nmnobre changed the title Implement sycl::atomic_fence [SYCL 2020] Implement sycl::atomic_fence May 1, 2023
@nmnobre nmnobre changed the title [SYCL 2020] Implement sycl::atomic_fence [SYCL2020] Implement sycl::atomic_fence May 1, 2023
@nmnobre nmnobre requested a review from illuhad May 1, 2023 15:40
Copy link
Collaborator

@illuhad illuhad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea to work on this! There's already https://github.com/OpenSYCL/OpenSYCL/blob/develop/include/hipSYCL/sycl/libkernel/detail/mem_fence.hpp
which does basically the same thing, except that it comes from the SYCL 1.2.1 days. Can we implement the old mem_fence on top of the newer one?

For SSCP I would suggest looking at barrier - every barrier is effectively a combination of a memfence and a blocking synchronization primitive. So probably we can factor the mem_fence out from there :)

// older, so comment out that statement for now.
//__hipsycl_if_target_host(/* todo */);
}
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the benefit of these structs, when we do runtime if-else in atomic_fence() anyway?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Little perhaps.
It was mostly to keep it similar to mem_fence.hpp and perhaps bring more clarity to which of the order and scope combinations we actually support.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Completely rewritten, structs gone.

@nmnobre
Copy link
Member Author

nmnobre commented May 3, 2023

Can we implement the old mem_fence on top of the newer one?

Done.

For SSCP I would suggest looking at barrier - every barrier is effectively a combination of a memfence and a blocking synchronization primitive. So probably we can factor the mem_fence out from there :)

Done.

Can I get the declaration of __spirv_MemoryBarrier from spirv_ops.hpp in fence.cpp or better not? I did not, just in case we want a clear separation.

As written, the semantics for seq_cst + device for explicit multipass SPIR-V and SSCP SPIR-V is different as the memory semantics flags differ. I suspect the former includes MemorySemanticsMask::WorkgroupMemory because mem_fence could (and still can) use fence_space::global_and_local, but we were using it even with fence_space::global_space. In the barrier implementation we do not include MemorySemanticsMask::WorkgroupMemory. What should we do here?

I've completely rewritten my implementation so it reflects my understanding of the intended semantics.
There are now no differences for SPIR-V.
For HIP-like devices, the SSCP path might be more precise, in the sense that the IR fence primitives we end up calling expose more choices than just __threadfence_block() and __threadfence().
It may well be those IR primitives only implement sequential consistency at the workgroup and device scopes as well - I dunno - but we don't care about that... Indeed, if that were to change, SSCP compilations would automatically be taking advantage.

@nmnobre nmnobre force-pushed the atomic_fence branch 3 times, most recently from 6ebbc85 to b4e5618 Compare May 5, 2023 12:46
@nmnobre nmnobre requested a review from illuhad May 5, 2023 14:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Alternatives for non-standard sub_group::shuffle_up/down/xor and yet-to-be-implemented atomic_fence
2 participants