arch-riscv: add agnostic option to vector tail/mask policy for mem and arith instructions #1135
base: develop
@hnpl, if you have time, could you please review this PR? Thank you.
The code looks good to me. I agree with the use of pinned writes as it allows parallel writes and avoids register renaming. This is probably more realistic than the current implementation.
I don't have tests at my disposal to test the new implementation of the affected instructions. Please let me know if you've tested those instructions!
Hi. I have at least tested one instruction from each of the arithmetic instruction formats, and on the mem side I've tested them all in theory.
That's great!
22106ed to 25b3e6e
Change-Id: I567a110806b77d5576810706bd3e30185b0e0b75
25b3e6e to 7354a22
On our long-running simulations we have hit a bug. @saul44203 has identified the issue. I would hold the merge until it is fixed.
Change-Id: I693b5f3a6cc8a8f320be26b214fd9b359e541f14
7354a22 to b779062
@saul44203, @aarmejach: Since you are still working on this, should we remove it from the goals for this release?
Yes, we are probably close, but since time is tight I think we should not rush, so let's drop it from this release.
These two commits add agnostic capability for both the tail and mask policies, for vector memory and arithmetic instructions respectively. The common policy is for an instruction to act as undisturbed if either policy (tail or mask) is undisturbed, and to write all 1s only when both are agnostic.
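The combined policy described above can be sketched as follows. This is only an illustration of my reading of the description, not the gem5 code: the function name `write_dest`, the boolean encoding of `vta`/`vma` (True = agnostic) and the 8-bit element width are all assumptions made for the example.

```python
ALL_ONES = 0xFF  # assumed 8-bit elements, for illustration only


def write_dest(old_vd, computed, mask, vl, vta, vma):
    """Return the destination register contents after one vector op.

    vta / vma: True means agnostic, False means undisturbed.
    Hypothetical helper; it mirrors the policy described in the PR text.
    """
    # Act as undisturbed unless BOTH policies are agnostic.
    keep_old = not (vta and vma)
    out = []
    for i, old in enumerate(old_vd):
        active = i < vl and mask[i]
        if active:
            out.append(computed[i])   # body element, mask enabled
        elif keep_old:
            out.append(old)           # tail or masked-off: keep old value
        else:
            out.append(ALL_ONES)      # tail or masked-off: write all 1s
    return out


old = [1, 2, 3, 4]
new = [10, 20, 30, 40]
mask = [True, False, True, True]

both_agnostic = write_dest(old, new, mask, vl=3, vta=True, vma=True)
one_undisturbed = write_dest(old, new, mask, vl=3, vta=True, vma=False)
```

Only the `both_agnostic` case avoids reading `old`, which is what removes the dependency on the old destination register. Treating a mixed setting as fully undisturbed is allowed, since an agnostic element may either keep its old value or be overwritten with 1s.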
For those instructions in which multiple micro instructions are instantiated to write to the same register (`VlStride` and `VlIndex` for memory, and `VectorGather`, `VectorSlideUp` and `VectorSlideDown` for arithmetic), a new micro instruction named `VPinVdCpyVsMicroInst` has been used to pin the destination register so that there's no need to copy the partial results between them. This idea is similar to what's in ARM's SVE code. This micro instruction also implements the tail/mask policy for these cases.

Finally, it's worth noting that while using an agnostic policy for both tail and mask should now remove all dependencies on old destination registers, there's an exception with `VectorSlideUp`. The `vslideup_{vx,vi}` instructions need the destination elements below the offset to remain unchanged. The current implementation overrides the current vta/vma and makes them act as undisturbed, since these instructions require the old destination register anyway. There's a minor issue with this though: the `v{,f}slide1up` variants do not need this, but since they share the same constructor, they behave the same way.

Related issue #997.
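To illustrate why a slide-up still depends on the old destination register even when both policies are agnostic, here is a minimal sketch of the architectural behaviour. It is not the gem5 implementation: the function name `vslideup`, the boolean `vta`/`vma` flags (True = agnostic), the 8-bit elements and the assumption that `vstart` is 0 are all made up for the example.

```python
ALL_ONES = 0xFF  # assumed 8-bit elements, for illustration only


def vslideup(old_vd, vs2, offset, vl, mask=None, vta=True, vma=True):
    """Sketch of slide-up semantics with vstart assumed to be 0.

    vta / vma: True means agnostic (modelled here as writing all 1s),
    False means undisturbed. Hypothetical helper, not gem5 code.
    """
    n = len(old_vd)
    if mask is None:
        mask = [True] * n
    out = list(old_vd)
    for i in range(n):
        if i >= vl:
            if vta:
                out[i] = ALL_ONES      # tail element, agnostic
        elif i < offset:
            pass                       # below the offset: always unchanged
        elif not mask[i]:
            if vma:
                out[i] = ALL_ONES      # masked-off element, agnostic
        else:
            out[i] = vs2[i - offset]   # active element: slid source value
    return out


old = [1, 2, 3, 4, 5, 6]
src = [10, 20, 30, 40, 50, 60]
agnostic = vslideup(old, src, offset=2, vl=5)
undisturbed = vslideup(old, src, offset=2, vl=5, vta=False, vma=False)
```

In both results the first `offset` elements come from `old`, so the old destination value is a true source operand whatever vta/vma say. Forcing undisturbed behaviour, as the description says the PR does, then only changes the tail and masked-off elements, and keeping old values there is a legal outcome under an agnostic policy.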