Inefficient code generation of @mod and @divFloor #19935

Open
ListeriaM opened this issue May 10, 2024 · 4 comments
Labels
bug Observed behavior contradicts documented or intended behavior

Comments

@ListeriaM

Zig Version

0.12.0

Steps to Reproduce and Observed Behavior

With signed integers and a power-of-2 denominator, the assembly generated for @mod and @divFloor contains a number of unnecessary instructions (Compiler Explorer)

Expected Behavior

A single and instruction (for @mod) or sar instruction (for @divFloor) would be enough in this case (Compiler Explorer). Note that the example only works for comptime_int, but I'd expect it to work for any comptime-known value regardless of the type. Ideally such a workaround wouldn't be necessary and I could use @mod and @divFloor directly.
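
As a rough illustration of the kind of hand-written replacement described above (this is not the code from the linked Compiler Explorer example), the two operations can be spelled out for a fixed denominator of 16. The helper names `mod16` and `divFloor16` are hypothetical, and the sketch assumes Zig 0.12 semantics, where `>>` on a signed integer is an arithmetic shift.

```zig
const std = @import("std");

// Hypothetical helpers for a fixed denominator of 16 (2^4).

/// Floored remainder by 16: keep the low 4 bits.
fn mod16(x: i32) i32 {
    return x & 15;
}

/// Floored division by 16: arithmetic right shift by 4.
fn divFloor16(x: i32) i32 {
    return x >> 4;
}

test "hand-written forms agree with the builtins" {
    var x: i32 = -64;
    while (x <= 64) : (x += 1) {
        try std.testing.expectEqual(@mod(x, 16), mod16(x));
        try std.testing.expectEqual(@divFloor(x, 16), divFloor16(x));
    }
}
```

Running this with `zig test` checks the results only; it says nothing about which instructions the compiler emits for either form.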

@ListeriaM ListeriaM added the bug Observed behavior contradicts documented or intended behavior label May 10, 2024
@xdBronch
Contributor

xdBronch commented May 10, 2024

I don't think these transformations are valid when the number is negative. Depending on what you're doing and what you can guarantee, using @divExact or adding if (num < 0) unreachable; will give the codegen you're asking for. In any case, unless you see an inefficiency in the LLVM IR Zig emits, optimizations like this are generally up to LLVM, not us.
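
The two suggestions in this comment could look roughly like the sketch below. The function names and the denominator of 16 are assumptions made for illustration. The sketch shows only the source-level form and does not verify the resulting assembly.

```zig
const std = @import("std");

// Hypothetical example: tell the compiler the numerator is never negative.
fn divFloorNonNegative(num: i32) i32 {
    // Caller guarantees num >= 0. Safe builds panic if this is violated.
    if (num < 0) unreachable;
    return @divFloor(num, 16);
}

// Hypothetical example: the caller guarantees num is a multiple of 16.
fn divExact16(num: i32) i32 {
    return @divExact(num, 16);
}

test "both variants give the expected quotient when their precondition holds" {
    try std.testing.expectEqual(@as(i32, 2), divFloorNonNegative(35));
    try std.testing.expectEqual(@as(i32, 3), divExact16(48));
}
```

Both variants move a precondition onto the caller, so they only apply when that guarantee actually holds for every input.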

For comparison, here's the equivalent in C (https://godbolt.org/z/EnP6MEaP1); in all cases it's identical when you use the builtins that map to what C does.

@ListeriaM
Author

ListeriaM commented May 10, 2024

That's not quite the same: these transformations work for negative numerators as well (note that it's using sar instead of shr). I can't speak for the LLVM IR, as I'm not as well versed in it. In any case, I'm not sure we should expect LLVM to be able to find a transformation like that.
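
A small worked case may make the point about negative numerators concrete. The sketch below uses -7 and a denominator of 16, both chosen arbitrarily, and assumes Zig 0.12 builtin syntax. It compares the floored builtins with the mask and arithmetic shift, then shows how the truncating builtins and a logical shift differ.

```zig
const std = @import("std");

test "floored vs truncated results for a negative numerator" {
    const x: i32 = -7;

    // Floored forms agree with the arithmetic shift and the mask.
    try std.testing.expectEqual(@as(i32, -1), @divFloor(x, 16));
    try std.testing.expectEqual(@as(i32, -1), x >> 4);
    try std.testing.expectEqual(@as(i32, 9), @mod(x, 16));
    try std.testing.expectEqual(@as(i32, 9), x & 15);

    // Truncating forms (the behavior of C's / and %) give different answers.
    try std.testing.expectEqual(@as(i32, 0), @divTrunc(x, 16));
    try std.testing.expectEqual(@as(i32, -7), @rem(x, 16));

    // A logical shift of the same bit pattern does not preserve the sign.
    const bits: u32 = @bitCast(x);
    try std.testing.expectEqual(@as(u32, 0x0FFF_FFFF), bits >> 4);
}
```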

@xdBronch
Contributor

Hm, yeah, my bad, that looks right. I don't know where the logic in Zig is for this stuff, or whether it generally attempts any kind of optimization on its own.

@ListeriaM
Author

ListeriaM commented May 10, 2024

In case it helps, I uploaded the code I was writing when I found this to a repo with 2 branches: one uses the builtins and the other uses the alternatives mentioned here.

2 participants