Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[InstCombine] Failure to factorize add/sub and max/min using distributivity #92433

Open
Kmeakin opened this issue May 16, 2024 · 7 comments
Open
Assignees
Labels
good first issue https://github.com/llvm/llvm-project/contribute llvm:instcombine missed-optimization

Comments

@Kmeakin
Copy link
Contributor

Kmeakin commented May 16, 2024

alive proof

use std::cmp::max;
use std::cmp::min;

// * `(a umin c) umax (b umin c) => (a umax b) umin c`
#[no_mangle]
pub fn src1(a: u32, b: u32, c: u32) -> u32 {
    max(min(a, c), min(b, c))
}
#[no_mangle]
pub fn tgt1(a: u32, b: u32, c: u32) -> u32 {
    min(max(a, b), c)
}

// * `(a umax c) umin (b umax c) => (a umin b) umax c`
#[no_mangle]
pub fn src2(a: u32, b: u32, c: u32) -> u32 {
    min(max(a, c), max(b, c))
}
#[no_mangle]
pub fn tgt2(a: u32, b: u32, c: u32) -> u32 {
    max(min(a, b), c)
}

// * `(a smin c) smax (b smin c) => (a smax b) smin c`
#[no_mangle]
pub fn src3(a: i32, b: i32, c: i32) -> i32 {
    max(min(a, c), min(b, c))
}
#[no_mangle]
pub fn tgt3(a: i32, b: i32, c: i32) -> i32 {
    min(max(a, b), c)
}

// * `(a smax c) smin (b smax c) => (a smin b) smax c`
#[no_mangle]
pub fn src4(a: i32, b: i32, c: i32) -> i32 {
    min(max(a, c), max(b, c))
}
#[no_mangle]
pub fn tgt4(a: i32, b: i32, c: i32) -> i32 {
    max(min(a, b), c)
}

// * `umax(a +usat b, a +usat c) => a +usat (b umax c)`
#[no_mangle]
pub fn src5(a: u32, b: u32, c: u32) -> u32 {
    max(a.saturating_add(b), a.saturating_add(c))
}
#[no_mangle]
pub fn tgt5(a: u32, b: u32, c: u32) -> u32 {
    a.saturating_add(max(b, c))
}

// * `umax(a +nuw b, a +nuw c) => a +nuw (b umax c)`
#[no_mangle]
pub fn src6(a: u32, b: u32, c: u32) -> u32 {
    unsafe { max(a.unchecked_add(b), a.unchecked_add(c)) }
}
#[no_mangle]
pub fn tgt6(a: u32, b: u32, c: u32) -> u32 {
    unsafe { a.unchecked_add(max(b, c)) }
}

// * `umin(a +usat b, a +usat c) => a +usat (b umin c)`
#[no_mangle]
pub fn src7(a: u32, b: u32, c: u32) -> u32 {
    min(a.saturating_add(b), a.saturating_add(c))
}
#[no_mangle]
pub fn tgt7(a: u32, b: u32, c: u32) -> u32 {
    a.saturating_add(min(b, c))
}

// * `umin(a +nuw b, a +nuw c) => a +nuw (b umin c)`
#[no_mangle]
pub fn src8(a: u32, b: u32, c: u32) -> u32 {
    unsafe { min(a.unchecked_add(b), a.unchecked_add(c)) }
}
#[no_mangle]
pub fn tgt8(a: u32, b: u32, c: u32) -> u32 {
    unsafe { a.unchecked_add(min(b, c)) }
}

// * `smax(a +ssat b, a +ssat c) => a +ssat (b smax c)`
#[no_mangle]
pub fn src10(a: i32, b: i32, c: i32) -> i32 {
    max(a.saturating_add(b), a.saturating_add(c))
}
#[no_mangle]
pub fn tgt10(a: i32, b: i32, c: i32) -> i32 {
    a.saturating_add(max(b, c))
}

// * `smax(a +nsw b, a +nsw c) => a +nsw (b smax c)`
#[no_mangle]
pub fn src9(a: i32, b: i32, c: i32) -> i32 {
    unsafe { max(a.unchecked_add(b), a.unchecked_add(c)) }
}
#[no_mangle]
pub fn tgt9(a: i32, b: i32, c: i32) -> i32 {
    unsafe { a.unchecked_add(max(b, c)) }
}

// * `smin(a +ssat b, a +ssat c) => a +ssat (b smin c)`
#[no_mangle]
pub fn src11(a: i32, b: i32, c: i32) -> i32 {
    min(a.saturating_add(b), a.saturating_add(c))
}
#[no_mangle]
pub fn tgt11(a: i32, b: i32, c: i32) -> i32 {
    a.saturating_add(min(b, c))
}

// * `smin(a +nsw b, a +nsw c) => a +nsw (b smin c)`
#[no_mangle]
pub fn src12(a: i32, b: i32, c: i32) -> i32 {
    unsafe { min(a.unchecked_add(b), a.unchecked_add(c)) }
}
#[no_mangle]
pub fn tgt12(a: i32, b: i32, c: i32) -> i32 {
    unsafe { a.unchecked_add(min(b, c)) }
}
@Kmeakin
Copy link
Contributor Author

Kmeakin commented May 16, 2024

Similar rewrites may be possible for floating point with appropriate fast-math flags, but alive times out

@dtcxzyw
Copy link
Member

dtcxzyw commented May 17, 2024

I confirmed that src6/src8/src9/src12 exists in some real-world applications.

@dtcxzyw dtcxzyw added the good first issue https://github.com/llvm/llvm-project/contribute label May 17, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented May 17, 2024

Hi!

This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:

  1. Check that no other contributor has already been assigned to this issue. If you believe that no one is actually working on it despite an assignment, ping the person. After one week without a response, the assignee may be changed.
  2. In the comments of this issue, request for it to be assigned to you, or just create a pull request after following the steps below. Mention this issue in the description of the pull request.
  3. Fix the issue locally.
  4. Run the test suite locally. Remember that the subdirectories under test/ create fine-grained testing targets, so you can e.g. use make check-clang-ast to only run Clang's AST tests.
  5. Create a Git commit.
  6. Run git clang-format HEAD~1 to format your changes.
  7. Open a pull request to the upstream repository on GitHub. Detailed instructions can be found in GitHub's documentation. Mention this issue in the description of the pull request.

If you have any further questions about this issue, don't hesitate to ask via a comment in the thread below.

@llvmbot
Copy link
Collaborator

llvmbot commented May 17, 2024

@llvm/issue-subscribers-good-first-issue

Author: Karl Meakin (Kmeakin)

[alive proof](https://alive2.llvm.org/ce/z/2t4UfC)
use std::cmp::max;
use std::cmp::min;

// * `(a umin c) umax (b umin c) => (a umax b) umin c`
#[no_mangle]
pub fn src1(a: u32, b: u32, c: u32) -> u32 {
    max(min(a, c), min(b, c))
}
#[no_mangle]
pub fn tgt1(a: u32, b: u32, c: u32) -> u32 {
    min(max(a, b), c)
}

// * `(a umax c) umin (b umax c) => (a umin b) umax c`
#[no_mangle]
pub fn src2(a: u32, b: u32, c: u32) -> u32 {
    min(max(a, c), max(b, c))
}
#[no_mangle]
pub fn tgt2(a: u32, b: u32, c: u32) -> u32 {
    max(min(a, b), c)
}

// * `(a smin c) smax (b smin c) => (a smax b) smin c`
#[no_mangle]
pub fn src3(a: i32, b: i32, c: i32) -> i32 {
    max(min(a, c), min(b, c))
}
#[no_mangle]
pub fn tgt3(a: i32, b: i32, c: i32) -> i32 {
    min(max(a, b), c)
}

// * `(a smax c) smin (b smax c) => (a smin b) smax c`
#[no_mangle]
pub fn src4(a: i32, b: i32, c: i32) -> i32 {
    min(max(a, c), max(b, c))
}
#[no_mangle]
pub fn tgt4(a: i32, b: i32, c: i32) -> i32 {
    max(min(a, b), c)
}

// * `umax(a +usat b, a +usat c) => a +usat (b umax c)`
#[no_mangle]
pub fn src5(a: u32, b: u32, c: u32) -> u32 {
    max(a.saturating_add(b), a.saturating_add(c))
}
#[no_mangle]
pub fn tgt5(a: u32, b: u32, c: u32) -> u32 {
    a.saturating_add(max(b, c))
}

// * `umax(a +nuw b, a +nuw c) => a +nuw (b umax c)`
#[no_mangle]
pub fn src6(a: u32, b: u32, c: u32) -> u32 {
    unsafe { max(a.unchecked_add(b), a.unchecked_add(c)) }
}
#[no_mangle]
pub fn tgt6(a: u32, b: u32, c: u32) -> u32 {
    unsafe { a.unchecked_add(max(b, c)) }
}

// * `umin(a +usat b, a +usat c) => a +usat (b umin c)`
#[no_mangle]
pub fn src7(a: u32, b: u32, c: u32) -> u32 {
    min(a.saturating_add(b), a.saturating_add(c))
}
#[no_mangle]
pub fn tgt7(a: u32, b: u32, c: u32) -> u32 {
    a.saturating_add(min(b, c))
}

// * `umin(a +nuw b, a +nuw c) => a +nuw (b umin c)`
#[no_mangle]
pub fn src8(a: u32, b: u32, c: u32) -> u32 {
    unsafe { min(a.unchecked_add(b), a.unchecked_add(c)) }
}
#[no_mangle]
pub fn tgt8(a: u32, b: u32, c: u32) -> u32 {
    unsafe { a.unchecked_add(min(b, c)) }
}

// * `smax(a +ssat b, a +ssat c) => a +ssat (b smax c)`
#[no_mangle]
pub fn src10(a: i32, b: i32, c: i32) -> i32 {
    max(a.saturating_add(b), a.saturating_add(c))
}
#[no_mangle]
pub fn tgt10(a: i32, b: i32, c: i32) -> i32 {
    a.saturating_add(max(b, c))
}

// * `smax(a +nsw b, a +nsw c) => a +nsw (b smax c)`
#[no_mangle]
pub fn src9(a: i32, b: i32, c: i32) -> i32 {
    unsafe { max(a.unchecked_add(b), a.unchecked_add(c)) }
}
#[no_mangle]
pub fn tgt9(a: i32, b: i32, c: i32) -> i32 {
    unsafe { a.unchecked_add(max(b, c)) }
}

// * `smin(a +ssat b, a +ssat c) => a +ssat (b smin c)`
#[no_mangle]
pub fn src11(a: i32, b: i32, c: i32) -> i32 {
    min(a.saturating_add(b), a.saturating_add(c))
}
#[no_mangle]
pub fn tgt11(a: i32, b: i32, c: i32) -> i32 {
    a.saturating_add(min(b, c))
}

// * `smin(a +nsw b, a +nsw c) => a +nsw (b smin c)`
#[no_mangle]
pub fn src12(a: i32, b: i32, c: i32) -> i32 {
    unsafe { min(a.unchecked_add(b), a.unchecked_add(c)) }
}
#[no_mangle]
pub fn tgt12(a: i32, b: i32, c: i32) -> i32 {
    unsafe { a.unchecked_add(min(b, c)) }
}

@Kmeakin
Copy link
Contributor Author

Kmeakin commented May 17, 2024

This also works for unsigned multiplication (but not signed): https://alive2.llvm.org/ce/z/jJE_FE

use std::cmp::max;
use std::cmp::min;

// * `umin(a *nuw b, a *nuw c) => a *nuw (b umin c)`
#[no_mangle]
pub unsafe fn src1(a: u8, b: u8, c: u8) -> u8 {
    min(a.unchecked_mul(b), a.unchecked_mul(c))
}

#[no_mangle]
pub unsafe fn tgt1(a: u8, b: u8, c: u8) -> u8 {
    a.unchecked_mul(min(b, c))
}

// * `umin(a *usat b, a *usat c) => a *usat (b umin c)`
#[no_mangle]
pub unsafe fn src2(a: u8, b: u8, c: u8) -> u8 {
    min(a.saturating_mul(b), a.unchecked_mul(c))
}

#[no_mangle]
pub unsafe fn tgt2(a: u8, b: u8, c: u8) -> u8 {
    a.saturating_mul(min(b, c))
}

// * `umax(a *nuw b, a *nuw c) => a *nuw (b umax c)`
#[no_mangle]
pub unsafe fn src3(a: u8, b: u8, c: u8) -> u8 {
    max(a.unchecked_mul(b), a.unchecked_mul(c))
}

#[no_mangle]
pub unsafe fn tgt3(a: u8, b: u8, c: u8) -> u8 {
    a.unchecked_mul(max(b, c))
}

// * `umax(a *usat b, a *usat c) => a *usat (b umax c)`
#[no_mangle]
pub unsafe fn src4(a: u8, b: u8, c: u8) -> u8 {
    max(a.saturating_mul(b), a.unchecked_mul(c))
}

#[no_mangle]
pub unsafe fn tgt4(a: u8, b: u8, c: u8) -> u8 {
    a.saturating_mul(max(b, c))
}

@jf-botto
Copy link

I'd love to work on this if possible?

@dtcxzyw
Copy link
Member

dtcxzyw commented May 20, 2024

I'd love to work on this if possible?

Sure! Welcome to LLVM!

Please read https://llvm.org/docs/InstCombineContributorGuide.html before submitting your patch :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue https://github.com/llvm/llvm-project/contribute llvm:instcombine missed-optimization
Projects
None yet
Development

No branches or pull requests

4 participants