Skip to content

Missed optimization in InstCombine: replace usub with or #56926

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
chfast opened this issue Aug 4, 2022 · 4 comments
Closed

Missed optimization in InstCombine: replace usub with or #56926

chfast opened this issue Aug 4, 2022 · 4 comments

Comments

@chfast
Copy link
Member

chfast commented Aug 4, 2022

When you have usub instructions chained by overflow (borrow) flag (e.g. for big integer "less than") you can replace all except one usubs with or.

define i1 @src(i64 %0, i64 %1, i1 %2) {
  %4 = zext i1 %2 to i64
  %5 = call { i64, i1 } @llvm.usub.with.overflow.i64(i64 %0, i64 %4)
  %6 = extractvalue { i64, i1 } %5, 1
  %7 = zext i1 %6 to i64
  %8 = call { i64, i1 } @llvm.usub.with.overflow.i64(i64 %1, i64 %7)
  %9 = extractvalue { i64, i1 } %8, 1
  ret i1 %9
}

define i1 @tgt(i64 %0, i64 %1, i1 %2) {
  %4 = or i64 %1, %0
  %5 = zext i1 %2 to i64
  %6 = call { i64, i1 } @llvm.usub.with.overflow.i64(i64 %4, i64 %5)
  %7 = extractvalue { i64, i1 } %6, 1
  ret i1 %7
}

declare { i64, i1 } @llvm.usub.with.overflow.i64(i64, i64)

https://alive2.llvm.org/ce/z/yhrBrK

@nikic
Copy link
Contributor

nikic commented Aug 4, 2022

usub.with.overflow where only the overflow result is used should just be replaced with an icmp.

@chfast
Copy link
Member Author

chfast commented Aug 5, 2022

usub.with.overflow where only the overflow result is used should just be replaced with an icmp.

This make sense, although I think this will move us to this bug where subtraction is changed to comparison and in the end the codegen is not able to generate optional sbb instructions: #53432.

@rotateright
Copy link
Contributor

926e731 - both examples should now canonicalize to the same IR:

define i1 @src(i64 %0, i64 %1, i1 %2) {
  %4 = or i64 %1, %0
  %5 = icmp eq i64 %4, 0
  %6 = and i1 %5, %2
  ret i1 %6
}

We should deal with x86 codegen problems independently of that. Repurpose this bug for that or open a new one?

@chfast
Copy link
Member Author

chfast commented Aug 9, 2022

I think this is very good solution for the particular problem. I will play with this once Compiler Explorer is updated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants