[InstCombine] Remove one-use restriction on icmp of gep fold #76730

nikic · 2024-01-02T16:02:18Z

The fold for icmp (gep (p, i1), gep (p, i2)) to icmp (i1, i2) is currently limited to one of the GEPs either having one use or a constant offset. I believe this is to avoid duplicating complex arithmetic both in the GEP and the offset comparison.

This patch instead does the same thing that the indexed compare fold does, which is to rewrite the GEP into i8 form if necessary, so that the offset arithmetic is not repeated after the transform.

I ran into this problem in a case where there are multiple conditions on the same pointer, which prevents them from getting folded.

The fold for icmp (gep (p, i1), gep (p, i2)) to icmp (i1, i2) is currently limited to one of the GEPs either having one use or a constant offset. I believe this is to avoid duplicating complex arithmetic both in the GEP and the offset comparison. This patch instead does the same thing that the indexed compare fold does, which is to rewrite the GEP into i8 form if necessary, so that the offset arithmetic is not repeated after the transform.

llvmbot · 2024-01-02T16:02:44Z

@llvm/pr-subscribers-llvm-transforms

Author: Nikita Popov (nikic)

Changes

The fold for icmp (gep (p, i1), gep (p, i2)) to icmp (i1, i2) is currently limited to one of the GEPs either having one use or a constant offset. I believe this is to avoid duplicating complex arithmetic both in the GEP and the offset comparison.

This patch instead does the same thing that the indexed compare fold does, which is to rewrite the GEP into i8 form if necessary, so that the offset arithmetic is not repeated after the transform.

I ran into this problem in a case where there are multiple conditions on the same pointer, which prevents them from getting folded.

Full diff: https://github.com/llvm/llvm-project/pull/76730.diff

3 Files Affected:

(modified) llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp (+21-7)
(modified) llvm/test/Transforms/InstCombine/icmp-custom-dl.ll (+2-2)
(modified) llvm/test/Transforms/InstCombine/icmp-gep.ll (+10-7)

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index 3875e59c3ede3b..ce928e40cdade0 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -813,14 +813,28 @@ Instruction *InstCombinerImpl::foldGEPICmp(GEPOperator *GEPLHS, Value *RHS,
       }
     }
 
-    // Only lower this if the icmp is the only user of the GEP or if we expect
-    // the result to fold to a constant!
-    if ((GEPsInBounds || CmpInst::isEquality(Cond)) &&
-        (GEPLHS->hasAllConstantIndices() || GEPLHS->hasOneUse()) &&
-        (GEPRHS->hasAllConstantIndices() || GEPRHS->hasOneUse())) {
+    if (GEPsInBounds || CmpInst::isEquality(Cond)) {
+      auto EmitGEPOffsetAndRewrite = [&](GEPOperator *GEP) {
+        auto *Inst = dyn_cast<Instruction>(GEP);
+        if (Inst)
+          Builder.SetInsertPoint(Inst);
+
+        Value *Offset = EmitGEPOffset(GEP);
+        // If a non-trivial GEP has other uses, rewrite it to avoid duplicating
+        // the offset arithmetic.
+        if (Inst && !GEP->hasOneUse() && !GEP->hasAllConstantIndices()) {
+          replaceInstUsesWith(*Inst,
+                              Builder.CreateGEP(Builder.getInt8Ty(),
+                                                GEP->getPointerOperand(),
+                                                Offset, "", GEPsInBounds));
+          eraseInstFromFunction(*Inst);
+        }
+        return Offset;
+      };
+
       // ((gep Ptr, OFFSET1) cmp (gep Ptr, OFFSET2)  --->  (OFFSET1 cmp OFFSET2)
-      Value *L = EmitGEPOffset(GEPLHS);
-      Value *R = EmitGEPOffset(GEPRHS);
+      Value *L = EmitGEPOffsetAndRewrite(GEPLHS);
+      Value *R = EmitGEPOffsetAndRewrite(GEPRHS);
       return new ICmpInst(ICmpInst::getSignedPredicate(Cond), L, R);
     }
   }
diff --git a/llvm/test/Transforms/InstCombine/icmp-custom-dl.ll b/llvm/test/Transforms/InstCombine/icmp-custom-dl.ll
index 330825e1055eaa..f0e8962b1b6a64 100644
--- a/llvm/test/Transforms/InstCombine/icmp-custom-dl.ll
+++ b/llvm/test/Transforms/InstCombine/icmp-custom-dl.ll
@@ -40,8 +40,8 @@ define i1 @test59_as1(ptr addrspace(1) %foo) {
 define i1 @test60(ptr %foo, i64 %i, i64 %j) {
 ; CHECK-LABEL: @test60(
 ; CHECK-NEXT:    [[TMP1:%.*]] = trunc i64 [[I:%.*]] to i32
-; CHECK-NEXT:    [[TMP2:%.*]] = trunc i64 [[J:%.*]] to i32
 ; CHECK-NEXT:    [[GEP1_IDX:%.*]] = shl nsw i32 [[TMP1]], 2
+; CHECK-NEXT:    [[TMP2:%.*]] = trunc i64 [[J:%.*]] to i32
 ; CHECK-NEXT:    [[CMP:%.*]] = icmp slt i32 [[GEP1_IDX]], [[TMP2]]
 ; CHECK-NEXT:    ret i1 [[CMP]]
 ;
@@ -54,8 +54,8 @@ define i1 @test60(ptr %foo, i64 %i, i64 %j) {
 define i1 @test60_as1(ptr addrspace(1) %foo, i64 %i, i64 %j) {
 ; CHECK-LABEL: @test60_as1(
 ; CHECK-NEXT:    [[TMP1:%.*]] = trunc i64 [[I:%.*]] to i16
-; CHECK-NEXT:    [[TMP2:%.*]] = trunc i64 [[J:%.*]] to i16
 ; CHECK-NEXT:    [[GEP1_IDX:%.*]] = shl nsw i16 [[TMP1]], 2
+; CHECK-NEXT:    [[TMP2:%.*]] = trunc i64 [[J:%.*]] to i16
 ; CHECK-NEXT:    [[CMP:%.*]] = icmp slt i16 [[GEP1_IDX]], [[TMP2]]
 ; CHECK-NEXT:    ret i1 [[CMP]]
 ;
diff --git a/llvm/test/Transforms/InstCombine/icmp-gep.ll b/llvm/test/Transforms/InstCombine/icmp-gep.ll
index fb297e04ceb065..42e15e78975e51 100644
--- a/llvm/test/Transforms/InstCombine/icmp-gep.ll
+++ b/llvm/test/Transforms/InstCombine/icmp-gep.ll
@@ -313,8 +313,8 @@ define i1 @test_gep_eq_no_inbounds(ptr %foo, i64 %i, i64 %j) {
 define i1 @test60_as1(ptr addrspace(1) %foo, i64 %i, i64 %j) {
 ; CHECK-LABEL: @test60_as1(
 ; CHECK-NEXT:    [[TMP1:%.*]] = trunc i64 [[I:%.*]] to i16
-; CHECK-NEXT:    [[TMP2:%.*]] = trunc i64 [[J:%.*]] to i16
 ; CHECK-NEXT:    [[GEP1_IDX:%.*]] = shl nsw i16 [[TMP1]], 2
+; CHECK-NEXT:    [[TMP2:%.*]] = trunc i64 [[J:%.*]] to i16
 ; CHECK-NEXT:    [[CMP:%.*]] = icmp slt i16 [[GEP1_IDX]], [[TMP2]]
 ; CHECK-NEXT:    ret i1 [[CMP]]
 ;
@@ -400,11 +400,13 @@ define i1 @test61_as1(ptr addrspace(1) %foo, i16 %i, i16 %j) {
 
 define i1 @test60_extra_use(ptr %foo, i64 %i, i64 %j) {
 ; CHECK-LABEL: @test60_extra_use(
-; CHECK-NEXT:    [[GEP1:%.*]] = getelementptr inbounds i32, ptr [[FOO:%.*]], i64 [[I:%.*]]
-; CHECK-NEXT:    [[GEP2:%.*]] = getelementptr inbounds i16, ptr [[FOO]], i64 [[J:%.*]]
+; CHECK-NEXT:    [[GEP1_IDX:%.*]] = shl nsw i64 [[I:%.*]], 2
+; CHECK-NEXT:    [[GEP1:%.*]] = getelementptr inbounds i8, ptr [[FOO:%.*]], i64 [[GEP1_IDX]]
+; CHECK-NEXT:    [[GEP2_IDX:%.*]] = shl nsw i64 [[J:%.*]], 1
+; CHECK-NEXT:    [[GEP2:%.*]] = getelementptr inbounds i8, ptr [[FOO]], i64 [[GEP2_IDX]]
 ; CHECK-NEXT:    call void @use(ptr [[GEP1]])
 ; CHECK-NEXT:    call void @use(ptr [[GEP2]])
-; CHECK-NEXT:    [[CMP:%.*]] = icmp ult ptr [[GEP1]], [[GEP2]]
+; CHECK-NEXT:    [[CMP:%.*]] = icmp slt i64 [[GEP1_IDX]], [[GEP2_IDX]]
 ; CHECK-NEXT:    ret i1 [[CMP]]
 ;
   %gep1 = getelementptr inbounds i32, ptr %foo, i64 %i
@@ -446,13 +448,14 @@ define i1 @test60_extra_use_const_operands_no_inbounds(ptr %foo, i64 %i, i64 %j)
 
 define void @test60_extra_use_fold(ptr %foo, i64 %start.idx, i64 %end.offset) {
 ; CHECK-LABEL: @test60_extra_use_fold(
-; CHECK-NEXT:    [[GEP1:%.*]] = getelementptr inbounds i32, ptr [[FOO:%.*]], i64 [[START_IDX:%.*]]
+; CHECK-NEXT:    [[GEP1_IDX:%.*]] = shl nsw i64 [[START_IDX:%.*]], 2
+; CHECK-NEXT:    [[GEP1:%.*]] = getelementptr inbounds i8, ptr [[FOO:%.*]], i64 [[GEP1_IDX]]
 ; CHECK-NEXT:    [[GEP2:%.*]] = getelementptr inbounds i8, ptr [[FOO]], i64 [[END_OFFSET:%.*]]
 ; CHECK-NEXT:    call void @use(ptr [[GEP1]])
 ; CHECK-NEXT:    call void @use(ptr [[GEP2]])
-; CHECK-NEXT:    [[CMP1:%.*]] = icmp eq ptr [[GEP1]], [[GEP2]]
+; CHECK-NEXT:    [[CMP1:%.*]] = icmp eq i64 [[GEP1_IDX]], [[END_OFFSET]]
 ; CHECK-NEXT:    call void @use.i1(i1 [[CMP1]])
-; CHECK-NEXT:    [[CMP2:%.*]] = icmp ult ptr [[GEP1]], [[GEP2]]
+; CHECK-NEXT:    [[CMP2:%.*]] = icmp slt i64 [[GEP1_IDX]], [[END_OFFSET]]
 ; CHECK-NEXT:    call void @use.i1(i1 [[CMP2]])
 ; CHECK-NEXT:    ret void
 ;

PR Link: llvm/llvm-project#76730

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp

dtcxzyw

LGTM. Thanks!

dtcxzyw · 2024-02-05T18:48:02Z

Ping.

nikic · 2024-02-09T20:30:45Z

Huh, this change had a suspiciously large impact on clang/Driver/Toolchains/: https://llvm-compile-time-tracker.com/compare_clang.php?from=b5abaea3c0de605c8145035b21a5ee492883ebd7&to=b1b8a383fcdab007ccd1a5daa08cb33ce7cbc6c0&stat=instructions%3Au&sortBy=interestingness

nikic · 2024-02-13T15:23:11Z

Looks like @_ZNK4llvm3opt7ArgList10getLastArgIJN5clang6driver7options2IDEEEEPNS0_3ArgEDpT_ no longer gets inlined. Probably the different representation pushed this just over an inlining threshold. As inlining this function probably isn't actually worthwhile, this seems fine.

The final version of the function is getting additional simplification relative to the baseline -- one of the pointer comparisons now gets optimized away.

nikic requested a review from dtcxzyw January 2, 2024 16:02

llvmbot added the llvm:transforms label Jan 2, 2024

dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Jan 2, 2024

pre-commit: test PR76730

b004749

PR Link: llvm/llvm-project#76730

dtcxzyw mentioned this pull request Jan 2, 2024

pre-commit: test PR76730 dtcxzyw/llvm-opt-benchmark#87

Closed

dtcxzyw reviewed Jan 2, 2024

View reviewed changes

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp Show resolved Hide resolved

Don't recreate GEP if it already has i8 type

efe5e2d

dtcxzyw reviewed Jan 3, 2024

View reviewed changes

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp Show resolved Hide resolved

add insert point guard

a8994d2

dtcxzyw approved these changes Jan 3, 2024

View reviewed changes

nikic merged commit b1b8a38 into llvm:main Feb 9, 2024

nikic deleted the instcombine-icmp-gep branch February 9, 2024 14:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[InstCombine] Remove one-use restriction on icmp of gep fold #76730

[InstCombine] Remove one-use restriction on icmp of gep fold #76730

nikic commented Jan 2, 2024

llvmbot commented Jan 2, 2024

dtcxzyw left a comment

dtcxzyw commented Feb 5, 2024

nikic commented Feb 9, 2024

nikic commented Feb 13, 2024

[InstCombine] Remove one-use restriction on icmp of gep fold #76730

[InstCombine] Remove one-use restriction on icmp of gep fold #76730

Conversation

nikic commented Jan 2, 2024

llvmbot commented Jan 2, 2024

dtcxzyw left a comment

Choose a reason for hiding this comment

dtcxzyw commented Feb 5, 2024

nikic commented Feb 9, 2024

nikic commented Feb 13, 2024