Skip to content

[SROA] Support load-only promotion with dynamic offset loads #135609

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 17, 2025

Conversation

nikic
Copy link
Contributor

@nikic nikic commented Apr 14, 2025

If we do load-only promotion, it is okay if we leave some loads alone. We only need to know all stores that affect a specific location.

As such, we can handle loads with unknown offset via the "escaped read-only" code path.

This is something we already support in LICM load-only promotion, but doing this in SROA is much better from a phase ordering perspective.

Fixes #134513.

If we do load-only promotion, it is okay if we leave some loads
alone. We only need to know all stores that affect a specific
location.

As such, we can handle loads with unknown offset via the "escaped
read-only" code path.

This is something we already support in LICM load-only promotion,
but doing this in SROA is much better from a phase ordering
perspective.

Fixes llvm#134513.
@llvmbot
Copy link
Member

llvmbot commented Apr 14, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Nikita Popov (nikic)

Changes

If we do load-only promotion, it is okay if we leave some loads alone. We only need to know all stores that affect a specific location.

As such, we can handle loads with unknown offset via the "escaped read-only" code path.

This is something we already support in LICM load-only promotion, but doing this in SROA is much better from a phase ordering perspective.

Fixes #134513.


Full diff: https://github.com/llvm/llvm-project/pull/135609.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/SROA.cpp (+3-1)
  • (modified) llvm/test/Transforms/SROA/readonlynocapture.ll (+6-2)
diff --git a/llvm/lib/Transforms/Scalar/SROA.cpp b/llvm/lib/Transforms/Scalar/SROA.cpp
index 7d49b63a8e4f6..a4e373d395b90 100644
--- a/llvm/lib/Transforms/Scalar/SROA.cpp
+++ b/llvm/lib/Transforms/Scalar/SROA.cpp
@@ -1114,8 +1114,10 @@ class AllocaSlices::SliceBuilder : public PtrUseVisitor<SliceBuilder> {
     assert((!LI.isSimple() || LI.getType()->isSingleValueType()) &&
            "All simple FCA loads should have been pre-split");
 
+    // If there is a load with an unknown offset, we can still perform store
+    // to load forwarding for other known-offset loads.
     if (!IsOffsetKnown)
-      return PI.setAborted(&LI);
+      return PI.setEscapedReadOnly(&LI);
 
     TypeSize Size = DL.getTypeStoreSize(LI.getType());
     if (Size.isScalable())
diff --git a/llvm/test/Transforms/SROA/readonlynocapture.ll b/llvm/test/Transforms/SROA/readonlynocapture.ll
index 90ca72462f7c1..5752fadd76d48 100644
--- a/llvm/test/Transforms/SROA/readonlynocapture.ll
+++ b/llvm/test/Transforms/SROA/readonlynocapture.ll
@@ -501,6 +501,8 @@ define i32 @twoalloc_with_lifetimes() {
 
 declare void @use.i32(i32)
 
+; We can promote the %i load, even though there is an unknown offset load
+; in the loop. It is sufficient that we know all stores.
 define void @load_dyn_offset(ptr %ary) {
 ; CHECK-LABEL: @load_dyn_offset(
 ; CHECK-NEXT:    [[A:%.*]] = alloca { i64, [4 x i32] }, align 8
@@ -509,8 +511,8 @@ define void @load_dyn_offset(ptr %ary) {
 ; CHECK-NEXT:    call void @llvm.memcpy.p0.p0.i64(ptr [[GEP]], ptr [[ARY:%.*]], i64 16, i1 false)
 ; CHECK-NEXT:    br label [[LOOP:%.*]]
 ; CHECK:       loop:
-; CHECK-NEXT:    [[I:%.*]] = load i64, ptr [[A]], align 4
-; CHECK-NEXT:    [[I_NEXT:%.*]] = add i64 [[I]], 1
+; CHECK-NEXT:    [[I:%.*]] = phi i64 [ [[I_NEXT:%.*]], [[LOOP]] ], [ 0, [[TMP0:%.*]] ]
+; CHECK-NEXT:    [[I_NEXT]] = add i64 [[I]], 1
 ; CHECK-NEXT:    store i64 [[I_NEXT]], ptr [[A]], align 4
 ; CHECK-NEXT:    [[GEP_I:%.*]] = getelementptr i32, ptr [[GEP]], i64 [[I]]
 ; CHECK-NEXT:    [[VAL:%.*]] = load i32, ptr [[GEP_I]], align 4
@@ -540,6 +542,8 @@ exit:
   ret void
 }
 
+; Same as previous test, but with an unknown-offset store. We can't promote in
+; that case.
 define void @store_dyn_offset(ptr %ary) {
 ; CHECK-LABEL: @store_dyn_offset(
 ; CHECK-NEXT:    [[A:%.*]] = alloca { i64, [4 x i32] }, align 8

Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Nice catch :)

@nikic nikic merged commit 5c97397 into llvm:main Apr 17, 2025
12 of 13 checks passed
@nikic nikic deleted the sroa-dyn-offset branch April 17, 2025 08:42
var-const pushed a commit to ldionne/llvm-project that referenced this pull request Apr 17, 2025
…5609)

If we do load-only promotion, it is okay if we leave some loads alone.
We only need to know all stores that affect a specific location.

As such, we can handle loads with unknown offset via the "escaped
read-only" code path.

This is something we already support in LICM load-only promotion, but
doing this in SROA is much better from a phase ordering perspective.

Fixes llvm#134513.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Auto-vectorization via masked.load blocks constprop
3 participants