[BasicAA] Do not decompose past casts with different index width #119365
Conversation
BasicAA currently tries to support addrspacecasts that change the index width by performing the decomposition in the maximum of all index widths and then trying to fix this up with in-place sign extends to get correct overflow behavior if the actual index width is smaller.

However, even in the case where we don't mix different index widths and just have an index width that is smaller than the maximum, the behavior is incorrect (see test), because we only perform the index width adjustment during decomposition and not any of the later logic -- and we don't do anything at all for variable offsets. I'm sure that the case where we actually mix different index widths is even more broken than that.

Fix this by not allowing decomposition through index width changes. If the pointers have different index widths, fall back to a base object comparison, ignoring the offsets.
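For context, DecomposeGEPExpression models a pointer roughly as

  Ptr = Base + Offset + sum_i (Scale_i * Idx_i)

where Offset and the Scale_i are APInts. Before this patch their bit width was the maximum index width over all address spaces (DL.getMaxIndexSizeInBits()); after it, it is the index width of the pointer being decomposed. The later offset reasoning in aliasGEP happens in that fixed width, i.e. modulo 2^width, which is why doing the arithmetic in a wider-than-actual width gets the wrap-around behavior wrong.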
@llvm/pr-subscribers-backend-systemz @llvm/pr-subscribers-llvm-analysis

Author: Nikita Popov (nikic)

Full diff: https://github.com/llvm/llvm-project/pull/119365.diff (2 Files Affected)
diff --git a/llvm/lib/Analysis/BasicAliasAnalysis.cpp b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
index dc54d566e55b12..b2a3f3390e000d 100644
--- a/llvm/lib/Analysis/BasicAliasAnalysis.cpp
+++ b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
@@ -494,20 +494,6 @@ static LinearExpression GetLinearExpression(
return Val;
}
-/// To ensure a pointer offset fits in an integer of size IndexSize
-/// (in bits) when that size is smaller than the maximum index size. This is
-/// an issue, for example, in particular for 32b pointers with negative indices
-/// that rely on two's complement wrap-arounds for precise alias information
-/// where the maximum index size is 64b.
-static void adjustToIndexSize(APInt &Offset, unsigned IndexSize) {
- assert(IndexSize <= Offset.getBitWidth() && "Invalid IndexSize!");
- unsigned ShiftBits = Offset.getBitWidth() - IndexSize;
- if (ShiftBits != 0) {
- Offset <<= ShiftBits;
- Offset.ashrInPlace(ShiftBits);
- }
-}
-
namespace {
// A linear transformation of a Value; this class represents
// ZExt(SExt(Trunc(V, TruncBits), SExtBits), ZExtBits) * Scale.
@@ -594,9 +580,9 @@ BasicAAResult::DecomposeGEPExpression(const Value *V, const DataLayout &DL,
SearchTimes++;
const Instruction *CxtI = dyn_cast<Instruction>(V);
- unsigned MaxIndexSize = DL.getMaxIndexSizeInBits();
+ unsigned IndexSize = DL.getIndexTypeSizeInBits(V->getType());
DecomposedGEP Decomposed;
- Decomposed.Offset = APInt(MaxIndexSize, 0);
+ Decomposed.Offset = APInt(IndexSize, 0);
do {
// See if this is a bitcast or GEP.
const Operator *Op = dyn_cast<Operator>(V);
@@ -614,7 +600,14 @@ BasicAAResult::DecomposeGEPExpression(const Value *V, const DataLayout &DL,
if (Op->getOpcode() == Instruction::BitCast ||
Op->getOpcode() == Instruction::AddrSpaceCast) {
- V = Op->getOperand(0);
+ Value *NewV = Op->getOperand(0);
+ // Don't look through casts between address spaces with differing index
+ // widths.
+ if (DL.getIndexTypeSizeInBits(NewV->getType()) != IndexSize) {
+ Decomposed.Base = V;
+ return Decomposed;
+ }
+ V = NewV;
continue;
}
@@ -651,12 +644,8 @@ BasicAAResult::DecomposeGEPExpression(const Value *V, const DataLayout &DL,
assert(GEPOp->getSourceElementType()->isSized() && "GEP must be sized");
- unsigned AS = GEPOp->getPointerAddressSpace();
// Walk the indices of the GEP, accumulating them into BaseOff/VarIndices.
gep_type_iterator GTI = gep_type_begin(GEPOp);
- unsigned IndexSize = DL.getIndexSizeInBits(AS);
- // Assume all GEP operands are constants until proven otherwise.
- bool GepHasConstantOffset = true;
for (User::const_op_iterator I = GEPOp->op_begin() + 1, E = GEPOp->op_end();
I != E; ++I, ++GTI) {
const Value *Index = *I;
@@ -684,7 +673,7 @@ BasicAAResult::DecomposeGEPExpression(const Value *V, const DataLayout &DL,
}
Decomposed.Offset += AllocTypeSize.getFixedValue() *
- CIdx->getValue().sextOrTrunc(MaxIndexSize);
+ CIdx->getValue().sextOrTrunc(IndexSize);
continue;
}
@@ -694,8 +683,6 @@ BasicAAResult::DecomposeGEPExpression(const Value *V, const DataLayout &DL,
return Decomposed;
}
- GepHasConstantOffset = false;
-
// If the integer type is smaller than the index size, it is implicitly
// sign extended or truncated to index size.
bool NUSW = GEPOp->hasNoUnsignedSignedWrap();
@@ -710,8 +697,8 @@ BasicAAResult::DecomposeGEPExpression(const Value *V, const DataLayout &DL,
// Scale by the type size.
unsigned TypeSize = AllocTypeSize.getFixedValue();
LE = LE.mul(APInt(IndexSize, TypeSize), NUW, NUSW);
- Decomposed.Offset += LE.Offset.sext(MaxIndexSize);
- APInt Scale = LE.Scale.sext(MaxIndexSize);
+ Decomposed.Offset += LE.Offset;
+ APInt Scale = LE.Scale;
if (!LE.IsNUW)
Decomposed.NWFlags = Decomposed.NWFlags.withoutNoUnsignedWrap();
@@ -731,10 +718,6 @@ BasicAAResult::DecomposeGEPExpression(const Value *V, const DataLayout &DL,
}
}
- // Make sure that we have a scale that makes sense for this target's
- // index size.
- adjustToIndexSize(Scale, IndexSize);
-
if (!!Scale) {
VariableGEPIndex Entry = {LE.Val, Scale, CxtI, LE.IsNSW,
/* IsNegated */ false};
@@ -742,10 +725,6 @@ BasicAAResult::DecomposeGEPExpression(const Value *V, const DataLayout &DL,
}
}
- // Take care of wrap-arounds
- if (GepHasConstantOffset)
- adjustToIndexSize(Decomposed.Offset, IndexSize);
-
// Analyze the base pointer next.
V = GEPOp->getOperand(0);
} while (--MaxLookup);
@@ -1084,6 +1063,14 @@ AliasResult BasicAAResult::aliasGEP(
const GEPOperator *GEP1, LocationSize V1Size,
const Value *V2, LocationSize V2Size,
const Value *UnderlyingV1, const Value *UnderlyingV2, AAQueryInfo &AAQI) {
+ auto BaseObjectsAlias = [&]() {
+ AliasResult BaseAlias =
+ AAQI.AAR.alias(MemoryLocation::getBeforeOrAfter(UnderlyingV1),
+ MemoryLocation::getBeforeOrAfter(UnderlyingV2), AAQI);
+ return BaseAlias == AliasResult::NoAlias ? AliasResult::NoAlias
+ : AliasResult::MayAlias;
+ };
+
if (!V1Size.hasValue() && !V2Size.hasValue()) {
// TODO: This limitation exists for compile-time reasons. Relax it if we
// can avoid exponential pathological cases.
@@ -1092,11 +1079,7 @@ AliasResult BasicAAResult::aliasGEP(
// If both accesses have unknown size, we can only check whether the base
// objects don't alias.
- AliasResult BaseAlias =
- AAQI.AAR.alias(MemoryLocation::getBeforeOrAfter(UnderlyingV1),
- MemoryLocation::getBeforeOrAfter(UnderlyingV2), AAQI);
- return BaseAlias == AliasResult::NoAlias ? AliasResult::NoAlias
- : AliasResult::MayAlias;
+ return BaseObjectsAlias();
}
DominatorTree *DT = getDT(AAQI);
@@ -1107,6 +1090,10 @@ AliasResult BasicAAResult::aliasGEP(
if (DecompGEP1.Base == GEP1 && DecompGEP2.Base == V2)
return AliasResult::MayAlias;
+ // Fall back to base objects if pointers have different index widths.
+ if (DecompGEP1.Offset.getBitWidth() != DecompGEP2.Offset.getBitWidth())
+ return BaseObjectsAlias();
+
// Swap GEP1 and GEP2 if GEP2 has more variable indices.
if (DecompGEP1.VarIndices.size() < DecompGEP2.VarIndices.size()) {
std::swap(DecompGEP1, DecompGEP2);
diff --git a/llvm/test/Analysis/BasicAA/smaller-index-size-overflow.ll b/llvm/test/Analysis/BasicAA/smaller-index-size-overflow.ll
index a913a4a9e4b1c8..fc91b577a56cc3 100644
--- a/llvm/test/Analysis/BasicAA/smaller-index-size-overflow.ll
+++ b/llvm/test/Analysis/BasicAA/smaller-index-size-overflow.ll
@@ -2,8 +2,7 @@
target datalayout = "p1:32:32"
-; FIXME: This is a miscompile.
-; CHECK: NoAlias: i32 addrspace(1)* %gep1, i32 addrspace(1)* %gep2
+; CHECK: PartialAlias: i32 addrspace(1)* %gep1, i32 addrspace(1)* %gep2
define void @test(ptr addrspace(1) %p) {
%gep1 = getelementptr i8, ptr addrspace(1) %p, i32 u0x7fffffff
%gep2 = getelementptr i8, ptr addrspace(1) %p, i32 u0x80000001
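To spell out the test change: in the 32-bit index space of addrspace(1), the two offsets differ by 0x80000001 - 0x7fffffff = 2 (mod 2^32), so the two 4-byte loads overlap by two bytes and PartialAlias is the correct answer. The old code sign-extended both i32 indices into the 64-bit maximum index width, giving +0x7fffffff and -0x7fffffff; their 64-bit difference of 0xfffffffe never wraps, so BasicAA concluded the accesses were almost 4 GiB apart and answered NoAlias -- the miscompile the FIXME pointed at.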
@redstar looks like this disables an optimization on z/OS with mixed pointer widths. Please have a look at whether this is relevant.
Review comment on the line: unsigned MaxIndexSize = DL.getMaxIndexSizeInBits();
looks like getMaxIndexSizeInBits() is only used here
Dropped the API in 07aab4a.
The last use was removed in #119365, and we should not add more uses of this concept in the future either.
I'm investigating various regressions we've seen downstream after this patch. I reduced one test down to the IR at https://godbolt.org/z/E4caeWcYd.
We used to get NoAlias. For our downstream target we kind of know that different address spaces never alias, but unfortunately I don't think BasicAA is aware of such things. Still a bit unsure if those pointers should result in MayAlias now. Maybe we were just lucky in the past that BasicAA returned NoAlias. After this patch we take the early exit for "Fall back to base objects if pointers have different index widths.", while we used to derive NoAlias when checking "if an inbounds GEP would have to start from an out of bounds address for the two to alias, then we can assume noalias." I guess we do not really want to check TargetTransformInfo::addrspacesMayAlias inside BasicAA, which would be a simple way to determine if accesses to different address spaces may alias or not. So I'm a bit unsure what we can do.
@bjope I believe the standard solution to that is to add your own AA pass; see e.g. AMDGPUAAResult and NVPTXAAResult, which both do addrspace-based AA.
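For reference, a minimal sketch of what such a pass can look like, loosely modeled on AMDGPUAAResult; the class name and the "distinct non-generic address spaces never overlap" rule are illustrative assumptions for a hypothetical target, not an existing API:

#include "llvm/Analysis/AliasAnalysis.h"

using namespace llvm;

// Hypothetical target AA result. Only the address-space rule below is
// target specific; everything else defers to the rest of the AA stack.
class MyTargetAAResult : public AAResultBase {
public:
  AliasResult alias(const MemoryLocation &LocA, const MemoryLocation &LocB,
                    AAQueryInfo &AAQI, const Instruction *CtxI) {
    unsigned ASA = LocA.Ptr->getType()->getPointerAddressSpace();
    unsigned ASB = LocB.Ptr->getType()->getPointerAddressSpace();
    // Assumed target property: two distinct non-generic address spaces
    // are disjoint, so accesses to them can never overlap.
    if (ASA != ASB && ASA != 0 && ASB != 0)
      return AliasResult::NoAlias;
    // Otherwise let the other AA implementations decide (MayAlias here).
    return AAResultBase::alias(LocA, LocB, AAQI, CtxI);
  }
};

The result then gets registered ahead of BasicAA (e.g. via TargetMachine::registerDefaultAliasAnalyses or an ExternalAAWrapper, as AMDGPU and NVPTX do), so its NoAlias answers short-circuit the query.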
But isn't it still a bit weird that if I change the datalayout to set the same pointer size for the two address spaces, then BasicAA is able to derive that there is NoAlias (https://godbolt.org/z/vhcG4jx7a), and the same goes even if using the same address space for both accesses (https://godbolt.org/z/GoEM7zx68), while it can't derive that when the pointer sizes are different? So this patch kind of messed things up for targets that have multiple address spaces with different pointer sizes, since the early out for the index size mismatch makes it skip some checks that could help identify NoAlias.
It's expected that there are regressions if you are mixing pointers with different index widths. I'm open to ideas, but I don't see any obvious way we can fully support pointers with different index widths while also being correct. A key thing that BasicAA needs is the ability to subtract two GEPs, and I'm not even sure what that is supposed to mean when different index widths are involved (it would be easy if we didn't have to care about overflow, but unfortunately this is something BasicAA must model accurately).

It would be easy to support your specific example with a patch like the following, effectively modelling this as a failure to subtract the GEPs. Does that help you in practice? I don't know if your example was over-reduced.

From a0ebbaf72745218f89e45e8e7cc4365773673b05 Mon Sep 17 00:00:00 2001
From: Nikita Popov <npopov@redhat.com>
Date: Thu, 9 Jan 2025 10:09:28 +0100
Subject: [PATCH] gep diff
---
.../llvm/Analysis/BasicAliasAnalysis.h | 2 +-
llvm/lib/Analysis/BasicAliasAnalysis.cpp | 19 ++++++++++++-------
2 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/llvm/include/llvm/Analysis/BasicAliasAnalysis.h b/llvm/include/llvm/Analysis/BasicAliasAnalysis.h
index 7eca82729430..b81324f8da46 100644
--- a/llvm/include/llvm/Analysis/BasicAliasAnalysis.h
+++ b/llvm/include/llvm/Analysis/BasicAliasAnalysis.h
@@ -122,7 +122,7 @@ private:
bool isValueEqualInPotentialCycles(const Value *V1, const Value *V2,
const AAQueryInfo &AAQI);
- void subtractDecomposedGEPs(DecomposedGEP &DestGEP,
+ bool subtractDecomposedGEPs(DecomposedGEP &DestGEP,
const DecomposedGEP &SrcGEP,
const AAQueryInfo &AAQI);
diff --git a/llvm/lib/Analysis/BasicAliasAnalysis.cpp b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
index b2a3f3390e00..70b173a3785b 100644
--- a/llvm/lib/Analysis/BasicAliasAnalysis.cpp
+++ b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
@@ -1090,10 +1090,6 @@ AliasResult BasicAAResult::aliasGEP(
if (DecompGEP1.Base == GEP1 && DecompGEP2.Base == V2)
return AliasResult::MayAlias;
- // Fall back to base objects if pointers have different index widths.
- if (DecompGEP1.Offset.getBitWidth() != DecompGEP2.Offset.getBitWidth())
- return BaseObjectsAlias();
-
// Swap GEP1 and GEP2 if GEP2 has more variable indices.
if (DecompGEP1.VarIndices.size() < DecompGEP2.VarIndices.size()) {
std::swap(DecompGEP1, DecompGEP2);
@@ -1102,8 +1098,9 @@ AliasResult BasicAAResult::aliasGEP(
}
// Subtract the GEP2 pointer from the GEP1 pointer to find out their
- // symbolic difference.
- subtractDecomposedGEPs(DecompGEP1, DecompGEP2, AAQI);
+ // symbolic difference. If we can't, fall back to the base object check.
+ if (!subtractDecomposedGEPs(DecompGEP1, DecompGEP2, AAQI))
+ return BaseObjectsAlias();
// If an inbounds GEP would have to start from an out of bounds address
// for the two to alias, then we can assume noalias.
@@ -1843,9 +1840,15 @@ bool BasicAAResult::isValueEqualInPotentialCycles(const Value *V,
}
/// Computes the symbolic difference between two de-composed GEPs.
-void BasicAAResult::subtractDecomposedGEPs(DecomposedGEP &DestGEP,
+bool BasicAAResult::subtractDecomposedGEPs(DecomposedGEP &DestGEP,
const DecomposedGEP &SrcGEP,
const AAQueryInfo &AAQI) {
+ // We do not know how to correctly subtract GEPs with different index widths.
+ // Only handle the case where the RHS is a trivial GEP without indices or
+ // offsets, as there is nothing to subtract in that case.
+ if (DestGEP.Offset.getBitWidth() != SrcGEP.Offset.getBitWidth())
+ return SrcGEP.VarIndices.empty() && SrcGEP.Offset.isZero();
+
// Drop nuw flag from GEP if subtraction of constant offsets overflows in an
// unsigned sense.
if (DestGEP.Offset.ult(SrcGEP.Offset))
@@ -1897,6 +1900,8 @@ void BasicAAResult::subtractDecomposedGEPs(DecomposedGEP &DestGEP,
DestGEP.NWFlags = DestGEP.NWFlags.withoutNoUnsignedWrap();
}
}
+
+ return true;
}
bool BasicAAResult::constantOffsetHeuristic(const DecomposedGEP &GEP,
--
2.47.1
Ok. Thanks for the quick feedback and the clarification that regressions can be expected when having different pointer sizes (or rather different index widths). Btw, I also think the gep with "ptr null" is probably the result of over-reducing the original program. I will need to continue analyzing our regressions, but wanted some feedback on this smaller IR snippet first. I will also look into adding a target-specific AA similar to AMDGPUAAResult and NVPTXAAResult for addrspace-based AA; I think that could help eliminate many of the regressions/diffs that I've seen, making it easier to focus on whatever remains. So thanks a lot for those code pointers!
I'm seeing the opposite problem to @bjope: a previous MayAlias result has now become NoAlias for the following IR:

target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9"
target triple = "amdgcn-amd-amdhsa"
define amdgpu_kernel void @foo(ptr addrspace(3) noundef align 4 %arg) {
entry:
%1 = load i32, ptr addrspace(3) %arg, align 4
%add.ptr.i.i = getelementptr inbounds nuw i8, ptr addrspace(3) %arg, i64 4
%2 = addrspacecast ptr addrspace(3) %add.ptr.i.i to ptr
%incdec.ptr.i = getelementptr inbounds i8, ptr %2, i64 -4
%3 = load i32, ptr %incdec.ptr.i, align 4
ret void
}

(changing …) Is this intended behaviour? I can understand relying on target-specific AA results for optimization purposes, but I still expected a conservative result of MayAlias.

EDIT: I just noticed that removing …
@frasercrmck BasicAA returns MayAlias for your example. Presumably AMDGPUAA returns NoAlias because addrspace(3) and addrspace(0) can't alias, or something like that.
Yep, you're right, sorry for the false alarm: it's AMDGPUAA giving NoAlias here, not a change in BasicAA.
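For anyone else hitting this, the aggregation logic explains the masking: AAResults::alias queries each registered AA in order and returns the first answer that is not MayAlias. A rough paraphrase of lib/Analysis/AliasAnalysis.cpp (simplified, not the verbatim source):

AliasResult AAResults::alias(const MemoryLocation &LocA,
                             const MemoryLocation &LocB, AAQueryInfo &AAQI,
                             const Instruction *CtxI) {
  for (const auto &AA : AAs) {
    AliasResult Result = AA->alias(LocA, LocB, AAQI, CtxI);
    // The first decisive (non-MayAlias) answer wins.
    if (Result != AliasResult::MayAlias)
      return Result;
  }
  return AliasResult::MayAlias;
}

So a NoAlias from a target AA hides whatever BasicAA would have said for the same query.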
Hi there, it seems like after this pull request, the following C code will crash Clang/LLVM if LLVM assertions are enabled (only on 32-bit targets). Godbolt link: https://godbolt.org/z/91bqnYsb4

#include <stdint.h>
#include <stddef.h>
#if SIZE_MAX != UINT32_MAX
#error "This reproducer only works on 32-bit targets"
#endif
void repro(unsigned char** arg0) {
#pragma clang loop unroll(disable)
for (size_t i = 0; i != INT32_MAX / 2 + 1; i++) {
*arg0[i] = 0;
arg0[i] = 0;
}
}

Reduced LLVM IR that can trigger the issue: https://godbolt.org/z/aj364W849
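A plausible reading of why this trips an assertion, consistent with the fix below: the loop writes through arg0[0 .. 2^30 - 1] (INT32_MAX / 2 + 1 = 2^30 iterations), and with 4-byte pointers on a 32-bit target the accessed range is 2^30 * 4 = 2^32 bytes. A LocationSize that large needs 33 bits, i.e. it is wider than the 32-bit index space of the pointer type, and feeding it into the index-width APInt arithmetic asserts.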
If the LocationSize is larger than the index space of the pointer type, bail out instead of triggering an APInt assertion. Fixes the issue reported at #119365 (comment).