-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[InlineCost] Consider the default branch when calculating cost #77856
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-ir @llvm/pr-subscribers-llvm-analysis Author: Quentin Dian (DianQK) ChangesFirst step in fixing #76772. This PR considers the default branch as a case branch. This will give the unreachable default branch fair consideration. Patch is 23.41 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/77856.diff 5 Files Affected:
diff --git a/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h b/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
index 60eab53fa2f601..1bf9cec8a2c40b 100644
--- a/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
+++ b/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
@@ -65,7 +65,9 @@ class TargetTransformInfoImplBase {
(void)PSI;
(void)BFI;
JTSize = 0;
- return SI.getNumCases();
+ bool HasDefault =
+ !isa<UnreachableInst>(SI.getDefaultDest()->getFirstNonPHIOrDbg());
+ return SI.getNumCases() + HasDefault;
}
unsigned getInliningThresholdMultiplier() const { return 1; }
diff --git a/llvm/include/llvm/CodeGen/BasicTTIImpl.h b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
index 5e7bdcdf72a49f..45849869be4a65 100644
--- a/llvm/include/llvm/CodeGen/BasicTTIImpl.h
+++ b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
@@ -446,6 +446,8 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
/// inline cost heuristic, but it's a generic cost model to be used in other
/// places (e.g., in loop unrolling).
unsigned N = SI.getNumCases();
+ bool HasDefault =
+ !isa<UnreachableInst>(SI.getDefaultDest()->getFirstNonPHIOrDbg());
const TargetLoweringBase *TLI = getTLI();
const DataLayout &DL = this->getDataLayout();
@@ -454,7 +456,7 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
// Early exit if both a jump table and bit test are not allowed.
if (N < 1 || (!IsJTAllowed && DL.getIndexSizeInBits(0u) < N))
- return N;
+ return N + HasDefault;
APInt MaxCaseVal = SI.case_begin()->getCaseValue()->getValue();
APInt MinCaseVal = MaxCaseVal;
@@ -474,23 +476,23 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
if (TLI->isSuitableForBitTests(Dests.size(), N, MinCaseVal, MaxCaseVal,
DL))
- return 1;
+ return 1 + HasDefault;
}
// Check if suitable for a jump table.
if (IsJTAllowed) {
if (N < 2 || N < TLI->getMinimumJumpTableEntries())
- return N;
+ return N + HasDefault;
uint64_t Range =
(MaxCaseVal - MinCaseVal)
.getLimitedValue(std::numeric_limits<uint64_t>::max() - 1) + 1;
// Check whether a range of clusters is dense enough for a jump table
if (TLI->isSuitableForJumpTable(&SI, N, Range, PSI, BFI)) {
JumpTableSize = Range;
- return 1;
+ return 1 + HasDefault;
}
}
- return N;
+ return N + HasDefault;
}
bool shouldBuildLookupTables() {
diff --git a/llvm/lib/Analysis/InlineCost.cpp b/llvm/lib/Analysis/InlineCost.cpp
index 1fa7badaa4fa01..7b9e6e02f7b093 100644
--- a/llvm/lib/Analysis/InlineCost.cpp
+++ b/llvm/lib/Analysis/InlineCost.cpp
@@ -707,8 +707,9 @@ class InlineCostCallAnalyzer final : public CallAnalyzer {
if (JumpTableSize) {
int64_t JTCost =
static_cast<int64_t>(JumpTableSize) * InstrCost + 4 * InstrCost;
-
addCost(JTCost);
+ if (NumCaseCluster > 1)
+ addCost((NumCaseCluster - 1) * 2 * InstrCost);
return;
}
@@ -1238,6 +1239,9 @@ class InlineCostFeaturesAnalyzer final : public CallAnalyzer {
int64_t JTCost = static_cast<int64_t>(JumpTableSize) * InstrCost +
JTCostMultiplier * InstrCost;
increment(InlineCostFeatureIndex::jump_table_penalty, JTCost);
+ if (NumCaseCluster > 1)
+ increment(InlineCostFeatureIndex::case_cluster_penalty,
+ (NumCaseCluster - 1) * CaseClusterCostMultiplier * InstrCost);
return;
}
diff --git a/llvm/test/Transforms/Inline/inline-switch-default-2.ll b/llvm/test/Transforms/Inline/inline-switch-default-2.ll
new file mode 100644
index 00000000000000..a784bd8d8613ab
--- /dev/null
+++ b/llvm/test/Transforms/Inline/inline-switch-default-2.ll
@@ -0,0 +1,317 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt %s -S -passes=inline -inline-threshold=21 | FileCheck %s
+
+; Check for scenarios without TTI.
+
+define i64 @foo1(i64 %0) {
+; LOOKUPTABLE-LABEL: define i64 @foo1(
+; LOOKUPTABLE-SAME: i64 [[TMP0:%.*]]) {
+; LOOKUPTABLE-NEXT: switch i64 [[TMP0]], label [[DEFAULT_BRANCH_I:%.*]] [
+; LOOKUPTABLE-NEXT: i64 0, label [[BRANCH_0_I:%.*]]
+; LOOKUPTABLE-NEXT: i64 2, label [[BRANCH_2_I:%.*]]
+; LOOKUPTABLE-NEXT: i64 4, label [[BRANCH_4_I:%.*]]
+; LOOKUPTABLE-NEXT: i64 6, label [[BRANCH_6_I:%.*]]
+; LOOKUPTABLE-NEXT: ]
+; LOOKUPTABLE: branch_0.i:
+; LOOKUPTABLE-NEXT: br label [[BAR1_EXIT:%.*]]
+; LOOKUPTABLE: branch_2.i:
+; LOOKUPTABLE-NEXT: br label [[BAR1_EXIT]]
+; LOOKUPTABLE: branch_4.i:
+; LOOKUPTABLE-NEXT: br label [[BAR1_EXIT]]
+; LOOKUPTABLE: branch_6.i:
+; LOOKUPTABLE-NEXT: br label [[BAR1_EXIT]]
+; LOOKUPTABLE: default_branch.i:
+; LOOKUPTABLE-NEXT: br label [[BAR1_EXIT]]
+; LOOKUPTABLE: bar1.exit:
+; LOOKUPTABLE-NEXT: [[TMP2:%.*]] = phi i64 [ 5, [[BRANCH_0_I]] ], [ 9, [[BRANCH_2_I]] ], [ 2, [[BRANCH_4_I]] ], [ 7, [[BRANCH_6_I]] ], [ 3, [[DEFAULT_BRANCH_I]] ]
+; LOOKUPTABLE-NEXT: ret i64 [[TMP2]]
+;
+; SWITCH-LABEL: define i64 @foo1(
+; SWITCH-SAME: i64 [[TMP0:%.*]]) {
+; SWITCH-NEXT: switch i64 [[TMP0]], label [[DEFAULT_BRANCH_I:%.*]] [
+; SWITCH-NEXT: i64 0, label [[BRANCH_0_I:%.*]]
+; SWITCH-NEXT: i64 2, label [[BRANCH_2_I:%.*]]
+; SWITCH-NEXT: i64 4, label [[BRANCH_4_I:%.*]]
+; SWITCH-NEXT: i64 6, label [[BRANCH_6_I:%.*]]
+; SWITCH-NEXT: ]
+; SWITCH: branch_0.i:
+; SWITCH-NEXT: br label [[BAR1_EXIT:%.*]]
+; SWITCH: branch_2.i:
+; SWITCH-NEXT: br label [[BAR1_EXIT]]
+; SWITCH: branch_4.i:
+; SWITCH-NEXT: br label [[BAR1_EXIT]]
+; SWITCH: branch_6.i:
+; SWITCH-NEXT: br label [[BAR1_EXIT]]
+; SWITCH: default_branch.i:
+; SWITCH-NEXT: br label [[BAR1_EXIT]]
+; SWITCH: bar1.exit:
+; SWITCH-NEXT: [[TMP2:%.*]] = phi i64 [ 5, [[BRANCH_0_I]] ], [ 9, [[BRANCH_2_I]] ], [ 2, [[BRANCH_4_I]] ], [ 7, [[BRANCH_6_I]] ], [ 3, [[DEFAULT_BRANCH_I]] ]
+; SWITCH-NEXT: ret i64 [[TMP2]]
+;
+; CHECK-LABEL: define i64 @foo1(
+; CHECK-SAME: i64 [[TMP0:%.*]]) {
+; CHECK-NEXT: [[TMP2:%.*]] = call i64 @bar1(i64 [[TMP0]])
+; CHECK-NEXT: ret i64 [[TMP2]]
+;
+ %2 = call i64 @bar1(i64 %0)
+ ret i64 %2
+}
+
+define i64 @foo2(i64 %0) {
+; LOOKUPTABLE-LABEL: define i64 @foo2(
+; LOOKUPTABLE-SAME: i64 [[TMP0:%.*]]) {
+; LOOKUPTABLE-NEXT: switch i64 [[TMP0]], label [[UNREACHABLEDEFAULT_I:%.*]] [
+; LOOKUPTABLE-NEXT: i64 0, label [[BRANCH_0_I:%.*]]
+; LOOKUPTABLE-NEXT: i64 2, label [[BRANCH_2_I:%.*]]
+; LOOKUPTABLE-NEXT: i64 4, label [[BRANCH_4_I:%.*]]
+; LOOKUPTABLE-NEXT: i64 6, label [[BRANCH_6_I:%.*]]
+; LOOKUPTABLE-NEXT: ]
+; LOOKUPTABLE: branch_0.i:
+; LOOKUPTABLE-NEXT: br label [[BAR2_EXIT:%.*]]
+; LOOKUPTABLE: branch_2.i:
+; LOOKUPTABLE-NEXT: br label [[BAR2_EXIT]]
+; LOOKUPTABLE: branch_4.i:
+; LOOKUPTABLE-NEXT: br label [[BAR2_EXIT]]
+; LOOKUPTABLE: branch_6.i:
+; LOOKUPTABLE-NEXT: br label [[BAR2_EXIT]]
+; LOOKUPTABLE: unreachabledefault.i:
+; LOOKUPTABLE-NEXT: unreachable
+; LOOKUPTABLE: bar2.exit:
+; LOOKUPTABLE-NEXT: [[TMP2:%.*]] = phi i64 [ 5, [[BRANCH_0_I]] ], [ 9, [[BRANCH_2_I]] ], [ 2, [[BRANCH_4_I]] ], [ 7, [[BRANCH_6_I]] ]
+; LOOKUPTABLE-NEXT: ret i64 [[TMP2]]
+;
+; SWITCH-LABEL: define i64 @foo2(
+; SWITCH-SAME: i64 [[TMP0:%.*]]) {
+; SWITCH-NEXT: switch i64 [[TMP0]], label [[UNREACHABLEDEFAULT_I:%.*]] [
+; SWITCH-NEXT: i64 0, label [[BRANCH_0_I:%.*]]
+; SWITCH-NEXT: i64 2, label [[BRANCH_2_I:%.*]]
+; SWITCH-NEXT: i64 4, label [[BRANCH_4_I:%.*]]
+; SWITCH-NEXT: i64 6, label [[BRANCH_6_I:%.*]]
+; SWITCH-NEXT: ]
+; SWITCH: branch_0.i:
+; SWITCH-NEXT: br label [[BAR2_EXIT:%.*]]
+; SWITCH: branch_2.i:
+; SWITCH-NEXT: br label [[BAR2_EXIT]]
+; SWITCH: branch_4.i:
+; SWITCH-NEXT: br label [[BAR2_EXIT]]
+; SWITCH: branch_6.i:
+; SWITCH-NEXT: br label [[BAR2_EXIT]]
+; SWITCH: unreachabledefault.i:
+; SWITCH-NEXT: unreachable
+; SWITCH: bar2.exit:
+; SWITCH-NEXT: [[TMP2:%.*]] = phi i64 [ 5, [[BRANCH_0_I]] ], [ 9, [[BRANCH_2_I]] ], [ 2, [[BRANCH_4_I]] ], [ 7, [[BRANCH_6_I]] ]
+; SWITCH-NEXT: ret i64 [[TMP2]]
+;
+; CHECK-LABEL: define i64 @foo2(
+; CHECK-SAME: i64 [[TMP0:%.*]]) {
+; CHECK-NEXT: switch i64 [[TMP0]], label [[UNREACHABLEDEFAULT_I:%.*]] [
+; CHECK-NEXT: i64 0, label [[BRANCH_0_I:%.*]]
+; CHECK-NEXT: i64 2, label [[BRANCH_2_I:%.*]]
+; CHECK-NEXT: i64 4, label [[BRANCH_4_I:%.*]]
+; CHECK-NEXT: i64 6, label [[BRANCH_6_I:%.*]]
+; CHECK-NEXT: ]
+; CHECK: branch_0.i:
+; CHECK-NEXT: br label [[BAR2_EXIT:%.*]]
+; CHECK: branch_2.i:
+; CHECK-NEXT: br label [[BAR2_EXIT]]
+; CHECK: branch_4.i:
+; CHECK-NEXT: br label [[BAR2_EXIT]]
+; CHECK: branch_6.i:
+; CHECK-NEXT: br label [[BAR2_EXIT]]
+; CHECK: unreachabledefault.i:
+; CHECK-NEXT: unreachable
+; CHECK: bar2.exit:
+; CHECK-NEXT: [[TMP2:%.*]] = phi i64 [ 5, [[BRANCH_0_I]] ], [ 9, [[BRANCH_2_I]] ], [ 2, [[BRANCH_4_I]] ], [ 7, [[BRANCH_6_I]] ]
+; CHECK-NEXT: ret i64 [[TMP2]]
+;
+ %2 = call i64 @bar2(i64 %0)
+ ret i64 %2
+}
+
+define i64 @bar1(i64 %0) {
+; LOOKUPTABLE-LABEL: define i64 @bar1(
+; LOOKUPTABLE-SAME: i64 [[TMP0:%.*]]) {
+; LOOKUPTABLE-NEXT: switch i64 [[TMP0]], label [[DEFAULT_BRANCH:%.*]] [
+; LOOKUPTABLE-NEXT: i64 0, label [[BRANCH_0:%.*]]
+; LOOKUPTABLE-NEXT: i64 2, label [[BRANCH_2:%.*]]
+; LOOKUPTABLE-NEXT: i64 4, label [[BRANCH_4:%.*]]
+; LOOKUPTABLE-NEXT: i64 6, label [[BRANCH_6:%.*]]
+; LOOKUPTABLE-NEXT: ]
+; LOOKUPTABLE: branch_0:
+; LOOKUPTABLE-NEXT: br label [[EXIT:%.*]]
+; LOOKUPTABLE: branch_2:
+; LOOKUPTABLE-NEXT: br label [[EXIT]]
+; LOOKUPTABLE: branch_4:
+; LOOKUPTABLE-NEXT: br label [[EXIT]]
+; LOOKUPTABLE: branch_6:
+; LOOKUPTABLE-NEXT: br label [[EXIT]]
+; LOOKUPTABLE: default_branch:
+; LOOKUPTABLE-NEXT: br label [[EXIT]]
+; LOOKUPTABLE: exit:
+; LOOKUPTABLE-NEXT: [[TMP2:%.*]] = phi i64 [ 5, [[BRANCH_0]] ], [ 9, [[BRANCH_2]] ], [ 2, [[BRANCH_4]] ], [ 7, [[BRANCH_6]] ], [ 3, [[DEFAULT_BRANCH]] ]
+; LOOKUPTABLE-NEXT: ret i64 [[TMP2]]
+;
+; SWITCH-LABEL: define i64 @bar1(
+; SWITCH-SAME: i64 [[TMP0:%.*]]) {
+; SWITCH-NEXT: switch i64 [[TMP0]], label [[DEFAULT_BRANCH:%.*]] [
+; SWITCH-NEXT: i64 0, label [[BRANCH_0:%.*]]
+; SWITCH-NEXT: i64 2, label [[BRANCH_2:%.*]]
+; SWITCH-NEXT: i64 4, label [[BRANCH_4:%.*]]
+; SWITCH-NEXT: i64 6, label [[BRANCH_6:%.*]]
+; SWITCH-NEXT: ]
+; SWITCH: branch_0:
+; SWITCH-NEXT: br label [[EXIT:%.*]]
+; SWITCH: branch_2:
+; SWITCH-NEXT: br label [[EXIT]]
+; SWITCH: branch_4:
+; SWITCH-NEXT: br label [[EXIT]]
+; SWITCH: branch_6:
+; SWITCH-NEXT: br label [[EXIT]]
+; SWITCH: default_branch:
+; SWITCH-NEXT: br label [[EXIT]]
+; SWITCH: exit:
+; SWITCH-NEXT: [[TMP2:%.*]] = phi i64 [ 5, [[BRANCH_0]] ], [ 9, [[BRANCH_2]] ], [ 2, [[BRANCH_4]] ], [ 7, [[BRANCH_6]] ], [ 3, [[DEFAULT_BRANCH]] ]
+; SWITCH-NEXT: ret i64 [[TMP2]]
+;
+; CHECK-LABEL: define i64 @bar1(
+; CHECK-SAME: i64 [[TMP0:%.*]]) {
+; CHECK-NEXT: switch i64 [[TMP0]], label [[DEFAULT_BRANCH:%.*]] [
+; CHECK-NEXT: i64 0, label [[BRANCH_0:%.*]]
+; CHECK-NEXT: i64 2, label [[BRANCH_2:%.*]]
+; CHECK-NEXT: i64 4, label [[BRANCH_4:%.*]]
+; CHECK-NEXT: i64 6, label [[BRANCH_6:%.*]]
+; CHECK-NEXT: ]
+; CHECK: branch_0:
+; CHECK-NEXT: br label [[EXIT:%.*]]
+; CHECK: branch_2:
+; CHECK-NEXT: br label [[EXIT]]
+; CHECK: branch_4:
+; CHECK-NEXT: br label [[EXIT]]
+; CHECK: branch_6:
+; CHECK-NEXT: br label [[EXIT]]
+; CHECK: default_branch:
+; CHECK-NEXT: br label [[EXIT]]
+; CHECK: exit:
+; CHECK-NEXT: [[TMP2:%.*]] = phi i64 [ 5, [[BRANCH_0]] ], [ 9, [[BRANCH_2]] ], [ 2, [[BRANCH_4]] ], [ 7, [[BRANCH_6]] ], [ 3, [[DEFAULT_BRANCH]] ]
+; CHECK-NEXT: ret i64 [[TMP2]]
+;
+ switch i64 %0, label %default_branch [
+ i64 0, label %branch_0
+ i64 2, label %branch_2
+ i64 4, label %branch_4
+ i64 6, label %branch_6
+ ]
+
+branch_0:
+ br label %exit
+
+branch_2:
+ br label %exit
+
+branch_4:
+ br label %exit
+
+branch_6:
+ br label %exit
+
+default_branch:
+ br label %exit
+
+exit:
+ %2 = phi i64 [ 5, %branch_0 ], [ 9, %branch_2 ], [ 2, %branch_4 ], [ 7, %branch_6 ], [ 3, %default_branch ]
+ ret i64 %2
+}
+
+define i64 @bar2(i64 %0) {
+; LOOKUPTABLE-LABEL: define i64 @bar2(
+; LOOKUPTABLE-SAME: i64 [[TMP0:%.*]]) {
+; LOOKUPTABLE-NEXT: switch i64 [[TMP0]], label [[UNREACHABLEDEFAULT:%.*]] [
+; LOOKUPTABLE-NEXT: i64 0, label [[BRANCH_0:%.*]]
+; LOOKUPTABLE-NEXT: i64 2, label [[BRANCH_2:%.*]]
+; LOOKUPTABLE-NEXT: i64 4, label [[BRANCH_4:%.*]]
+; LOOKUPTABLE-NEXT: i64 6, label [[BRANCH_6:%.*]]
+; LOOKUPTABLE-NEXT: ]
+; LOOKUPTABLE: branch_0:
+; LOOKUPTABLE-NEXT: br label [[EXIT:%.*]]
+; LOOKUPTABLE: branch_2:
+; LOOKUPTABLE-NEXT: br label [[EXIT]]
+; LOOKUPTABLE: branch_4:
+; LOOKUPTABLE-NEXT: br label [[EXIT]]
+; LOOKUPTABLE: branch_6:
+; LOOKUPTABLE-NEXT: br label [[EXIT]]
+; LOOKUPTABLE: unreachabledefault:
+; LOOKUPTABLE-NEXT: unreachable
+; LOOKUPTABLE: exit:
+; LOOKUPTABLE-NEXT: [[TMP2:%.*]] = phi i64 [ 5, [[BRANCH_0]] ], [ 9, [[BRANCH_2]] ], [ 2, [[BRANCH_4]] ], [ 7, [[BRANCH_6]] ]
+; LOOKUPTABLE-NEXT: ret i64 [[TMP2]]
+;
+; SWITCH-LABEL: define i64 @bar2(
+; SWITCH-SAME: i64 [[TMP0:%.*]]) {
+; SWITCH-NEXT: switch i64 [[TMP0]], label [[UNREACHABLEDEFAULT:%.*]] [
+; SWITCH-NEXT: i64 0, label [[BRANCH_0:%.*]]
+; SWITCH-NEXT: i64 2, label [[BRANCH_2:%.*]]
+; SWITCH-NEXT: i64 4, label [[BRANCH_4:%.*]]
+; SWITCH-NEXT: i64 6, label [[BRANCH_6:%.*]]
+; SWITCH-NEXT: ]
+; SWITCH: branch_0:
+; SWITCH-NEXT: br label [[EXIT:%.*]]
+; SWITCH: branch_2:
+; SWITCH-NEXT: br label [[EXIT]]
+; SWITCH: branch_4:
+; SWITCH-NEXT: br label [[EXIT]]
+; SWITCH: branch_6:
+; SWITCH-NEXT: br label [[EXIT]]
+; SWITCH: unreachabledefault:
+; SWITCH-NEXT: unreachable
+; SWITCH: exit:
+; SWITCH-NEXT: [[TMP2:%.*]] = phi i64 [ 5, [[BRANCH_0]] ], [ 9, [[BRANCH_2]] ], [ 2, [[BRANCH_4]] ], [ 7, [[BRANCH_6]] ]
+; SWITCH-NEXT: ret i64 [[TMP2]]
+;
+; CHECK-LABEL: define i64 @bar2(
+; CHECK-SAME: i64 [[TMP0:%.*]]) {
+; CHECK-NEXT: switch i64 [[TMP0]], label [[UNREACHABLEDEFAULT:%.*]] [
+; CHECK-NEXT: i64 0, label [[BRANCH_0:%.*]]
+; CHECK-NEXT: i64 2, label [[BRANCH_2:%.*]]
+; CHECK-NEXT: i64 4, label [[BRANCH_4:%.*]]
+; CHECK-NEXT: i64 6, label [[BRANCH_6:%.*]]
+; CHECK-NEXT: ]
+; CHECK: branch_0:
+; CHECK-NEXT: br label [[EXIT:%.*]]
+; CHECK: branch_2:
+; CHECK-NEXT: br label [[EXIT]]
+; CHECK: branch_4:
+; CHECK-NEXT: br label [[EXIT]]
+; CHECK: branch_6:
+; CHECK-NEXT: br label [[EXIT]]
+; CHECK: unreachabledefault:
+; CHECK-NEXT: unreachable
+; CHECK: exit:
+; CHECK-NEXT: [[TMP2:%.*]] = phi i64 [ 5, [[BRANCH_0]] ], [ 9, [[BRANCH_2]] ], [ 2, [[BRANCH_4]] ], [ 7, [[BRANCH_6]] ]
+; CHECK-NEXT: ret i64 [[TMP2]]
+;
+ switch i64 %0, label %unreachabledefault [
+ i64 0, label %branch_0
+ i64 2, label %branch_2
+ i64 4, label %branch_4
+ i64 6, label %branch_6
+ ]
+
+branch_0:
+ br label %exit
+
+branch_2:
+ br label %exit
+
+branch_4:
+ br label %exit
+
+branch_6:
+ br label %exit
+
+unreachabledefault:
+ unreachable
+
+exit:
+ %2 = phi i64 [ 5, %branch_0 ], [ 9, %branch_2 ], [ 2, %branch_4 ], [ 7, %branch_6 ]
+ ret i64 %2
+}
diff --git a/llvm/test/Transforms/Inline/inline-switch-default.ll b/llvm/test/Transforms/Inline/inline-switch-default.ll
new file mode 100644
index 00000000000000..6271d6c06dcc6f
--- /dev/null
+++ b/llvm/test/Transforms/Inline/inline-switch-default.ll
@@ -0,0 +1,216 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt %s -S -passes=inline -inline-threshold=26 -min-jump-table-entries=4 | FileCheck %s -check-prefix=LOOKUPTABLE
+; RUN: opt %s -S -passes=inline -inline-threshold=21 -min-jump-table-entries=5 | FileCheck %s -check-prefix=SWITCH
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+; The `bar1` should not be inlined since there is a default branch.
+
+define i64 @foo1(i64 %0) {
+; LOOKUPTABLE-LABEL: define i64 @foo1(
+; LOOKUPTABLE-SAME: i64 [[TMP0:%.*]]) {
+; LOOKUPTABLE-NEXT: [[TMP2:%.*]] = call i64 @bar1(i64 [[TMP0]])
+; LOOKUPTABLE-NEXT: ret i64 [[TMP2]]
+;
+; SWITCH-LABEL: define i64 @foo1(
+; SWITCH-SAME: i64 [[TMP0:%.*]]) {
+; SWITCH-NEXT: [[TMP2:%.*]] = call i64 @bar1(i64 [[TMP0]])
+; SWITCH-NEXT: ret i64 [[TMP2]]
+;
+ %2 = call i64 @bar1(i64 %0)
+ ret i64 %2
+}
+
+define i64 @foo2(i64 %0) {
+; LOOKUPTABLE-LABEL: define i64 @foo2(
+; LOOKUPTABLE-SAME: i64 [[TMP0:%.*]]) {
+; LOOKUPTABLE-NEXT: switch i64 [[TMP0]], label [[UNREACHABLEDEFAULT_I:%.*]] [
+; LOOKUPTABLE-NEXT: i64 0, label [[BRANCH_0_I:%.*]]
+; LOOKUPTABLE-NEXT: i64 2, label [[BRANCH_2_I:%.*]]
+; LOOKUPTABLE-NEXT: i64 4, label [[BRANCH_4_I:%.*]]
+; LOOKUPTABLE-NEXT: i64 6, label [[BRANCH_6_I:%.*]]
+; LOOKUPTABLE-NEXT: ]
+; LOOKUPTABLE: branch_0.i:
+; LOOKUPTABLE-NEXT: br label [[BAR2_EXIT:%.*]]
+; LOOKUPTABLE: branch_2.i:
+; LOOKUPTABLE-NEXT: br label [[BAR2_EXIT]]
+; LOOKUPTABLE: branch_4.i:
+; LOOKUPTABLE-NEXT: br label [[BAR2_EXIT]]
+; LOOKUPTABLE: branch_6.i:
+; LOOKUPTABLE-NEXT: br label [[BAR2_EXIT]]
+; LOOKUPTABLE: unreachabledefault.i:
+; LOOKUPTABLE-NEXT: unreachable
+; LOOKUPTABLE: bar2.exit:
+; LOOKUPTABLE-NEXT: [[TMP2:%.*]] = phi i64 [ 5, [[BRANCH_0_I]] ], [ 9, [[BRANCH_2_I]] ], [ 2, [[BRANCH_4_I]] ], [ 7, [[BRANCH_6_I]] ]
+; LOOKUPTABLE-NEXT: ret i64 [[TMP2]]
+;
+; SWITCH-LABEL: define i64 @foo2(
+; SWITCH-SAME: i64 [[TMP0:%.*]]) {
+; SWITCH-NEXT: switch i64 [[TMP0]], label [[UNREACHABLEDEFAULT_I:%.*]] [
+; SWITCH-NEXT: i64 0, label [[BRANCH_0_I:%.*]]
+; SWITCH-NEXT: i64 2, label [[BRANCH_2_I:%.*]]
+; SWITCH-NEXT: i64 4, label [[BRANCH_4_I:%.*]]
+; SWITCH-NEXT: i64 6, label [[BRANCH_6_I:%.*]]
+; SWITCH-NEXT: ]
+; SWITCH: branch_0.i:
+; SWITCH-NEXT: br label [[BAR2_EXIT:%.*]]
+; SWITCH: branch_2.i:
+; SWITCH-NEXT: br label [[BAR2_EXIT]]
+; SWITCH: branch_4.i:
+; SWITCH-NEXT: br label [[BAR2_EXIT]]
+; SWITCH: branch_6.i:
+; SWITCH-NEXT: br label [[BAR2_EXIT]]
+; SWITCH: unreachabledefault.i:
+; SWITCH-NEXT: unreachable
+; SWITCH: bar2.exit:
+; SWITCH-NEXT: [[TMP2:%.*]] = phi i64 [ 5, [[BRANCH_0_I]] ], [ 9, [[BRANCH_2_I]] ], [ 2, [[BRANCH_4_I]] ], [ 7, [[BRANCH_6_I]] ]
+; SWITCH-NEXT: ret i64 [[TMP2]]
+;
+ %2 = call i64 @bar2(i64 %0)
+ ret i64 %2
+}
+
+define i64 @bar1(i64 %0) {
+; LOOKUPTABLE-LABEL: define i64 @bar1(
+; LOOKUPTABLE-SAME: i64 [[TMP0:%.*]]) {
+; LOOKUPTABLE-NEXT: switch i64 [[TMP0]], label [[DEFAULT_BRANCH:%.*]] [
+; LOOKUPTABLE-NEXT: i64 0, label [[BRANCH_0:%.*]]
+; LOOKUPTABLE-NEXT: i64 2, label [[BRANCH_2:%.*]]
+; LOOKUPTABLE-NEXT: i64 4, label [[BRANCH_4:%.*]]
+; LOOKUPTABLE-NEXT: i64 6, label [[BRANCH_6:%.*]]
+; LOOKUPTABLE-NEXT: ]
+; LOOKUPTABLE: branch_0:
+; LOOKUPTABLE-NEXT: br label [[EXIT:%.*]]
+; LOOKUPTABLE: branch_2:
+; LOOKUPTABLE-NEXT: br label [[EXIT]]
+; LOOKUPTABLE: branch_4:
+; LOOKUPTABLE-NEXT: br label [[EXIT]]
+; LOOKUPTABLE: branch_6:
+; LOOKUPTABLE-NEXT: br label [[EXIT]]
+; LOOKUPTABLE: default_branch:
+; LOOKUPTABLE-NEXT: br label [[EXIT]]
+; LOOKUPTABLE: exit:
+; LOOKUPTABLE-NEXT: [[TMP2:%.*]] = phi i64 [ 5, [[BRANCH_0]] ], [ 9, [[BRANCH_2]] ], [ 2, [[BRANCH_4]] ], [ 7, [[BRANCH_6]] ], [ 3, [[DEFAULT_BRANCH]] ]
+; LOOKUPTABLE-NEXT: ret i64 [[TMP2]]
+;
+; SWITCH-LABEL: define i64 @bar1(
+; SWITCH-SAME: i64 [[TMP0:%.*]]) {
+; SWITCH-NEXT: switch i64 [[TMP0]], label [[DEFAULT_BRANCH:%.*]] [
+; SWITCH-NEXT: i64 0, label [[BRANCH_0:%.*]]
+; SWITCH-NEX...
[truncated]
|
Add `defaultDestIsUnreachable`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm except for the naming documentation of the helper
llvm/include/llvm/IR/Instructions.h
Outdated
@@ -3505,6 +3506,10 @@ class SwitchInst : public Instruction { | |||
return cast<BasicBlock>(getOperand(1)); | |||
} | |||
|
|||
bool defaultDestIsUnreachable() const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add documentation comment. I think the name is misleading too. The code isn't unreachable, and it only accepts empty blocks. How about something like defaultDestIsEmpty? defaultDestUndefined?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the reminder. I prefer defaultDestUndefined
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Emm, looks like a lot of callees cannot be inlined after this patch. |
Indeed, this is different from what was initially assumed. But from the code, I think there is a real omission of default branch handling here. I was just considering adding a bias experiment, but this would cause the consumption of small no-default branches to go negative. |
@@ -1153,6 +1154,7 @@ class InlineCostFeaturesAnalyzer final : public CallAnalyzer { | |||
// heuristics in the ML inliner. | |||
static constexpr int JTCostMultiplier = 4; | |||
static constexpr int CaseClusterCostMultiplier = 2; | |||
static constexpr int SwitchDefaultDestCostMultiplier = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Defined this constant but hardcoded 2 above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just copied it from CaseClusterCostMultiplier
.
@dtcxzyw |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks!
FWIW, it seems like this causes a minor performance regression in stage2 clang builds (https://llvm-compile-time-tracker.com/compare.php?from=5aec9392674572fa5a06283173a6a739742d261d&to=5932fcc47855fdd209784f38820422d2369b84b2&stat=instructions:u). Though code size does reduce as well. |
Do you think we need to revert? If not, I can create an issue to track it. |
Can we slightly increase the inline threshold to fix the regression? |
We need to understand what is no longer getting inlined and see whether there is either some optimization of cost modelling improvement we can make to compensate. (E.g. I think doing switch-to-arithmetic conversion pre-inline would be worthwhile.) |
I will consider trying it.
Yes, the example makes sense. I have created an issue for this and I will try to investigate at the end of the traveling. |
Someone just reported to me internally that this caused a benchmark regression in our downstream. I wonder if the default case was originally missing because it should just be a fallthrough after all the other cases are picked off. It shouldn't have any code. Given N destinations for a switch you only need N-1 compares and branches. Maybe an unreachable default should reduce the number of case clusters since one case cluster check wouldn't be needed? Though I'm not sure exactly what the backend generates. |
I have created #81723 to track some regressions. I haven't started my research yet because this case is too big.
I believe that we will create extra instructions if the default branch is reachable. See https://llvm.godbolt.org/z/x7azoP7eY. bar1: # @bar1
cmp rdi, 6
ja .LBB0_6
... This regression is likely coming from the select lookup vs compare instructions if I guess it right. See https://llvm.godbolt.org/z/cYdoef4cT. I will prioritize the issue because it's becoming important. :) I'll report back my progress this weekend. |
You're right that cmp/ja does come from the default when we use jump table lowering. There is
|
…#85160) Fixes #81723. The earliest commit of the related code is: 919f9e8. I tried to understand the following code with #77856 (comment). https://github.com/llvm/llvm-project/blob/5932fcc47855fdd209784f38820422d2369b84b2/llvm/lib/Analysis/InlineCost.cpp#L709-L720 I think only scenarios where there is a default branch were considered.
First step in fixing #76772.
This PR considers the default branch as a case branch. This will give the unreachable default branch fair consideration.
I changed
getEstimatedNumberOfCaseClusters
, although that didn't affect the rest. Maybe a better approach would be to add a new method?