-
Notifications
You must be signed in to change notification settings - Fork 13.3k
Optimize DroplessArena arena allocation #108693
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
r? @cjgillot (rustbot has picked a reviewer for you, use r? to override) |
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
⌛ Trying commit a3fc36e383c59629274d74b6ac103dc90e96660c with merge 96f98934793755c7ae42a2e0a9ccc49289135001... |
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (96f98934793755c7ae42a2e0a9ccc49289135001): comparison URL. Overall result: ❌✅ regressions and improvements - ACTION NEEDEDBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never Warning ⚠: The following benchmark(s) failed to build:
Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesThis benchmark run did not return any relevant results for this metric. |
Any idea what broke the benchmark? |
No. It kind of looks like a problem with |
You could give this another perf run to see if it reproduces. I also added another optimization bringing instructions for allocating an |
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
⌛ Trying commit 3e34eca9a4f22f4277c78d7aac5ea72ed5156724 with merge 48a8ae2628df7f3ccd0a365bd173203710f305ac... |
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (48a8ae2628df7f3ccd0a365bd173203710f305ac): comparison URL. Overall result: ❌ regressions - ACTION NEEDEDBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never Warning ⚠: The following benchmark(s) failed to build:
Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
|
This could use another perf run to see if more aggressive inlining helps recover the bootstrap improvement. |
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
⌛ Trying commit deef40780126946106eaed6734d680cc6489798d with merge 6a923fd31c77214782a0cea705ad8fd9b5b4204f... |
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (6a923fd31c77214782a0cea705ad8fd9b5b4204f): comparison URL. Overall result: no relevant changes - no action neededBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. @bors rollup=never Instruction countThis benchmark run did not return any relevant results for this metric. Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 647.057s -> 645.932s (-0.17%) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments. Did you reach a decision on https://github.com/rust-lang/rust/pull/108693/files#r1125632892?
compiler/rustc_arena/src/lib.rs
Outdated
} | ||
|
||
#[inline(always)] | ||
fn align(val: usize, align: usize) -> usize { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fn align(val: usize, align: usize) -> usize { | |
fn align_up(val: usize, align: usize) -> usize { |
For symmetry with align_down
.
compiler/rustc_arena/src/lib.rs
Outdated
// Align the end to DROPLESS_ALIGNMENT | ||
let end = align_down(chunk.end().addr(), DROPLESS_ALIGNMENT); | ||
// Make sure we don't go past `start` | ||
let end = cmp::max(chunk.start().addr(), end); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it even possible to go past start
? Should this be a debug_assert instead?
What if start
is unsufficiently aligned?
#[inline(never)] | ||
#[cold] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? It just calls another inline(never)
method with a constant argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It keeps the passing of the constant argument out of the hot path.
|
||
let new_end = end.checked_sub(bytes)? & !(align - 1); | ||
let new_end = align_down(end.checked_sub(bytes)?, layout.align()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a line comment explaining why new_end
is at least aligned on DROPLESS_ALIGNMENT
?
@rustbot ready |
@bors r+ |
☀️ Test successful - checks-actions |
Finished benchmarking commit (07438b0): comparison URL. Overall result: ❌✅ regressions and improvements - no action needed@rustbot label: -perf-regression Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesThis benchmark run did not return any relevant results for this metric. Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 635.158s -> 635.21s (0.01%) |
This optimizes
DroplessArena
allocation by always ensuring that it is aligned tousize
and addinggrow_and_alloc
andgrow_and_alloc_raw
functions which both grow and allocate, reducing code size.