Optimize DroplessArena arena allocation #108693

Zoxc · 2023-03-03T06:07:37Z

This optimizes DroplessArena allocation by always ensuring that it is aligned to usize and by adding the grow_and_alloc and grow_and_alloc_raw functions, which both grow and allocate, reducing code size.

Benchmark	Before (time)	After (time)	Change
🟣 clap:check	1.6968s	1.6887s	-0.48%
🟣 hyper:check	0.2552s	0.2551s	-0.03%
🟣 regex:check	0.9613s	0.9553s	-0.62%
🟣 syn:check	1.5402s	1.5374s	-0.18%
🟣 syntex_syntax:check	5.9175s	5.8813s	-0.61%
Total	10.3710s	10.3178s	-0.51%
Summary	1.0000s	0.9962s	-0.38%
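
As a rough illustration of the idea in the description above, here is a minimal sketch of a bump allocator with a short fast path and a single out-of-line function that both grows and allocates. It is not the rustc_arena code: `ToyArena`, `ARENA_ALIGN`, `CHUNK_SIZE`, and the simulated address range are all hypothetical, and no real memory is touched.

```rust
use std::cell::Cell;

/// Hypothetical alignment that every allocation size is rounded up to.
const ARENA_ALIGN: usize = std::mem::align_of::<usize>();
/// Hypothetical chunk size for this toy example.
const CHUNK_SIZE: usize = 4096;

/// Toy bump-down allocator over simulated addresses (plain integers).
struct ToyArena {
    start: Cell<usize>,
    end: Cell<usize>,
    next_chunk_base: Cell<usize>,
    chunks_grown: Cell<usize>,
}

impl ToyArena {
    fn new() -> Self {
        // Begin with an empty chunk so the first allocation takes the slow path.
        ToyArena {
            start: Cell::new(0),
            end: Cell::new(0),
            next_chunk_base: Cell::new(CHUNK_SIZE),
            chunks_grown: Cell::new(0),
        }
    }

    /// Fast path: round the size, subtract, compare.
    #[inline]
    fn alloc(&self, size: usize) -> usize {
        // Rounding up keeps `end` a multiple of ARENA_ALIGN after every allocation.
        // (A real allocator would also guard this addition against overflow.)
        let bytes = (size + ARENA_ALIGN - 1) & !(ARENA_ALIGN - 1);
        match self.end.get().checked_sub(bytes) {
            Some(new_end) if new_end >= self.start.get() => {
                self.end.set(new_end);
                new_end
            }
            // Only one call appears on the hot path; it grows and allocates.
            _ => self.grow_and_alloc(bytes),
        }
    }

    /// Slow path: start a new chunk, then allocate from it, in one call.
    #[inline(never)]
    #[cold]
    fn grow_and_alloc(&self, bytes: usize) -> usize {
        let base = self.next_chunk_base.get();
        let chunk_len = CHUNK_SIZE.max(bytes);
        self.start.set(base);
        self.next_chunk_base.set(base + chunk_len);
        self.chunks_grown.set(self.chunks_grown.get() + 1);
        // `base` and `chunk_len` are multiples of ARENA_ALIGN, so `new_end` is too.
        let new_end = base + chunk_len - bytes;
        self.end.set(new_end);
        new_end
    }
}
```

Because the slow path returns the allocation itself, the caller does not need a separate "grow, then retry" sequence after the failed fast path, which is one plausible way such a change could reduce code size.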

rustbot · 2023-03-03T06:07:43Z

r? @cjgillot

(rustbot has picked a reviewer for you, use r? to override)

matthiaskrgr · 2023-03-03T06:09:52Z

@bors try @rust-timer queue

bors · 2023-03-03T06:10:01Z

⌛ Trying commit a3fc36e383c59629274d74b6ac103dc90e96660c with merge 96f98934793755c7ae42a2e0a9ccc49289135001...

bors · 2023-03-03T08:23:56Z

☀️ Try build successful - checks-actions
Build commit: 96f98934793755c7ae42a2e0a9ccc49289135001 (96f98934793755c7ae42a2e0a9ccc49289135001)

rust-timer · 2023-03-03T10:29:29Z

Finished benchmarking commit (96f98934793755c7ae42a2e0a9ccc49289135001): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Warning ⚠: The following benchmark(s) failed to build:

webrender-2022

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	1.4%	[1.4%, 1.4%]	1
Improvements ✅ (primary)	-0.6%	[-0.7%, -0.6%]	3
Improvements ✅ (secondary)	-0.5%	[-1.0%, -0.2%]	6
All ❌✅ (primary)	-0.6%	[-0.7%, -0.6%]	3

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	3.3%	[0.9%, 7.6%]	4
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.8%	[-3.8%, -1.4%]	11
All ❌✅ (primary)	-	-	0

Cycles

This benchmark run did not return any relevant results for this metric.

cjgillot · 2023-03-03T10:50:11Z

Any idea what broke the benchmark?

Zoxc · 2023-03-03T10:55:51Z

No. It kind of looks like a problem with rustc-perf. Maybe it's non-deterministic?

Zoxc · 2023-03-03T12:52:04Z

You could give this another perf run to see if it reproduces. I also added another optimization, bringing the number of instructions for allocating a usize from 8 down to 6.

cjgillot · 2023-03-03T22:02:57Z

@bors try @rust-timer queue

bors · 2023-03-03T22:03:06Z

⌛ Trying commit 3e34eca9a4f22f4277c78d7aac5ea72ed5156724 with merge 48a8ae2628df7f3ccd0a365bd173203710f305ac...

bors · 2023-03-04T00:18:33Z

☀️ Try build successful - checks-actions
Build commit: 48a8ae2628df7f3ccd0a365bd173203710f305ac (48a8ae2628df7f3ccd0a365bd173203710f305ac)

rust-timer · 2023-03-04T01:35:33Z

Finished benchmarking commit (48a8ae2628df7f3ccd0a365bd173203710f305ac): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Warning ⚠: The following benchmark(s) failed to build:

webrender-2022

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.2%	[0.2%, 0.3%]	9
Regressions ❌ (secondary)	0.9%	[0.2%, 1.5%]	2
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.2%	[0.2%, 0.3%]	9

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	8.0%	[8.0%, 8.0%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.8%	[-4.0%, -1.1%]	11
All ❌✅ (primary)	-	-	0

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.2%	[2.2%, 2.2%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Zoxc · 2023-03-04T04:11:35Z

This could use another perf run to see if more aggressive inlining helps recover the bootstrap improvement.

cjgillot · 2023-03-04T09:44:49Z

@bors try @rust-timer queue

bors · 2023-05-28T14:19:14Z

⌛ Trying commit deef40780126946106eaed6734d680cc6489798d with merge 6a923fd31c77214782a0cea705ad8fd9b5b4204f...

bors · 2023-05-28T16:31:47Z

☀️ Try build successful - checks-actions
Build commit: 6a923fd31c77214782a0cea705ad8fd9b5b4204f (6a923fd31c77214782a0cea705ad8fd9b5b4204f)

rust-timer · 2023-05-28T18:56:01Z

Finished benchmarking commit (6a923fd31c77214782a0cea705ad8fd9b5b4204f): comparison URL.

Overall result: no relevant changes - no action needed

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.5%	[1.4%, 4.5%]	16
Improvements ✅ (primary)	-2.0%	[-2.9%, -1.0%]	2
Improvements ✅ (secondary)	-3.6%	[-4.6%, -2.8%]	5
All ❌✅ (primary)	-2.0%	[-2.9%, -1.0%]	2

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-3.0%	[-3.0%, -3.0%]	1
All ❌✅ (primary)	-	-	0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 647.057s -> 645.932s (-0.17%)

cjgillot

A few comments. Did you reach a decision on https://github.com/rust-lang/rust/pull/108693/files#r1125632892?

cjgillot · 2023-05-29T15:37:06Z

compiler/rustc_arena/src/lib.rs

+}
+
+#[inline(always)]
+fn align(val: usize, align: usize) -> usize {


Suggested change

-fn align(val: usize, align: usize) -> usize {
+fn align_up(val: usize, align: usize) -> usize {

For symmetry with align_down.
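
For readers unfamiliar with the pair of helpers being named here, a minimal sketch of what `align_up` and `align_down` typically look like follows. Only the signature appears in the diff above; the bodies are an assumption, and both require `align` to be a power of two.

```rust
/// Round `val` down to the nearest multiple of `align` (a power of two).
#[inline(always)]
fn align_down(val: usize, align: usize) -> usize {
    debug_assert!(align.is_power_of_two());
    // Clearing the low bits drops the remainder modulo `align`.
    val & !(align - 1)
}

/// Round `val` up to the nearest multiple of `align` (a power of two).
#[inline(always)]
fn align_up(val: usize, align: usize) -> usize {
    debug_assert!(align.is_power_of_two());
    // Adding `align - 1` first pushes any non-multiple past the next boundary.
    (val + align - 1) & !(align - 1)
}
```

For example, with an alignment of 8, `align_down(13, 8)` gives 8 and `align_up(13, 8)` gives 16, while a value that is already a multiple of 8 is left unchanged by both.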

cjgillot · 2023-05-29T15:40:10Z

compiler/rustc_arena/src/lib.rs

+            // Align the end to DROPLESS_ALIGNMENT
+            let end = align_down(chunk.end().addr(), DROPLESS_ALIGNMENT);
+            // Make sure we don't go past `start`
+            let end = cmp::max(chunk.start().addr(), end);


Is it even possible to go past start? Should this be a debug_assert instead?
What if start is insufficiently aligned?
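
To make the question concrete, here is a small sketch of the two lines in isolation, working on plain integer addresses. The function `usable_end` and its explicit `align` parameter are hypothetical; whether a chunk short enough to trigger the clamp can occur in rustc_arena is not established in this thread.

```rust
use std::cmp;

/// Round `val` down to a multiple of `align` (a power of two).
fn align_down(val: usize, align: usize) -> usize {
    val & !(align - 1)
}

/// Hypothetical helper: the end address to bump down from, for a chunk
/// spanning `[chunk_start, chunk_end)`.
fn usable_end(chunk_start: usize, chunk_end: usize, align: usize) -> usize {
    // Align the end down so later subtractions of aligned sizes stay aligned.
    let end = align_down(chunk_end, align);
    // Clamp so the result never lies before the start of the chunk.
    cmp::max(chunk_start, end)
}
```

In this sketch the clamp only changes the result when the chunk is shorter than the alignment and its start is unaligned, for example `usable_end(0x1003, 0x1005, 8)`, where aligning down gives `0x1000` and the clamp returns `0x1003`. The returned end is then unaligned but equal to the start, so it leaves no room for a nonzero allocation.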

cjgillot · 2023-05-29T15:41:27Z

compiler/rustc_arena/src/lib.rs

+    #[inline(never)]
+    #[cold]


Why? It just calls another inline(never) method with a constant argument.

It keeps the passing of the constant argument out of the hot path.
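
The pattern under discussion can be sketched as follows. This is an assumed shape, not the code from the PR: `Grower`, `grow_raw`, and `grow` are hypothetical names, and the bodies only count bytes so that the example stays self-contained.

```rust
use std::alloc::Layout;
use std::cell::Cell;

/// Hypothetical stand-in for an arena; it only tracks how many bytes were requested.
struct Grower {
    grown_bytes: Cell<usize>,
}

impl Grower {
    /// General slow path: the layout arrives as a runtime argument.
    #[inline(never)]
    #[cold]
    fn grow_raw(&self, layout: Layout) -> usize {
        let total = self.grown_bytes.get() + layout.size();
        self.grown_bytes.set(total);
        total
    }

    /// Typed slow path: the layout is a constant materialized inside this
    /// function, so a caller only has to pass `&self`.
    #[inline(never)]
    #[cold]
    fn grow<T>(&self) -> usize {
        self.grow_raw(Layout::new::<T>())
    }
}
```

If the wrapper were inlined, each hot call site would have to load the size and alignment constants into argument registers before the call. Keeping the wrapper out of line moves that setup into the cold function, at the cost of one extra call on the slow path.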

cjgillot · 2023-05-29T15:45:09Z

compiler/rustc_arena/src/lib.rs


-        let new_end = end.checked_sub(bytes)? & !(align - 1);
+        let new_end = align_down(end.checked_sub(bytes)?, layout.align());


Could you add a line comment explaining why new_end is aligned to at least DROPLESS_ALIGNMENT?
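
One way the requested argument could go, sketched under two assumptions that the quoted diff does not show: that `end` is already a multiple of `DROPLESS_ALIGNMENT`, and that `bytes` is the size rounded up to a multiple of it. The function `bump_down` and the fixed value 8 are hypothetical, and the check against the chunk start is omitted.

```rust
/// Hypothetical fixed value for illustration only.
const DROPLESS_ALIGNMENT: usize = 8;

fn align_down(val: usize, align: usize) -> usize {
    val & !(align - 1)
}

fn align_up(val: usize, align: usize) -> usize {
    (val + align - 1) & !(align - 1)
}

/// Bump `end` down by `size` bytes with the requested power-of-two `align`.
/// Assumes `end` is a multiple of DROPLESS_ALIGNMENT.
fn bump_down(end: usize, size: usize, align: usize) -> Option<usize> {
    // Both `end` and `bytes` are multiples of DROPLESS_ALIGNMENT,
    // so their difference is as well.
    let bytes = align_up(size, DROPLESS_ALIGNMENT);
    // If `align` <= DROPLESS_ALIGNMENT, aligning down is a no-op here.
    // If `align` is larger, the result is a multiple of `align`, and every
    // larger power of two is itself a multiple of DROPLESS_ALIGNMENT.
    let new_end = align_down(end.checked_sub(bytes)?, align);
    Some(new_end)
}
```

Either way the result is a multiple of both `align` and `DROPLESS_ALIGNMENT`, and the allocation `[new_end, new_end + size)` stays below the old `end`.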

Zoxc · 2023-08-14T20:03:51Z

@rustbot ready

cjgillot · 2023-08-16T19:52:34Z

@bors r+

bors · 2023-08-16T19:52:36Z

📌 Commit 6f86591 has been approved by cjgillot

It is now in the queue for this repository.

bors · 2023-08-16T21:37:16Z

⌛ Testing commit 6f86591 with merge 07438b0...

bors · 2023-08-16T23:21:09Z

☀️ Test successful - checks-actions
Approved by: cjgillot
Pushing 07438b0 to master...

rust-timer · 2023-08-17T01:57:06Z

Finished benchmarking commit (07438b0): comparison URL.

Overall result: ❌✅ regressions and improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	1.4%	[1.4%, 1.4%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.3%	[-0.3%, -0.3%]	1
All ❌✅ (primary)	-	-	0

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.7%	[0.8%, 5.9%]	9
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-4.8%	[-12.1%, -1.2%]	7
All ❌✅ (primary)	-	-	0

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 635.158s -> 635.21s (0.01%)
Artifact size: 346.86 MiB -> 346.72 MiB (-0.04%)

rustbot assigned cjgillot Mar 3, 2023

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Mar 3, 2023