codegen: Copy to an alloca when the argument is neither by-val nor by-move for indirect pointer.#155343
Conversation
| self.codegen_argument( | ||
| bx, | ||
| op, | ||
| // Pass by-move for an explicit tail call, which has been handled above as well. |
There was a problem hiding this comment.
This feels a bit awkward; perhaps the tail call handling can be moved into codegen_argument as well.
There was a problem hiding this comment.
The issue has nothing to do with tail calls, does it...?
There was a problem hiding this comment.
But this PR is changing something about tail calls...?
Maybe it's just the comment that is confusing me. Why is it okay to always to by-move for tail call arguments? Cc @folkertdev
There was a problem hiding this comment.
Because #151143 has done a similar thing, we don't need to copy it again. IIUC, tail call is special; the same arguments on the caller must be passed to the callee. Extra copy is UB.
There was a problem hiding this comment.
#151143 says something about
Therefore we store the argument for the callee in the corresponding caller's slot.
So I guess there is code elsewhere establishing an invariant that the argument here is the equivalent slot and we can pass a pointer to it. The comment should explain that and point at that code.
| // CHECK: call void @test_simd(<4 x i32> <i32 2, i32 4, i32 6, i32 8> | ||
| test_simd(const { Simd::<i32, 4>([2, 4, 6, 8]) }); | ||
|
|
||
| // CHECK: call void @test_simd_unaligned(%"minisimd::PackedSimd<i32, 3>" %1 |
There was a problem hiding this comment.
A redundant memcpy is skipped here.
5123815 to
9ec6be7
Compare
|
|
| self.codegen_argument( | ||
| bx, | ||
| location, | ||
| false, |
There was a problem hiding this comment.
| false, | |
| /* by_move */ false, |
|
This makes sense to me at a high level, but I don't know the surrounding code so reviewing the exact details of the conditionals introduced here would take me more time than I can spare right now -- sorry. I hope we can find another reviewer. maybe |
|
The PR description could use more words... @bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
codegen: Copy to an alloca when the argument is neither by-val nor by-move for indirect pointer.
This comment has been minimized.
This comment has been minimized.
|
Finished benchmarking commit (6c71f51): comparison URL. Overall result: no relevant changes - no action neededBenchmarking means the PR may be perf-sensitive. It's automatically marked not fit for rolling up. Overriding is possible but disadvised: it risks changing compiler perf. @bors rollup=never Instruction countThis perf run didn't have relevant results for this metric. Max RSS (memory usage)Results (secondary -2.0%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary 2.5%, secondary 2.6%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (secondary 0.1%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 491.853s -> 492.621s (0.16%) |
Done. I also fixed the comment about tail calls. |
| //! Arguments passed indirectly via a hidden pointer must be copied to an alloca, | ||
| //! except for by-val or by-move. | ||
| //@ compile-flags: -Cno-prepopulate-passes -Copt-level=3 | ||
| //@ only-x86_64-unknown-linux-gnu |
There was a problem hiding this comment.
Can this be made a minicore test instead?
| // temporaries, then copy back to the caller's argument slots. | ||
| // Finally, we pass the caller's argument slots as arguments. | ||
| // | ||
| // To do that, the argument must be MUST-by-move value. |
There was a problem hiding this comment.
What does "MUST-by-move" mean, as opposed to just by-move?
There was a problem hiding this comment.
I mean codegen allows copying for by-move, but not for MUST-by-move.
…-move for indirect pointer.
|
@bors r+ |
This comment has been minimized.
This comment has been minimized.
What is this?This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.Comparing cf1817b (parent) -> f676c20 (this PR) Test differencesShow 14 test diffsStage 1
Stage 2
Additionally, 4 doctest diffs were found. These are ignored, as they are noisy. Job group index
Test dashboardRun cargo run --manifest-path src/ci/citool/Cargo.toml -- \
test-dashboard f676c20edd32321e9fa2781543d8970109707e30 --output-dir test-dashboardAnd then open Job duration changes
How to interpret the job duration changes?Job durations can vary a lot, based on the actual runner instance |
|
Finished benchmarking commit (f676c20): comparison URL. Overall result: no relevant changes - no action needed@rustbot label: -perf-regression Instruction countThis perf run didn't have relevant results for this metric. Max RSS (memory usage)This perf run didn't have relevant results for this metric. CyclesResults (primary 1.9%, secondary 2.0%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeThis perf run didn't have relevant results for this metric. Bootstrap: 491.458s -> 492.288s (0.17%) |
View all comments
Fixes #155241.
When a value is passed via an indirect pointer, the value needs to be copied to a new alloca. For x86_64-unknown-linux-gnu,
Thingis the case:Before passing the thing to the bar function, the thing needs to be copied to an alloca that is passed to bar.
This patch applies the rule to the untupled arguments as well.
For this case, this patch changes from
to
However, the same rule cannot be applied to tail calls that would be unsound, because the caller's stack frame is overwritten by the callee's stack frame. Fortunately, #151143 has already handled the special case. We must not copy again.
No copy is needed for by-move arguments, because the argument is passed to the called "in-place".
No copy is also needed for by-val arguments, because the attribute implies that a hidden copy of the pointee is made between the caller and the callee.
NOTE: The patch has a trick for tail calls that we pass by-move. We can choose to copy an alloca even for by-move arguments, but tail calls require MUST-by-move.