Improve caching by introducing TypingMode::ErasedNotCoherence#155443
Improve caching by introducing TypingMode::ErasedNotCoherence#155443rust-bors[bot] merged 15 commits intorust-lang:mainfrom
TypingMode::ErasedNotCoherence#155443Conversation
|
@bors try |
This comment has been minimized.
This comment has been minimized.
|
@bors try parent=14196dbfa3eb7c30195251eac092b1b86c8a2d84 |
This comment has been minimized.
This comment has been minimized.
[DO NOT MERGE] Improve canonicalization performance
|
@rust-timer queue |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
Finished benchmarking commit (db1df07): comparison URL. Overall result: ❌✅ regressions and improvements - please read:Benchmarking means the PR may be perf-sensitive. It's automatically marked not fit for rolling up. Overriding is possible but disadvised: it risks changing compiler perf. Next, please: If you can, justify the regressions found in this try perf run in writing along with @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary -0.9%, secondary 1.7%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary -2.4%, secondary 28.5%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeThis perf run didn't have relevant results for this metric. Bootstrap: 491.114s -> 492.029s (0.19%) |
|
changes to the core type system cc @lcnr Some changes occurred to MIR optimizations cc @rust-lang/wg-mir-opt Some changes occurred to the CTFE machinery Some changes occurred to the core trait solver cc @rust-lang/initiative-trait-system-refactor Some changes occurred to the CTFE / Miri interpreter cc @rust-lang/miri |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
TypingMode::ErasedNotCoherence
This comment has been minimized.
This comment has been minimized.
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
|
@bors p=1 rollup=never (has been quite bitrotty, also perf) |
|
@bors r=lcnr |
|
@bors p=6 |
This comment has been minimized.
This comment has been minimized.
What is this?This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.Comparing ba1a955 (parent) -> 365c0e1 (this PR) Test differencesShow 72 test diffsStage 1
Stage 2
Additionally, 70 doctest diffs were found. These are ignored, as they are noisy. Job group index
Test dashboardRun cargo run --manifest-path src/ci/citool/Cargo.toml -- \
test-dashboard 365c0e1d7a614ca94cb48431dcd2bc6d3b645db1 --output-dir test-dashboardAnd then open Job duration changes
How to interpret the job duration changes?Job durations can vary a lot, based on the actual runner instance |
|
Finished benchmarking commit (365c0e1): comparison URL. Overall result: ❌✅ regressions and improvements - please read:Our benchmarks found a performance regression caused by this PR. Next Steps:
@rustbot label: +perf-regression Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary 0.8%, secondary 2.5%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary 2.0%, secondary -2.9%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeThis perf run didn't have relevant results for this metric. Bootstrap: 496.315s -> 501.88s (1.12%) |
View all comments
r? @lcnr
This introduces
TypingMode::ErasedNotCoherence. Most typing modes contain a list of opaque types, which are quite often unused during canonicalization. With this change, any time we try canonicalization, we replace whichever typing mode we're currently in withErasedNotcoherence, attempt to canonicalize, and if that fails retry in the original typing mode. If erased mode succeeds, this is beneficial because that way the opaque types don't end up in the cache key, allowing more cache reuse.This seems to have a small (0.5%) slowdown on most programs, but a dramatic (>60%) speedup in specific cases like the rustc-perf
wg-grammarbenchmark. Some more improvements are expected with "eager normalization", which is work that's under way right now.