Computing crate_hash from metadata encoding instead of HIR (implements #94878) #154724

Open
Daniel-B-Smith wants to merge 1 commit into rust-lang:main from Daniel-B-Smith:smithdb3/fix-94878

@Daniel-B-Smith Daniel-B-Smith commented Apr 2, 2026

Contributor

This PR makes the crate_hash/SVH depend on metadata instead of HIR whenever metadata is needed. A trimmed-down crate_hash is kept for the dylib/binary cases where metadata is not present. I believe metadata will be a sounder basis for tracking dependency changes.

Related to #94878

The metadata hash is calculated on the raw bytes before flush because hashing the elements incrementally turned out to be too expensive.
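
To illustrate the "hash the raw bytes before flush" idea, here is a minimal sketch. It is not the code in this PR: `HashingEncoder` and `Fnv1a` are hypothetical stand-ins for rustc's encoder and hasher, and FNV-1a is used only because it is small and processes input byte by byte. The point is that `emit` just copies bytes, and hashing happens once per buffer in `flush`.

```rust
use std::io::{self, Write};

/// FNV-1a over bytes. A stand-in only, NOT the hasher rustc uses.
#[derive(Clone, Copy)]
struct Fnv1a(u64);

impl Fnv1a {
    fn new() -> Self {
        Fnv1a(0xcbf2_9ce4_8422_2325) // 64-bit FNV offset basis
    }

    fn update(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3); // 64-bit FNV prime
        }
    }
}

/// Hypothetical buffered encoder: bytes accumulate in `buf`, and the whole
/// buffer is hashed in one pass right before it is written out.
struct HashingEncoder<W: Write> {
    sink: W,
    buf: Vec<u8>,
    capacity: usize,
    hasher: Fnv1a,
}

impl<W: Write> HashingEncoder<W> {
    fn new(sink: W, capacity: usize) -> Self {
        HashingEncoder {
            sink,
            buf: Vec::with_capacity(capacity),
            capacity,
            hasher: Fnv1a::new(),
        }
    }

    /// Encoding an element only copies bytes; no hashing happens here.
    fn emit(&mut self, bytes: &[u8]) -> io::Result<()> {
        if self.buf.len() + bytes.len() > self.capacity {
            self.flush()?;
        }
        self.buf.extend_from_slice(bytes);
        Ok(())
    }

    /// Hash the raw buffered bytes, then hand them to the sink.
    fn flush(&mut self) -> io::Result<()> {
        self.hasher.update(&self.buf);
        self.sink.write_all(&self.buf)?;
        self.buf.clear();
        Ok(())
    }

    /// Flush what is left and return the sink together with the final hash.
    fn finish(mut self) -> io::Result<(W, u64)> {
        self.flush()?;
        Ok((self.sink, self.hasher.0))
    }
}
```

Because the hash is fed the same byte stream that reaches the sink, the result in this sketch does not depend on how the encoder happened to chunk its writes.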

The change to the HIR hash is potentially safe even without the metadata crate_hash change. Without that change, this PR is a performance regression. I am bundling them because I believe the HIR hash change is more likely to be safe in light of the HIR hash being less load bearing on the SVH.

The dylib/binary crate_hash removes components that I believe are not relevant to those cases. Analysis:

  • resolutions.visibilities_for_hashing:

The comment says: "Hash visibility information since it does not appear in HIR." Visibilities only matter to external users of the crate (i.e. consumers of metadata). Without metadata, visibility differences are not observable, so this contributes nothing for incremental correctness.

  • debugger_visualizers:

The comment says: "that content is exported into crate metadata, so any changes to it need to be reflected in the crate hash." Without metadata, there is nothing to export, so this is purely metadata-motivated. The visualizer file path is already covered by HIR (the attribute) and the content is loaded later only if metadata is being written.

  • source_file_names:

These exist solely so that the crate-hash reflects remapped source paths that get embedded into metadata (the comment explicitly says "If we included the full mapping in the SVH, we could only have reproducible builds…"). The remapping itself is already captured via dep_tracking_hash, so for incremental this is redundant.

The crate hash calculation can be reverted via -Z metadata-crate-hash=no. I manually confirmed that setting that flag generates the same hash as the commit prior to my change. The test for that flag only checks that it changes the value of the hash and does not check the specific value of the hash. I could add a check for the latter, but I'm nervous that a hash golden test would be a maintenance headache.
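
The selection logic described above can be modeled roughly as follows. This is a sketch under stated assumptions, not rustc code: `HashBasis`, `choose_basis`, and `separately_hashed_components` are hypothetical names, and the real hash has other inputs that are not modeled. It also assumes that when the metadata bytes are hashed, the three listed components are not hashed separately.

```rust
/// Hypothetical model of the choice described in this PR description;
/// none of these names come from rustc itself.
#[derive(Debug, PartialEq)]
enum HashBasis {
    /// Behavior before this PR, selected by `-Z metadata-crate-hash=no`.
    LegacyHir,
    /// Metadata is needed: hash the encoded metadata bytes.
    MetadataBytes,
    /// dylib/binary without metadata: HIR hash minus metadata-only components.
    TrimmedHir,
}

fn choose_basis(metadata_crate_hash: bool, needs_metadata: bool) -> HashBasis {
    if !metadata_crate_hash {
        HashBasis::LegacyHir
    } else if needs_metadata {
        HashBasis::MetadataBytes
    } else {
        HashBasis::TrimmedHir
    }
}

/// Which of the three components from the analysis are still hashed
/// separately under each basis (assumption: none outside the legacy path).
fn separately_hashed_components(basis: &HashBasis) -> &'static [&'static str] {
    match basis {
        HashBasis::LegacyHir => &[
            "visibilities_for_hashing",
            "debugger_visualizers",
            "source_file_names",
        ],
        HashBasis::MetadataBytes | HashBasis::TrimmedHir => &[],
    }
}
```

For example, `choose_basis(true, false)` models a binary built with the new default and yields `HashBasis::TrimmedHir`, while passing `false` for the flag always yields the legacy path.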

DO NOT SUBMIT: Incremental compilation testing is ongoing and a way to revert the hash is necessary.

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Apr 2, 2026

@Daniel-B-Smith Daniel-B-Smith force-pushed the smithdb3/fix-94878 branch 5 times, most recently from 24c04d5 to c826667 Compare April 7, 2026 15:59

@Daniel-B-Smith Daniel-B-Smith force-pushed the smithdb3/fix-94878 branch 2 times, most recently from 43a7704 to 338aba3 Compare April 8, 2026 19:34

@Daniel-B-Smith Daniel-B-Smith force-pushed the smithdb3/fix-94878 branch 2 times, most recently from 0ddcd96 to d27cca5 Compare April 8, 2026 21:34

@Daniel-B-Smith Daniel-B-Smith force-pushed the smithdb3/fix-94878 branch 3 times, most recently from 1e5f269 to 4171895 Compare April 9, 2026 14:56

@Daniel-B-Smith Daniel-B-Smith changed the title #94878 Computing crate_hash from metadata encoding instead of HIR (implements #94878) (very draft) Apr 9, 2026
@nnethercote

Contributor

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 9, 2026

@petrochenkov

petrochenkov commented May 26, 2026

Contributor

If this needs an extra pair of eyes, feel free to assign me after cjgillot reviews.
I'm generally interested in this topic and RDR, just don't have time for doing full initial reviews at the moment.

@petrochenkov

Contributor

Is there any ordering dependency between this PR and #155871?
Should this PR be merged first, or it doesn't matter?

@susitsm

susitsm commented May 26, 2026

Contributor

These are independent. The one getting in later will have to do a bit of merge conflict resolution, but nothing too bad.

Comment thread compiler/rustc_metadata/src/rmeta/opaque.rs Outdated
Comment thread compiler/rustc_metadata/src/rmeta/opaque.rs Outdated

@Kobzol

Kobzol commented Jun 5, 2026

Member

@bors try @rust-timer queue

@rust-bors

rust-bors Bot commented Jun 5, 2026

Contributor

☀️ Try build successful (CI)
Build commit: f0dc837 (f0dc837ab0fd17f0782df57d842c84dc7f57a4a0, parent: e7815e522ecc746592fee32f50478f521333b503)

@rust-timer

Collaborator

Finished benchmarking commit (f0dc837): comparison URL.

Overall result: ❌✅ regressions and improvements - please read:

Benchmarking means the PR may be perf-sensitive. It's automatically marked not fit for rolling up. Overriding is possible but disadvised: it risks changing compiler perf.

Next, please: If you can, justify the regressions found in this try perf run in writing along with @rustbot label: +perf-regression-triaged. If not, fix the regressions and do another perf run. Neutral or positive results will clear the label automatically.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.4% | [0.4%, 0.4%] | 1 |
| Regressions ❌ (secondary) | 0.1% | [0.1%, 0.1%] | 5 |
| Improvements ✅ (primary) | -0.4% | [-0.9%, -0.1%] | 92 |
| Improvements ✅ (secondary) | -1.3% | [-5.4%, -0.1%] | 53 |
| All ❌✅ (primary) | -0.4% | [-0.9%, 0.4%] | 93 |

Max RSS (memory usage)

Results (primary 1.9%, secondary 0.3%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 3.1% | [2.7%, 3.6%] | 2 |
| Regressions ❌ (secondary) | 2.2% | [1.4%, 3.0%] | 2 |
| Improvements ✅ (primary) | -0.7% | [-0.7%, -0.7%] | 1 |
| Improvements ✅ (secondary) | -1.6% | [-1.7%, -1.4%] | 2 |
| All ❌✅ (primary) | 1.9% | [-0.7%, 3.6%] | 3 |

Cycles

Results (secondary -2.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 5.6% | [5.6%, 5.6%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -3.4% | [-3.9%, -2.7%] | 6 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

Results (primary 0.0%, secondary 0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.0% | [0.0%, 0.1%] | 8 |
| Regressions ❌ (secondary) | 0.1% | [0.0%, 0.1%] | 35 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.0% | [0.0%, 0.1%] | 8 |

Bootstrap: 516.831s -> 540.55s (4.59%)
Artifact size: 400.72 MiB -> 398.64 MiB (-0.52%)

@Daniel-B-Smith

Contributor Author

The remaining big comment from the review is whether to extend the existing FileEncoder interface to support hashing before flush or to keep the fork. Once that is decided, I can resolve the existing comments, add the disable flag, and create a gist with the testing strategy I've been using with other crates.

@rustbot

rustbot commented Jun 11, 2026

Collaborator

Some changes occurred to the intrinsics. Make sure the CTFE / Miri interpreter
gets adapted for the changes, if necessary.

cc @rust-lang/miri, @RalfJung, @oli-obk, @lcnr

The parser was modified, potentially altering the grammar of (stable) Rust
which would be a breaking change.

cc @fmease

HIR ty lowering was modified

cc @fmease

library/core/src/unicode/unicode_data.rs is generated by the src/tools/unicode-table-generator tool.

If you want to modify unicode_data.rs, please modify the tool then regenerate the library source file via ./x run src/tools/unicode-table-generator instead of editing unicode_data.rs manually.

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

Some changes occurred in src/tools/compiletest

cc @jieyouxu

clippy is developed in its own repository. If possible, consider making this change to rust-lang/rust-clippy instead.

cc @rust-lang/clippy

The reflection data structures are tied exactly to the implementation
in the compiler. Make sure to also adjust rustc_const_eval/src/const_eval/type_info.rs

cc @oli-obk

Some changes occurred in match checking

cc @Nadrieril

@rustbot

rustbot commented Jun 11, 2026

Collaborator

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

@Daniel-B-Smith

Contributor Author

> Some changes occurred to the intrinsics. Make sure the CTFE / Miri interpreter gets adapted for the changes, if necessary.
>
> cc @rust-lang/miri, @RalfJung, @oli-obk, @lcnr
>
> The parser was modified, potentially altering the grammar of (stable) Rust which would be a breaking change.
>
> cc @fmease
>
> HIR ty lowering was modified
>
> cc @fmease
>
> library/core/src/unicode/unicode_data.rs is generated by the src/tools/unicode-table-generator tool.
>
> If you want to modify unicode_data.rs, please modify the tool then regenerate the library source file via ./x run src/tools/unicode-table-generator instead of editing unicode_data.rs manually.
>
> Some changes occurred to MIR optimizations
>
> cc @rust-lang/wg-mir-opt
>
> Some changes occurred in src/tools/compiletest
>
> cc @jieyouxu
>
> clippy is developed in its own repository. If possible, consider making this change to rust-lang/rust-clippy instead.
>
> cc @rust-lang/clippy
>
> The reflection data structures are tied exactly to the implementation in the compiler. Make sure to also adjust rustc_const_eval/src/const_eval/type_info.rs
>
> cc @oli-obk
>
> Some changes occurred in match checking
>
> cc @Nadrieril

Apologies for the noise. There was a rebase snafu, so I accidentally pushed a version that included a bunch of changes from upstream. None of those files are in this PR.

@Daniel-B-Smith

Contributor Author

@cjgillot this should be ready for another round of review. I want to spend a little more time doing iterative incremental compilation on a couple of crates, but the revert flag should make the risk relatively low (it might need more than 12 weeks on nightly before being merged into stable).

Labels

  • A-CI (Area: Our Github Actions CI)
  • A-compiletest (Area: The compiletest test runner)
  • A-LLVM (Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.)
  • A-meta (Area: Issues & PRs about the rust-lang/rust repository itself)
  • A-query-system (Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html))
  • A-run-make (Area: port run-make Makefiles to rmake.rs)
  • A-testsuite (Area: The testsuite used to check the correctness of rustc)
  • A-tidy (Area: The tidy tool)
  • O-windows (Operating system: Windows)
  • perf-regression (Performance regression.)
  • S-waiting-on-author (Status: This is awaiting some action (such as code changes or more information) from the author.)
  • S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.)
  • T-bootstrap (Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap))
  • T-clippy (Relevant to the Clippy team.)
  • T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
  • T-infra (Relevant to the infrastructure team, which will review and decide on the PR/issue.)
  • T-rustdoc-frontend (Relevant to the rustdoc-frontend team, which will review and decide on the web UI/UX output.)
  • WG-trait-system-refactor (The Rustc Trait System Refactor Initiative (-Znext-solver))
