rustdoc JSON: Add implied bounds to generic types, impl Trait, and assoc types.#148379
rustdoc JSON: Add implied bounds to generic types, impl Trait, and assoc types.#148379obi1kenobi wants to merge 1 commit intorust-lang:mainfrom
Conversation
|
|
|
I'm still not particularly familiar with rustdoc or rustc internals, and I definitely don't have very good intuition for how things work yet. This PR is approximately 80h of me doing my best to figure things out on my own. I don't expect I got everything right—there are probably things that could be improved. But I did write lots of tests with all the edge cases I could think of, and I tried hard not to write anything egregiously wrong :) Feedback is very welcome, as is advice on resolving the remaining TODOs. In particular, let me know if you have a preference between adding |
This comment has been minimized.
This comment has been minimized.
|
i’ll take a look at the code later, but for now: @bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Add implied bounds to generic types, impl Trait, and assoc types.
This comment has been minimized.
This comment has been minimized.
|
Finished benchmarking commit (972828a): comparison URL. Overall result: ❌✅ regressions and improvements - please read the text belowBenchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary 2.0%, secondary 1.6%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary 2.6%, secondary -4.2%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (primary 1.1%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 474.434s -> 474.73s (0.06%) |
There was a problem hiding this comment.
I've only looked at the rustdoc-json side, but it seems fine. I'm not comfortable reviewing the ty/clean side of this, so I'll leave that to fmease.
I've also not yet looked an the tests in detail. But they seem quite comprehensive, thanks.
Could you move all the implied-bounds-*.rs tests into a new implied-bounds/ folder?
src/librustdoc/clean/inline.rs
Outdated
| .map(|item| clean_impl_item(item, cx)) | ||
| .collect::<Vec<_>>(), | ||
| clean_generics(impl_.generics, cx), | ||
| clean_generics(impl_.generics, did, cx), |
There was a problem hiding this comment.
N.B. This isn't tested, as rustdoc json doesn't inline. I'm pretty certain it's correct, but worth flagging.
Yes, I was going to ask if you're okay with that actually. Happy to do it prior to merging; will leave as-is for now for continuity of review in case fmease has already started reviewing. |
|
☔ The latest upstream changes (presumably #139558) made this pull request unmergeable. Please resolve the merge conflicts. |
b35832e to
43cc2c3
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
I'm back from my trip and ready to pick this back up, pending reviewer availability ofc 👍 |
This comment has been minimized.
This comment has been minimized.
95ae8cd to
36749ca
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
I try to get to this before the end of the week & I'll try my hardest to get this PR to ship sooner than later 🤞 Don't hesitate to ping me. |
|
Thank you, I appreciate it! I'll try to keep it free of merge conflicts in the meantime. |
36749ca to
825939e
Compare
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
|
Gentlest of nudges @fmease if you might have time this week 🙏 Totally understand if you're swamped with other higher-priority things, of course. I'll be AFK all of next week, so if any changes are needed after Saturday, this PR will end up slipping into April. In an ideal world, I'll have some cycles in the first week of April and I'd like to prep a follow-up PR that adds implied "outlives" bounds to lifetime parameters — as a reminder, this PR only adds implied bounds to type parameters, so we have a functionality gap for lifetimes here for reasons of making review easier by limiting scope. |
|
@bors try @rust-timer queue profiles=DocJson |
|
Error occurred while parsing comment: Cannot parse profiles: Invalid profile: DocJson. Valid values are: check, debug, opt, doc, doc-json, clippy |
This comment has been minimized.
This comment has been minimized.
rustdoc JSON: Add implied bounds to generic types, impl Trait, and assoc types.
|
@rust-timer queue profiles=doc-json |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
Finished benchmarking commit (c2c1520): comparison URL. Overall result: ❌ regressions - please read the text belowBenchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary 1.3%, secondary 0.9%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary 10.6%, secondary 1.5%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 483.836s -> 485.532s (0.35%) |
|
The regression is way too big, a lot of optimization needs to be done before it's merged. |
| let auto_traits = renderer | ||
| .tcx | ||
| .visible_traits() | ||
| .filter(|&trait_def_id| renderer.tcx.trait_is_auto(trait_def_id)) | ||
| .collect(); |
There was a problem hiding this comment.
[perf?] this gets executed for every single type param, assoc type & opaque type in the local crate.
Why is this even necessary? AFAICT this is only ever accessed when synthesizing auto trait impls … which clean should've already done when we reach this function? It doesn't seem necessary for cleaning the things generated in this module.
There was a problem hiding this comment.
clean demands a DocContext<'tcx>, so I tried to faithfully produce one. I don't have intuition for which fields are used vs not used in which part of clean, and leaving this field set to Default::default() felt like it might plant a difficult-to-find bug for later.
If you're confident we can set it to Default::default() I'm happy to do that.
| let param_env = tcx.param_env(parent); | ||
| let parent_ty = tcx.type_of(parent).instantiate_identity(); | ||
| let infcx = tcx.infer_ctxt().build(TypingMode::non_body_analysis()); | ||
| let env = OutlivesEnvironment::new(&infcx, local_parent, param_env, [parent_ty]); |
There was a problem hiding this comment.
[perf?] This computation looks quite costly (param_env query call overhead, OutlivesEnvironment::new computing implied bounds ~from scratch) but it only depends on parent which is potentially shared by several opaque types; might be worth caching might not be.
| if tcx.def_kind(opaque_def_id) != DefKind::OpaqueTy { | ||
| return Vec::new(); | ||
| } |
There was a problem hiding this comment.
[perf?] Is it ever possible for non-opaque types to end up here? Can type aliases / assoc tys show up here? Or is this just a defensive check? If it's the latter, I would just throw it out or turn it into a debug_assert. That's because calling queries (here: def_kind) always has a call overhead that can be detrimental if this fn is hot.
| use crate::json::JsonRenderer; | ||
| use crate::json::conversions::IntoJson; | ||
|
|
||
| pub(crate) fn implied_bounds_for_ty<'tcx>( |
There was a problem hiding this comment.
Most of these cleaning fns are just copy/paste, right? I did already express my concern about this in the past. I'm not asking you to address that in this PR but I'd like to see these getting removed again sooner rather than later because as is I'm afraid to say this renders the rustdoc codebase even harder to maintain.
Not to make it too personal but I did already burn out ~2 years ago trying to rewrite these specific cleaning routines because they're broken in so many ways (cc issue #113015 for example) and I haven't regained the energy yet to tackle it again. I'm very much afraid of having to fix multiple versions again (after I was able to get rid of so much duplication already, cc PR #123340).
If they're essentially a copy/paste of the clean version, it should very much be possible to consolidate them via a trait + static/dynamic dispatch or an enum.
There was a problem hiding this comment.
Yes, there's an uncomfortable compromise here because clean and this code have slightly mismatched needs and expectations. I attempted to reuse as much of clean as possible and had to duplicate some parts.
I called out this issue already in the Open questions for this PR section of the PR description -- copy-pasting here for reading convenience:
DocContextuse outside ofclean. Per T-rustdoc recommendations, this implementation moves all implied bounds computation into the JSON component. In turn that means the JSON-side code needs to do some cleaning of the clauses that represent those implied bounds. To avoid duplication, the current implementation reuses thecleancode — but that requires aDocContextwhich is not otherwise available in the JSON component.
- Currently, we construct a
DocContextwhen needed — which may be often, and may not be a trivial cost. I'm not sure!- Instead, we could duplicate the code we need from
cleaninto the JSON component, adjusting it to not needDocContext. This isn't great either.- Instead, we could put a
DocContextinsideJsonRenderersomehow. This isn't great either, becauseJsonRendereris passed around by immutable reference, whilecleanuses&mut DocContext. We may need eitherRefCellorMutexor lots of plumbing to replace&JsonRendererwith&mut JsonRenderer.- Possibly other options? I'd love reviewer feedback on this!
I too found this code very difficult to work with, so I understand why you are also hesitant about it. The tradeoffs here suck and I genuinely don't see how to reuse more of clean in a sensible way. If you have ideas, I'm happy to look into them.
| let bound_trait_def_id = trait_clause.def_id(); | ||
| if !is_allowed_lang_item_trait(clean_cx.tcx, bound_trait_def_id) { | ||
| continue; | ||
| } |
There was a problem hiding this comment.
[perf?] It should be possible to move this check right where trait_clause is defined? That would avoid calling query opt_item_ident, the queries called by AliasTy::trait_ref and queries like explicit_supertraits_containing_assoc_item as called by projection_applies_to_trait_ref (all of them which have call overhead). Ofc this fn / part of the fn might not be hot, so it might not be worth it.
| None => true, | ||
|
|
||
| // These are all the language item traits that are stable in Rust today. | ||
| Some( |
There was a problem hiding this comment.
I've already expressed my concern somewhere about this list but I've also acknowledged that not filtering out certain traits would mean having to unleash the expensive implied bounds calculation to … "everything" … except that we're already doing that regardless because is_allowed_lang_item_trait is called on the computed implied bounds or did I misread? Well, right I guess, we're avoid the cleaning and the huge increase in JSON artifact size, hmm, now I remember.
Re. the explicit list, I don't know what the future plans are esp. wrt. to maintenance: Are you expecting people stabilizing new lang items to know about this list and extend it over time? Clearly that's too much of an ask. Moreover as I already wrote somewhere sometime ago consumers don't expect there to be a filter.
I know what I'm about to say isn't really actionable but I'd like to get it off my chest: In my opinion, filtering out most bounds is a band-aid solution that doesn't have a "robust and principled future" and it's an indicator that eagerly executing "queries" (here: implied bounds) and baking their result into the JSON artifact is not sustainable or scalable to more "queries" (wrt. to size, time & maintenance); I don't think it's realistic or reasonable for future r-JSON to have an arbitrary set of queries (implied bounds, layout info, …) that gets executed unconditionally for every r-JSON consumer (cc #143197 (comment)) if they want to or not.
(At this point I'm wondering if we should allow inputs to r-JSON like CLI flags with which you can call certain "queries" like a web API or RPC)
There was a problem hiding this comment.
Right, as I mentioned in Open questions for this PR:
- List of stable lang-item traits. I wasn't able to find a way to get a list of stable lang-item traits that we should consider for implied bounds analysis. We currently have a hard-coded list of them, which may be brittle as new lang-item traits are stabilized. If there's a better way to do this, I'm all ears.
A "inputs-based" way to query r-JSON would be super useful and cargo-semver-checks would be in excellent position to make use of it.
|
@fmease thanks for the review! Since you raised several "should we even do this" style comments, I'll wait for you and @GuillaumeGomez to consider my replies before diving into attempting to optimize this. If you don't see a path forward for this PR, or are no longer convinced this is the right approach, please just tell me quickly :) The worst possible case IMHO is that we all spend a lot of time iterating on this PR over the course of months, and then in the end still decide not to merge it. The API tension between |
View all comments
Include implied and elaborated (henceforth, implied) bounds in the rustdoc JSON representations of associated types,
impl Traittypes, and generic type parameters. Implemented as a newimplied_bounds: Vec<GenericBound>field, alongside the existingbounds: Vec<GenericBound>field that stores explicit (syntactic) bounds. An attempt at implementing #143197, based on the feedback and discussion in #143559 (comment).Rustdoc JSON distingushes between explicit bounds specified together with the generic parameter definition versus ones given via
whereclauses (which we do not change). The design of this PR considers implied bounds an inherent property of the generic type parameter, and does not distinguish the source of the implied bound between the two sets of explicit bounds. I believe this simplifies the design and implementation without hurting any use existing or realistic future use case.Per T-rustdoc feedback, the implementation is JSON-only and aims to minimize changes to
cleanand elsewhere. This is intended to mitigate possible performance impact on rustdoc as a whole, while providing this data to JSON-based workloads likecargo-semver-checks.Please see below for what this PR does, what is deferred to future PRs, and what open questions remain.
Please also note that half the line count is the test suite, and a bunch more is just "adjust dozens of pattern match sites to use a slightly different shape," so this PR isn't as large as it seems 😅
Recommended review order:
src/rustdoc-json-types/lib.rssrc/librustdoc/clean/types.rssrc/librustdoc/clean/mod.rssrc/librustdoc/json/implied_bounds.rssrc/librustdoc/json/conversions.rsWork deferred to future PRs
T: 'acan be an implied bound,'a: 'bcan be an implied bound too. The former is implemented in this PR. To limit scope, I intend to open a separate PR for the latter after this one is merged.Sizedat its use site. Fixing this seemed like a big lift for a somewhat uncommon case, so I felt it's best tackled later — perfect being the enemy of good enough and all.Open questions for this PR
DocContextuse outside ofclean. Per T-rustdoc recommendations, this implementation moves all implied bounds computation into the JSON component. In turn that means the JSON-side code needs to do some cleaning of the clauses that represent those implied bounds. To avoid duplication, the current implementation reuses thecleancode — but that requires aDocContextwhich is not otherwise available in the JSON component.DocContextwhen needed — which may be often, and may not be a trivial cost. I'm not sure!cleaninto the JSON component, adjusting it to not needDocContext. This isn't great either.DocContextinsideJsonRenderersomehow. This isn't great either, becauseJsonRendereris passed around by immutable reference, whilecleanuses&mut DocContext. We may need eitherRefCellorMutexor lots of plumbing to replace&JsonRendererwith&mut JsonRenderer.Included tests
Fn/FnMut/FnOnceandAsyncFn/AsyncFnMut/AsyncFnOnceimplied bounds preserve full parenthesized signatures.Sizedbounds as a result of Rust language rules:Sizeddefault when?Sizedis not present, such as in generic parameters, associated types, etc.Sizedrequirement for function parameters, return values, and contents of slices and arrays.Sizedrequirement when a possibly-DST struct or tuple (e.g.struct S<T: ?Sized>(T);) is used in a position that requires it beSized, like a function parameter or return value.Sizedif such a possibly-DST type is used behind indirection, like&/&mut/*const/*mut.bounds(explicit, syntactically-included bounds) orimplied_bounds(implied or elaborated bounds), never both.r? fmease
cc @aDotInTheVoid