Skip to content

Provide resolution diagnostics (inline)#502

Open
Morriar wants to merge 9 commits intomainfrom
at-resolver-diagnostics-inline
Open

Provide resolution diagnostics (inline)#502
Morriar wants to merge 9 commits intomainfrom
at-resolver-diagnostics-inline

Conversation

@Morriar
Copy link
Contributor

@Morriar Morriar commented Jan 21, 2026

Same than #474 but resolution diagnostics are pushed during the resolution rather than after.

Inline validation appears to be slightly slower, though I think we're in the noise at this point.

# Validate after

Timing breakdown
  Listing             0.848s (  2.9%)
  Indexing            4.412s ( 15.1%)
  Resolution         23.282s ( 79.9%)
  Querying            0.585s (  2.0%)
  Cleanup             0.000s (  0.0%)
  Total:             29.128s

# Validate inline

Timing breakdown
  Listing             0.784s (  2.6%)
  Indexing            4.809s ( 16.0%)
  Resolution         23.975s ( 79.7%)
  Querying            0.501s (  1.7%)
  Cleanup             0.000s (  0.0%)
  Total:             30.070s

^ @vinistock

@Morriar Morriar requested a review from a team as a code owner January 21, 2026 19:50
@Morriar Morriar self-assigned this Jan 22, 2026
}

if context.cyclic {
for mixin in mixins {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here instead of looping through the mixins, I think we can add the circular dependency diagnostic directly in linearize_ancestors when we conclude that it's a cyclic dependency here:

if !context.seen_ids.insert(declaration_id) {

That way, we automatically add diagnostics for cyclic parent classes too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can do it here without some involved refactoring as we're missing the information we need to attach the diagnostic to the proper place. At this stage, we don't know if the conflict comes from a parent class or a mixin.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, indeed. I think we can change the seen from a HashSet to a HashMap, where the keys are still the declaration IDs and the values are the information you need to add the diagnostic at the right place.

That way we can avoid looping through the mixins again.

I think that will also solve the parent class cyclic check and simplify the code. We can do that in a separate PR though.


let result = self.linearize_ancestors(picked_parent, context);

if matches!(result, Ancestors::Cyclic(_))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can go away if we move the diagnostic to that spot in linearize_ancestors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reason than #502 (comment).

@Morriar Morriar force-pushed the at-resolver-diagnostics-inline branch 2 times, most recently from 529e109 to 7b12c21 Compare January 23, 2026 19:37
@Morriar
Copy link
Contributor Author

Morriar commented Jan 23, 2026

Just to compare I removed the cyclic checks and ran the numbers again:

# Diagnostics after resolution

Timing breakdown
  Listing             0.923s (  3.1%)
  Indexing            5.204s ( 17.7%)
  Resolution         22.691s ( 77.1%)
  Querying            0.607s (  2.1%)
  Cleanup             0.000s (  0.0%)
  Total:             29.424s

# Diagnostics inline (without cyclic checks)

Timing breakdown
  Listing             0.883s (  2.9%)
  Indexing            6.046s ( 19.9%)
  Resolution         22.966s ( 75.5%)
  Querying            0.512s (  1.7%)
  Cleanup             0.000s (  0.0%)
  Total:             30.407s

# Diagnostics inline (with cyclic checks)

Timing breakdown
  Listing             0.997s (  3.4%)
  Indexing            4.382s ( 14.8%)
  Resolution         23.656s ( 80.0%)
  Querying            0.531s (  1.8%)
  Cleanup             0.000s (  0.0%)
  Total:             29.565s

So even with moving the cyclic checks inline we're not faster than #474.

}
Outcome::Unresolved(None) => {
// We couldn't resolve this name. Emit a diagnostic
unit_queue.push_back(unit_id);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for information, I commented this on Thomas' PR too: there's a bit of a mistake here. The original reason I separated Retry from Unresolved is to avoid retrying constants that we know for sure are undefined (i.e.: we had all of its dependencies, but still failed to resolve it).

I think we need a list separate from the unit queue for unresolved constants so that we can avoid retrying these.

At the end of the loop, unresolved constants will be the concatenation of the unresolved constants (things we were sure we couldn't resolve) + whatever is left in the queue (stuff we retried, but still failed to resolve).

We can address this in a separate PR, but let's please leave a TODO, so that we don't forget. This will increase the number of retries.

.map(|mixins| mixins.into_iter().map(|mixin| (mixin, *definition_id)))
})
.flatten()
.collect::<Vec<(Mixin, DefinitionId)>>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now producing an extra intermediate vector, whereas the code before was extending from an iterator.

Are we able to revert to the original?

.map(|mixins| mixins.into_iter().map(|mixin| (mixin, *definition_id)))
})
.flatten()
.collect::<Vec<(Mixin, DefinitionId)>>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thing here.

}

if context.cyclic {
for mixin in mixins {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, indeed. I think we can change the seen from a HashSet to a HashMap, where the keys are still the declaration IDs and the values are the information you need to add the diagnostic at the right place.

That way we can avoid looping through the mixins again.

I think that will also solve the parent class cyclic check and simplify the code. We can do that in a separate PR though.

@vinistock vinistock mentioned this pull request Jan 29, 2026
vinistock added a commit that referenced this pull request Jan 30, 2026
Step towards #141

When thinking about using u32 IDs, I was getting really caught up by the
fact that Rust's `HashMap` implementation still requires storing a u64
for the key digests. Of course that consumes more memory and ideally we
can have a custom `HashMap` implementation that uses half the amount of
bits.

However, we actually use IDs quite extensively outside of just `HashMap`
keys. For example, many definitions store StringId, names store multiple
NameIds and ancestors is a list of DeclarationId. We can actually have
very substantial memory savings right now even without the 32 bit
`HashMap`.

### Implementation

This PR reduces the size of our IDs to u32 from i64. I recommend
reviewing per commit:

1. Reduce the ID size to u32
2. Allow `DefinitionId` and `ReferenceId` to be tagged with their kind.
This helps us reduce the changes of collision by encoding the kind in
the lower bits of the ID. Essentially, we have 28 bits for the digest +
4 for the kind. Definitions and references are the things we have the
most in Core and the only ones I got conflicts for running the release
mode, so I think this is enough for the time being.

As an added benefit, this allows to check the kind of definitions and
references without having to retrieve them from the graph. That
information is directly encoded in the ID, we can just invoke `kind`.
Note that the `kind` addition conflicts with #502, we can let that PR go
first.
3. Adjust the C side to use u32

### Impact

Despite still using 64 bit `HashMaps`, this PR still reduces our memory
used from `~3810 MB` to
`~2880 MB` (25% reduction).

I believe we can further reduce this with a custom `HashMap`
implementation that stores u32 internally.

### Why now?

I believe this is the right time to do this for the following reason:
we're starting adoption of Rubydex in our tools. It will be easier for
us to verify that we can indeed get away with u32 and lower memory usage
_and then_ increase to u64 if necessary than the other way around.

Also, we're almost getting to 4GB and a reduction is definitely welcome.
@Morriar Morriar force-pushed the at-resolver-diagnostics-inline branch from 7b12c21 to 47c9266 Compare March 5, 2026 16:51
Signed-off-by: Alexandre Terrasa <alexandre.terrasa@shopify.com>
@Morriar Morriar force-pushed the at-resolver-diagnostics-inline branch 5 times, most recently from 14a82d7 to e44d719 Compare March 5, 2026 21:59
Morriar added 8 commits March 5, 2026 17:13
Signed-off-by: Alexandre Terrasa <alexandre.terrasa@shopify.com>
Signed-off-by: Alexandre Terrasa <alexandre.terrasa@shopify.com>
Signed-off-by: Alexandre Terrasa <alexandre.terrasa@shopify.com>
Signed-off-by: Alexandre Terrasa <alexandre.terrasa@shopify.com>
Signed-off-by: Alexandre Terrasa <alexandre.terrasa@shopify.com>
Signed-off-by: Alexandre Terrasa <alexandre.terrasa@shopify.com>
Signed-off-by: Alexandre Terrasa <alexandre.terrasa@shopify.com>
Signed-off-by: Alexandre Terrasa <alexandre.terrasa@shopify.com>
@Morriar Morriar force-pushed the at-resolver-diagnostics-inline branch from e44d719 to 314f593 Compare March 5, 2026 22:14
@Morriar Morriar requested a review from vinistock March 5, 2026 22:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants