Make IDs 32bits by vinistock · Pull Request #542 · Shopify/rubydex

vinistock · 2026-01-29T21:15:56Z

Step towards #141

When thinking about using u32 IDs, I was getting really caught up on the fact that Rust's HashMap implementation still requires storing a u64 for the key digests. Of course, that consumes more memory, and ideally we would have a custom HashMap implementation that uses half the number of bits.

However, we actually use IDs quite extensively outside of just HashMap keys. For example, many definitions store StringId, names store multiple NameIds and ancestors is a list of DeclarationId. We can actually have very substantial memory savings right now even without the 32 bit HashMap.
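To make the point above concrete, here is a minimal sketch of how ID width affects the size of records and ID lists. The structs and field names are hypothetical stand-ins, not the real rubydex types; only the relative sizes matter.

```rust
use std::mem::size_of;

// Hypothetical record storing three IDs inline as 64-bit values.
// The field names are illustrative and not taken from rubydex.
#[allow(dead_code)]
struct WideRecord {
    name_id: u64,
    owner_id: u64,
    uri_id: u64,
}

// The same hypothetical record with 32-bit IDs.
#[allow(dead_code)]
struct NarrowRecord {
    name_id: u32,
    owner_id: u32,
    uri_id: u32,
}

/// Bytes needed for the elements of a list holding `count` IDs of type `T`.
fn id_list_bytes<T>(count: usize) -> usize {
    count * size_of::<T>()
}

fn main() {
    println!("wide record:   {} bytes", size_of::<WideRecord>());
    println!("narrow record: {} bytes", size_of::<NarrowRecord>());
    println!("1000 IDs as u64: {} bytes", id_list_bytes::<u64>(1000));
    println!("1000 IDs as u32: {} bytes", id_list_bytes::<u32>(1000));
}
```

Every field and list element that holds an ID shrinks by half, independently of how the HashMap stores its keys.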

Implementation

This PR reduces the size of our IDs to u32 from i64. I recommend reviewing per commit:

Reduce the ID size to u32
Allow DefinitionId and ReferenceId to be tagged with their kind. This helps us reduce the chances of collision by encoding the kind in the lower bits of the ID. Essentially, we have 28 bits for the digest + 4 for the kind. Definitions and references are the things we have the most of in Core, and the only ones for which I got conflicts when running in release mode, so I think this is enough for the time being.

As an added benefit, this allows us to check the kind of definitions and references without having to retrieve them from the graph. That information is encoded directly in the ID, so we can just invoke kind. Note that the kind addition conflicts with Provide resolution diagnostics (inline) #502; we can let that PR go first.
Adjust the C side to use u32
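The 28 + 4 bit layout described in the second commit can be sketched roughly as follows. This is an illustration under assumptions, not the code from this PR: `TaggedId`, `Kind` and its variants are hypothetical names, and the sketch assumes the digest is simply shifted left so that its top 4 bits are dropped.

```rust
/// Hypothetical kinds; the real enums live in definitions.rs and references.rs.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Class = 0,
    Module = 1,
    Method = 2,
}

const KIND_BITS: u32 = 4;
const KIND_MASK: u32 = (1 << KIND_BITS) - 1; // 0b1111

/// Hypothetical 32-bit ID: 28 bits of digest, 4 bits of kind in the low bits.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct TaggedId(u32);

impl TaggedId {
    /// Shifting left by 4 discards the top 4 bits of the digest and frees
    /// the low 4 bits for the kind tag.
    fn new(digest: u32, kind: Kind) -> Self {
        TaggedId((digest << KIND_BITS) | (kind as u32 & KIND_MASK))
    }

    /// Read the kind straight from the ID, without any graph lookup.
    fn kind(self) -> Kind {
        match (self.0 & KIND_MASK) as u8 {
            0 => Kind::Class,
            1 => Kind::Module,
            2 => Kind::Method,
            other => unreachable!("unknown kind tag {}", other),
        }
    }

    /// The 28 digest bits that survived the shift.
    fn digest(self) -> u32 {
        self.0 >> KIND_BITS
    }
}

fn main() {
    let class_id = TaggedId::new(0xDEAD_BEEF, Kind::Class);
    let method_id = TaggedId::new(0xDEAD_BEEF, Kind::Method);

    // Same digest, different kinds: the two IDs no longer collide.
    assert_ne!(class_id, method_id);
    assert_eq!(class_id.kind(), Kind::Class);
    assert_eq!(method_id.kind(), Kind::Method);
    assert_eq!(class_id.digest(), 0x0EAD_BEEF);
}
```

In this sketch two entities whose digests collide still get distinct IDs as long as their kinds differ, which is the collision reduction the commit description refers to.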

Impact

Despite still using 64-bit HashMaps, this PR reduces our memory usage from ~3810 MB to ~2880 MB (roughly a 25% reduction).

I believe we can further reduce this with a custom HashMap implementation that stores u32 internally.

Why now?

I believe this is the right time to do this because we're starting adoption of Rubydex in our tools. It will be easier to verify now that we can indeed get away with u32 and lower memory usage, and then increase to u64 if necessary, than to go the other way around.

Also, we're almost getting to 4GB and a reduction is definitely welcome.

vinistock · 2026-01-29T21:16:16Z

Make IDs 32bits #542 (this PR)
Reduce ref count sizes #541
main

This stack of pull requests is managed by Graphite.

Morriar · 2026-01-30T15:01:46Z

rust/rubydex/src/model/ids.rs

 #[derive(PartialEq, Eq, Debug, Clone, Copy)]
 pub struct DeclarationMarker;
 /// `DeclarationId` represents the ID of a fully qualified name. For example, `Foo::Bar` or `Foo#my_method`
 pub type DeclarationId = Id<DeclarationMarker>;


Should we also tag declarations?

vinistock · 2026-01-30

I haven't figured out a good way to do it. Consider our Ruby API:

graph["Foo"]

The Ruby API has no way of knowing what Foo is before looking it up, so we wouldn't be able to construct the right DeclarationId to search.

For now, I think it's okay because we have a lot fewer declarations than definitions.
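The lookup constraint described above can be sketched like this. It assumes, purely for illustration, that a declaration ID is a truncated hash of the fully qualified name; `declaration_id`, `Graph` and `Declaration` are hypothetical names and not the rubydex API.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical: derive an untagged 32-bit ID from the fully qualified name only.
fn declaration_id(fully_qualified_name: &str) -> u32 {
    let mut hasher = DefaultHasher::new();
    fully_qualified_name.hash(&mut hasher);
    hasher.finish() as u32 // truncate the 64-bit digest to 32 bits
}

#[derive(Debug, PartialEq)]
enum Declaration {
    Class,
    Module,
}

struct Graph {
    declarations: HashMap<u32, Declaration>,
}

impl Graph {
    /// Mirrors `graph["Foo"]`: the caller supplies only a name, so the key
    /// must be computable without knowing what kind of declaration it is.
    fn get(&self, name: &str) -> Option<&Declaration> {
        self.declarations.get(&declaration_id(name))
    }
}

fn main() {
    let mut graph = Graph { declarations: HashMap::new() };
    graph.declarations.insert(declaration_id("Foo"), Declaration::Class);
    graph.declarations.insert(declaration_id("Foo::Bar"), Declaration::Module);

    assert_eq!(graph.get("Foo"), Some(&Declaration::Class));
    assert_eq!(graph.get("Foo::Bar"), Some(&Declaration::Module));
    assert_eq!(graph.get("Missing"), None);
}
```

If the kind were part of the key, `get` would have to guess it or probe once per possible kind, which is the difficulty raised in the reply.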

Morriar · 2026-01-30T15:02:50Z

rust/rubydex/src/model/definitions.rs

    }
 }

+#[repr(u8)]


You should be able to fit this in 4 bits?

vinistock · 2026-01-30

Is there a way to do this? u8 is the smallest integer storage we can use, no?
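One possible reading of this exchange, sketched under assumptions: `u8` is the smallest primitive representation an enum can be given, so the type itself stays one byte wide, but the discriminants can be kept within 4 bits and checked at compile time. `ReferenceKind` and its variants below are hypothetical.

```rust
use std::mem::size_of;

/// Hypothetical kind enum; the variant names are illustrative only.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[allow(dead_code)]
enum ReferenceKind {
    Constant = 0,
    Method = 1,
}

impl ReferenceKind {
    /// Highest discriminant currently in use; must be updated when adding variants.
    const MAX: u8 = ReferenceKind::Method as u8;
}

// Compile-time check: every discriminant fits in 4 bits (values 0 to 15).
const _: () = assert!(ReferenceKind::MAX < 16);

fn main() {
    // The enum itself still occupies a whole byte in memory.
    assert_eq!(size_of::<ReferenceKind>(), 1);

    // Only the low 4 bits carry information once the kind is packed into an ID
    // whose low 4 bits were left free.
    let packed: u32 = 0x0ABC_DEF0 | (ReferenceKind::Method as u32);
    assert_eq!(packed & 0xF, 1);
}
```

The 4-bit limit then applies to the packed ID rather than to the enum's own storage.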

Morriar · 2026-01-30T15:03:15Z

rust/rubydex/src/model/references.rs

    offset::Offset,
 };

+#[repr(u8)]


4 bits should be enough?

rust/rubydex/src/resolution.rs

vinistock mentioned this pull request Jan 29, 2026

Reduce ref count sizes #541

Merged

vinistock self-assigned this Jan 29, 2026

vinistock marked this pull request as ready for review January 29, 2026 21:33

vinistock requested a review from a team as a code owner January 29, 2026 21:33

Morriar approved these changes Jan 30, 2026

Base automatically changed from 01-28-reduce_ref_count_sizes to main January 30, 2026 16:05

vinistock added 3 commits January 30, 2026 11:08

Make IDs 32bits

8293574

Allow definition and reference IDs to be tagged

24aca6a

Adjust C side for u32 IDs

f486994

vinistock force-pushed the 01-28-id_exploration branch from 9d72fcd to f486994 Compare January 30, 2026 18:14

vinistock merged commit 6888476 into main Jan 30, 2026
27 checks passed

vinistock deleted the 01-28-id_exploration branch January 30, 2026 21:01

thomasmarshall mentioned this pull request Feb 4, 2026

Add mechanism for diffing graphs #557

Open
