Add memory64 and table64 support to the Canonical ABI by adambratschikaye · Pull Request #624 · WebAssembly/component-model

adambratschikaye · 2026-03-17T14:47:08Z

Parameterize the Canonical ABI to handle 32-bit or 64-bit memory addresses and table indices. This is done by adding two new fields to `LiftOptions` to indicate if the `memory64`/`table64` feature is being used in a core module.

alexcrichton

Thanks so much for helping to pick this up! We chatted a bit on Zulip, but for posterity I'll mention here that most table-related index types want to remain 32-bit. I'll take a closer look once that's been addressed here; in the meantime I skimmed over things and noted a few items. I'm sure @lukewagner will have thoughts on this too!

alexcrichton · 2026-03-17T16:06:12Z

design/mvp/Explainer.md

-      (param $originalSize i32)
+(func (param $originalPtr $addr)
+      (param $originalSize $addr)
      (param $alignment i32)


For this I might recommend making `alignment` have type `$addr` as well for consistency, and I think that matches the signature in Rust/C/etc. as well.

alexcrichton · 2026-03-17T16:07:46Z

design/mvp/Explainer.md

 | -------------------------- | ------------------------ |
 | Approximate WIT signature  | `func<T>(t: T) -> T.rep` |
-| Canonical ABI signature    | `[t:i32] -> [i32]`       |
+| Canonical ABI signature    | `[t:$idx] -> [i32]`      |


This'll be a bit subtle: the argument here should stay `i32` (I talked with you a bit about this already; the input is a host-side index, so it's always 32-bit), but the result should be variable. Resource "rep"s are generally pointers into linear memory, so 64-bit components are going to want 64-bit storage values. In the component model, resources are defined with `(rep i32)`, and validation currently requires `i32` to be the only type listed there. This PR will relax that, meaning that the resource type itself stores the lowered type, which then gets plumbed through here.

Thanks, I've now changed it so that the types which are expected to be pointers to memory are allowed to be either `i32` or `i64` without forcing them to match the memory type (e.g. the return value here, or in `context.{get,set}`, or `thread.new-indirect`). This means we don't need to require the `memory` canonopt anywhere it didn't previously exist.

alexcrichton

Looks good to me! I think this'll need minor adjustments over time but overall looks good 👍

alexcrichton · 2026-03-20T06:31:12Z

design/mvp/canonical-abi/definitions.py

+  if opts.memory is None:
+    return 'i32'


Could this have an assert? In theory this should never be called dynamically if `memory` is `None`.
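A minimal sketch of what the suggested assertion could look like. The `Memory` and `Opts` classes here are hypothetical stand-ins, not the PR's actual definitions; only the idea of failing loudly instead of defaulting to `'i32'` comes from the comment above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Memory:
    # Hypothetical stand-in: 'i32' for a 32-bit memory, 'i64' for memory64.
    addrtype: str


@dataclass
class Opts:
    # Hypothetical options holder; `memory` is None when no memory canonopt is given.
    memory: Optional[Memory] = None


def ptr_type(opts: Opts) -> str:
    # Fail loudly rather than silently defaulting to 'i32' when no memory is set.
    assert opts.memory is not None, "ptr_type() requires a memory canonopt"
    assert opts.memory.addrtype in ('i32', 'i64')
    return opts.memory.addrtype


def ptr_size(opts: Opts) -> int:
    # Pointer width in bytes, derived from the memory's address type.
    return 4 if ptr_type(opts) == 'i32' else 8
```

With this shape, a caller that reaches `ptr_type()` without a memory hits an `AssertionError` immediately, which makes the "should never happen dynamically" expectation checkable in tests.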

alexcrichton · 2026-03-20T06:34:21Z

design/mvp/CanonicalABI.md

+* `memory` - this is a subtype of `(memory 1)`. In the rest of the explainer,
+  `PTR` will refer to either `i32` or `i64` core Wasm types as determined by the
+  type of this `memory`.


Technically this will want to say it's a subtype of `(memory 1)` or `(memory i64 1)`, since those are separate types.

alexcrichton · 2026-03-20T06:34:55Z

design/mvp/CanonicalABI.md

  feature (the "stackful" ABI), this restriction is lifted.
-* 🔀 `callback` - the function has type `(func (param i32 i32 i32) (result i32))`
-  and cannot be present without `async` and is only allowed with
+* 🔀 `callback` - the function has type `(func (param i32 i32 PTR) (result i32))`


I think this one may remain three `i32`s, since they're all async-related codes/events/etc.

lukewagner

This looks great overall; thanks so much for all the work and adding tests too! There's only one relatively minor, but non-nit, suggestion below that I'm happy to discuss and then re-review based on what we decide.

adambratschikaye · 2026-03-26T15:40:44Z

Thanks for taking a look @lukewagner! I think I've addressed everything so far.

lukewagner

Thanks for addressing all the feedback! This is looking great; just a few more suggestions based on the last round of changes:

lukewagner · 2026-03-26T17:47:38Z

design/mvp/Explainer.md

-validation requires this option to be present (there is no default).
+validation requires this option to be present (there is no default). The types
+of lowered functions may also depend on the [`core:memory-type`] of this memory,
+specifically it's [`core:address-type`] (indicated by `memory.addrtype`), if pointers


Suggested change

-specifically it's [`core:address-type`] (indicated by `memory.addrtype`), if pointers
+specifically its [`core:address-type`] (indicated by `memory.addrtype`), if pointers

lukewagner · 2026-03-26T17:52:03Z

design/mvp/canonical-abi/definitions.py

+  def ptr_size(self):
+    return ptr_size(self.ptr_type())
+
+  def equal(lhs, rhs):


Is this equal() method used anywhere?

lukewagner · 2026-03-26T17:56:35Z

design/mvp/Explainer.md

+* If `context.get i32 i` is called after `context.set i64 i v`,
+  only the low 32-bits are read (returning `i32.wrap_i64 v`).
+* If `context.get i64 i` is called after `context.set i32 i v`,
+  only the upper 32-bits will be zeroed (returning `i64.extend_i32_u v`).


Suggested change

-  only the upper 32-bits will be zeroed (returning `i64.extend_i32_u v`).
+  the upper 32-bits will be zero (returning `i64.extend_i32_u v`).

lukewagner · 2026-03-26T18:07:25Z

design/mvp/canonical-abi/definitions.py


 def canon_context_set(t, i, thread, v):
-  assert(t == 'i32')
+  assert(t == 'i32' or t == 'i64')


Perhaps add an `assert(v <= MASK_32BIT or t == 'i64')`?
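The mixed-width `context.get`/`context.set` behavior and the suggested range assertion can be sketched together as follows. This is only an illustration: the `ContextSlots` class and its unsigned 64-bit slot storage are assumptions made for the example, not the code in `definitions.py`.

```python
MASK_32BIT = (1 << 32) - 1


def i32_wrap_i64(v: int) -> int:
    # Keep only the low 32 bits of a 64-bit value.
    return v & MASK_32BIT


def i64_extend_i32_u(v: int) -> int:
    # Zero-extension leaves the value unchanged; the upper 32 bits are zero.
    assert 0 <= v <= MASK_32BIT
    return v


class ContextSlots:
    # Hypothetical storage: each slot holds one unsigned 64-bit integer.
    def __init__(self, n: int):
        self.slots = [0] * n

    def set(self, t: str, i: int, v: int) -> None:
        assert t == 'i32' or t == 'i64'
        # An i32 store must already fit in 32 bits.
        assert 0 <= v and (v <= MASK_32BIT or t == 'i64')
        self.slots[i] = i64_extend_i32_u(v) if t == 'i32' else v

    def get(self, t: str, i: int) -> int:
        assert t == 'i32' or t == 'i64'
        v = self.slots[i]
        return i32_wrap_i64(v) if t == 'i32' else v
```

Storing every value in a 64-bit slot means an `i32` read after an `i64` write truncates, and an `i64` read after an `i32` write sees zeros in the upper half, matching the two bullet points quoted from the explainer.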

lukewagner · 2026-03-26T19:14:47Z

design/mvp/CanonicalABI.md


-  return (s, cx.opts.string_encoding, tagged_code_units)
+  string = (s, cx.opts.string_encoding, tagged_code_units)
+  trap_if(worst_case_string_byte_length(string) > MAX_STRING_BYTE_LENGTH)


Thanks for the iteration! This new rule is nice because it's precise about what fits, so it feels less arbitrary. However, I'm thinking that we should probably avoid adding any runtime overhead (accumulating the worst-case size based on the string contents) by sacrificing precision, so that we're just doing a simple comparison against a constant. This does mean a lower limit, but I think that's fine because, again, streams are probably what you want if you're passing really big strings. We already have 2²⁸ as an ad hoc upper bound in two other places (table length, buffer size), and it's "pretty big" yet still leaves us plenty of wiggle room. So, to minimize the number of ad hoc constants, how about `trap_if(num_code_units > MAX_STRING_CODE_UNITS)` where `MAX_STRING_CODE_UNITS = 2 ** 28 - 1`? Maybe on the line after this constant definition add `assert(MAX_STRING_CODE_UNITS * 3 < 2 ** 32)`, noting that 3 is the worst-case inflation (inflating UTF-16 to UTF-8, referencing `store_utf16_to_utf8()`).
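As a rough sketch of the constant-based check proposed above: the constant, the `trap_if` comparison, and the sanity assertion follow the comment, while `Trap`, `check_string_length`, and `utf16_code_units` are hypothetical helpers added only to make the example self-contained.

```python
MAX_STRING_CODE_UNITS = 2 ** 28 - 1
# 3 is the worst-case inflation factor: a single UTF-16 code unit can
# transcode to three UTF-8 bytes, so the byte length still fits in 32 bits.
assert MAX_STRING_CODE_UNITS * 3 < 2 ** 32


class Trap(Exception):
    pass


def trap_if(cond: bool) -> None:
    if cond:
        raise Trap()


def check_string_length(num_code_units: int) -> None:
    # A single comparison against a constant; no scan of the string contents.
    trap_if(num_code_units > MAX_STRING_CODE_UNITS)


def utf16_code_units(s: str) -> int:
    # Number of 16-bit code units in the UTF-16 encoding of `s`.
    return len(s.encode('utf-16-le')) // 2
```

For instance, the euro sign `'€'` (U+20AC) is one UTF-16 code unit but three UTF-8 bytes, which is the 3x case the assertion guards against.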

lukewagner · 2026-03-26T19:23:47Z

design/mvp/CanonicalABI.md

+Strings are loaded from two pointer-sized values: a pointer (offset in linear
+memory) and a number of [code units]. There are three supported string encodings
+in [`canonopt`]: [UTF-8], [UTF-16] and `latin1+utf16`. This last option allows a
+*dynamic* choice between [Latin-1] and UTF-16, indicated by the 32nd bit of the


If we're passing `i64`s for string/list lengths, then it seems a bit weird to use the 32nd bit. It seems like we should either use `i32` lengths or, if using `i64` lengths, use the most-significant (64th) bit. I like `i64`s for regularity, while I like `i32`s for better capturing the length constraint. Perhaps a concrete tie-breaker is that, if these values get copied into heap structures, the `i32` could be packed with other fields, reducing memory use? If you agree, could we switch the lengths to always be `i32`s? If not, then we should probably go back to your original parameterized `utf16_tag(ptr_size)` definition.
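The two options in this comment can be compared with a small sketch. The body of `utf16_tag(ptr_size)` below is a guess at what a parameterized definition might look like (the thread only names it), and `UTF16_TAG_I32` and `split_tagged_length` are hypothetical names introduced for illustration.

```python
def utf16_tag(ptr_size: int) -> int:
    # Assumed definition: the most-significant bit of a length that is
    # `ptr_size` bytes wide (bit 31 for i32 lengths, bit 63 for i64 lengths).
    assert ptr_size in (4, 8)
    return 1 << (ptr_size * 8 - 1)


# Alternative: lengths are always i32, so the tag is fixed at the 32nd bit.
UTF16_TAG_I32 = 1 << 31


def split_tagged_length(tagged: int, tag: int):
    # Separate the latin1+utf16 encoding flag from the code-unit count.
    is_utf16 = bool(tagged & tag)
    return is_utf16, tagged & ~tag
```

With always-`i32` lengths the tag is a single constant; with pointer-sized lengths the tag has to follow the address type so that it stays in the most-significant bit.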

lukewagner · 2026-03-26T20:30:16Z

design/mvp/CanonicalABI.md

  return load_list_from_valid_range(cx, ptr, length, elem_type)

 def load_list_from_valid_range(cx, ptr, length, elem_type):
+  trap_if(length * worst_case_elem_size(elem_type, cx.opts.memory.ptr_type()) > MAX_LIST_BYTE_LENGTH)


`worst_case_elem_size` has a nice intuitive definition, but I wonder if we can simplify implementers' lives for a relatively corner case by fixing a suitably conservative constant, so that it's simply `trap_if(length * elem_size(elem_type, cx.opts.memory.ptr_type()) > MAX_LIST_BYTE_LENGTH)`. Following strings, to minimize the number of distinct ad hoc constants, maybe `MAX_LIST_BYTE_LENGTH = 2 ** 28 - 1` as well?

Independently, I think this `trap_if()` should be moved into `load_list_from_range()`, since `load_list_from_valid_range()` is called for stream buffers, where we don't have the same "needs to be representable as an `i32`" constraint: streams can copy as much or as little as the reader/writer agree on.
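A minimal sketch of the simplified list check, assuming a much-reduced, hypothetical `elem_size` table (the real one covers every value type) and a hypothetical `check_list_byte_length` helper standing in for the start of `load_list_from_range()`:

```python
MAX_LIST_BYTE_LENGTH = 2 ** 28 - 1


class Trap(Exception):
    pass


def trap_if(cond: bool) -> None:
    if cond:
        raise Trap()


def elem_size(elem_type: str, ptr_type: str) -> int:
    # Hypothetical size table. 'string' is assumed to be two pointer-sized
    # values (pointer and length), so its size depends on the address type.
    ptr_bytes = 4 if ptr_type == 'i32' else 8
    sizes = {'u8': 1, 'u32': 4, 'u64': 8, 'string': 2 * ptr_bytes}
    return sizes[elem_type]


def check_list_byte_length(length: int, elem_type: str, ptr_type: str) -> None:
    # Plain multiply-and-compare against a constant, done only on the
    # non-stream list path.
    trap_if(length * elem_size(elem_type, ptr_type) > MAX_LIST_BYTE_LENGTH)
```

Keeping the check out of the shared valid-range helper leaves stream buffers free of the limit, as the comment above argues.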

Add memory64 and table64 support to the Canonical ABI

746f251

alexcrichton reviewed Mar 17, 2026

adambratschikaye added 4 commits March 18, 2026 15:14

Correct types that are in host tables.

2d35ea9

Use existing canonical options and static parameters

a18568e

typo

99274e8

small fixes

2955e03

alexcrichton reviewed Mar 20, 2026

review comments

0fd86a2

lukewagner requested changes Mar 23, 2026

adambratschikaye and others added 15 commits March 24, 2026 10:47

some test cleanup

4eec09d

fix spacing

1a24955

Co-authored-by: Luke Wagner <mail@lukewagner.name>

fix spacing

dfa163f

Co-authored-by: Luke Wagner <mail@lukewagner.name>

fix spacing

8036409

Co-authored-by: Luke Wagner <mail@lukewagner.name>

inline table.addrtype

6824cc0

Co-authored-by: Luke Wagner <mail@lukewagner.name>

inline table.addrtype

b3d4804

Co-authored-by: Luke Wagner <mail@lukewagner.name>

fixup inline table.addrtype

24a917c

add MemInst class

940e989

inline memory defaults in tests

e06bda6

revert max string length to a const

0270ee3

replace PTR with memory.addrtype

97d6301

note on buffer size

322c1ba

clarify address type

f834bbb

mixed context types behavior

8ab246a

fix 32 bit mask

4fd7d8b

adambratschikaye force-pushed the abk/memory64 branch from 0b5e62d to 4fd7d8b Compare March 25, 2026 13:27

adambratschikaye added 4 commits March 25, 2026 14:05

transparent indexing on MemInst

50d68b6

undo old change

c0ad662

revert some small changes

70e876e

check list and string lengths when loading

520cf25

adambratschikaye marked this pull request as ready for review March 26, 2026 15:39

adambratschikaye requested a review from lukewagner March 26, 2026 15:39

lukewagner reviewed Mar 26, 2026
