Add memory64 and table64 support to the Canonical ABI#624
Add memory64 and table64 support to the Canonical ABI#624adambratschikaye wants to merge 25 commits intoWebAssembly:mainfrom
Conversation
Parameterize the Canonical ABI to handle 32-bit or 64-bit memory addresses and table indices. This is done by adding two new fields to `LiftOptions` to indicate if the `memory64`/`table64` feature is being used in a core module.
alexcrichton
left a comment
There was a problem hiding this comment.
Thanks so much for helping to pick this up! We chatted a bit on Zulip but for posterity I'll mention here about table-related index types how most of them want to remain 32-bits. I'll take a closer look once that's been passed over here, and in the meantime I skimmed over things and noted a few things here and there. I'm sure though that @lukewagner will have thoughts on this too!
design/mvp/Explainer.md
Outdated
| (param $originalSize i32) | ||
| (func (param $originalPtr $addr) | ||
| (param $originalSize $addr) | ||
| (param $alignment i32) |
There was a problem hiding this comment.
For this I might recommend making alignment have type $addr as well for consistency, and I think that matches the signature in Rust/C/etc as well.
design/mvp/Explainer.md
Outdated
| | -------------------------- | ------------------------ | | ||
| | Approximate WIT signature | `func<T>(t: T) -> T.rep` | | ||
| | Canonical ABI signature | `[t:i32] -> [i32]` | | ||
| | Canonical ABI signature | `[t:$idx] -> [i32]` | |
There was a problem hiding this comment.
This'll be a bit subtle, but this'll actually want to be a mapping of i32 as an argument (I talked with you a bit about this already, but the input here is a host-side index so always 32-bit), but the output here should be something variable. Resource "rep"s are generally pointers in linear memory so 64-bit components are going to want 64-bit storage values. In the component model resources are defined with (rep i32) and validation currently requires that i32 is the only one listed there, but this PR will relax that meaning that the resource type itself stores the lowered type which'll get plumbed here.
There was a problem hiding this comment.
Thanks, I've now changed it so that the types which are expected to be pointers to memory are allowed to be either i32 or i64 without forcing them to match the memory type (e.g. the return value here, or in context.{get,set}, or thread.new-indirect). This means we don't need to require the memory canonopt anywhere where it didn't previously exist.
alexcrichton
left a comment
There was a problem hiding this comment.
Looks good to me! I think this'll need minor adjustments over time but overall looks good 👍
| if opts.memory is None: | ||
| return 'i32' |
There was a problem hiding this comment.
Could this have an assert? This in theory shouldn't ever be called dynamically if memory is None
design/mvp/CanonicalABI.md
Outdated
| * `memory` - this is a subtype of `(memory 1)`. In the rest of the explainer, | ||
| `PTR` will refer to either `i32` or `i64` core Wasm types as determined by the | ||
| type of this `memory`. |
There was a problem hiding this comment.
This'll technically want to say it's a subtype of memory 1 or memory i64 1 since those are separate types
design/mvp/CanonicalABI.md
Outdated
| feature (the "stackful" ABI), this restriction is lifted. | ||
| * 🔀 `callback` - the function has type `(func (param i32 i32 i32) (result i32))` | ||
| and cannot be present without `async` and is only allowed with | ||
| * 🔀 `callback` - the function has type `(func (param i32 i32 PTR) (result i32))` |
There was a problem hiding this comment.
I think this one may remain a three i32s since they're all async-related codes/events/etc.
lukewagner
left a comment
There was a problem hiding this comment.
This looks great overall; thanks so much for all the work and adding tests too! There's only one relatively minor, but non-nit, suggestion below that I'm happy to discuss and then re-review based on what we decide.
Co-authored-by: Luke Wagner <mail@lukewagner.name>
Co-authored-by: Luke Wagner <mail@lukewagner.name>
Co-authored-by: Luke Wagner <mail@lukewagner.name>
Co-authored-by: Luke Wagner <mail@lukewagner.name>
Co-authored-by: Luke Wagner <mail@lukewagner.name>
0b5e62d to
4fd7d8b
Compare
|
Thanks for taking a look @lukewagner! I think I've addressed everything so far. |
lukewagner
left a comment
There was a problem hiding this comment.
Thanks for addressing all the feedback! This is looking great; just a few more suggestions based on the last round of changes:
| validation requires this option to be present (there is no default). | ||
| validation requires this option to be present (there is no default). The types | ||
| of lowered functions may also depend on the [`core:memory-type`] of this memory, | ||
| specifically it's [`core:address-type`] (indicated by `memory.addrtype`), if pointers |
There was a problem hiding this comment.
| specifically it's [`core:address-type`] (indicated by `memory.addrtype`), if pointers | |
| specifically its [`core:address-type`] (indicated by `memory.addrtype`), if pointers |
| def ptr_size(self): | ||
| return ptr_size(self.ptr_type()) | ||
|
|
||
| def equal(lhs, rhs): |
There was a problem hiding this comment.
Is this equal() method used anywhere?
| * If `context.get i32 i` is called after `context.set i64 i v`, | ||
| only the low 32-bits are read (returning `i32.wrap_i64 v`). | ||
| * If `context.get i64 i` is called after `context.set i32 i v`, | ||
| only the upper 32-bits will be zeroed (returning `i64.extend_i32_u v`). |
There was a problem hiding this comment.
| only the upper 32-bits will be zeroed (returning `i64.extend_i32_u v`). | |
| the upper 32-bits will be zero (returning `i64.extend_i32_u v`). |
| * If `context.get i32 i` is called after `context.set i64 i v`, | ||
| only the low 32-bits are read (returning `i32.wrap_i64 v`). | ||
| * If `context.get i64 i` is called after `context.set i32 i v`, | ||
| only the upper 32-bits will be zeroed (returning `i64.extend_i32_u v`). |
There was a problem hiding this comment.
| only the upper 32-bits will be zeroed (returning `i64.extend_i32_u v`). | |
| the upper 32-bits will be zero (returning `i64.extend_i32_u v`). |
|
|
||
| def canon_context_set(t, i, thread, v): | ||
| assert(t == 'i32') | ||
| assert(t == 'i32' or t == 'i64') |
There was a problem hiding this comment.
Perhaps add an assert(v <= MASK_32BIT or t == 'i64')?
|
|
||
| return (s, cx.opts.string_encoding, tagged_code_units) | ||
| string = (s, cx.opts.string_encoding, tagged_code_units) | ||
| trap_if(worst_case_string_byte_length(string) > MAX_STRING_BYTE_LENGTH) |
There was a problem hiding this comment.
Thanks for the iteration! This new rule is nice because it's precise about what fits, and so it feels less arbitrary, but I'm thinking that we should probably avoid adding any runtime overhead (accumulating worst-case size based on the string contents) by sacrificing precision so that we're just doing a simple comparison against a constant. This does mean a lower limit, but I think that's fine because again streams are probably what you want if you're passing really big strings. Because we already have 228 as an ad hoc upper bound in two other places (table length, buffer size), and because it's "pretty big" yet still leaves us plenty of wiggle room, to minimize the number of ad hoc constants, how about trap_if(num_code_units > MAX_STRING_CODE_UNITS) where MAX_STRING_CODE_UNITS = 2 ** 28 - 1? Maybe on the line after this constant definition assert(MAX_STRING_CODE_UNITS * 3 < 2 ** 32), noting that 3 is the worst-case inflation (inflating UTF-16 to UTF-8, referencing store_utf16_to_utf8()).
| Strings are loaded from two pointer-sized values: a pointer (offset in linear | ||
| memory) and a number of [code units]. There are three supported string encodings | ||
| in [`canonopt`]: [UTF-8], [UTF-16] and `latin1+utf16`. This last option allows a | ||
| *dynamic* choice between [Latin-1] and UTF-16, indicated by the 32nd bit of the |
There was a problem hiding this comment.
If we're passing i64s for string/list lengths, then it seems a bit weird to use the 32nd bit. It seems like either we should use i32 lengths or, if using i64 lengths, use the most-significant (64th) bit. I like i64s for regularity, while I like i32s for better capturing the length constraint. Perhaps a concrete tie-breaker is that, if these values get copied into heap structures, the i32 could perhaps be packed with other fields, reducing memory use? If you agree, could we switch the lengths to always be i32s. Or if not, then probably we should go back to your original parameterized utf16_tag(ptr_size) definition.
| return load_list_from_valid_range(cx, ptr, length, elem_type) | ||
|
|
||
| def load_list_from_valid_range(cx, ptr, length, elem_type): | ||
| trap_if(length * worst_case_elem_size(elem_type, cx.opts.memory.ptr_type()) > MAX_LIST_BYTE_LENGTH) |
There was a problem hiding this comment.
worst_case_elem_size has a nice intuitive definition, but I wonder if we can simplify implementers' lives for a relatively corner case by fixing a suitably-conservative constant so that it's simply trap_if(length * elem_size(elem_type, cx.opts.memory.ptr_type()) > MAX_LIST_BYTE_LENGTH). Following strings, to minimize the number of distinct ad hoc constants, maybe MAX_LIST_BYTE_LENGTH = 2 ** 28 - 1 as well?
Independently, I think this trap_if() should be moved into load_list_from_range() since load_list_from_valid_range() is called for stream buffers where we don't have the same "needs to be representable as an i32 constraints since streams can copy as much or little as the reader/writer agree on.
Parameterize the Canonical ABI to handle 32-bit or 64-bit memory addresses and table indices. This is done by adding two new fields to
LiftOptionsto indicate if thememory64/table64feature is being used in a core module.