Intra-module Sandboxing: Zero-Cost Isolation for Statically Linked WebAssembly Modules #626

@ttraenkler

In response to "Does the Component Model Require Extra Copying?" by @lukewagner, I'd like to share an idea that could be called intra-module sandboxing: a possible point in the design space between shared-nothing and shared-everything. It aims to eliminate the runtime cost of cross-module boundaries (bulk memory copies, Canonical ABI lift/lower overhead, call overhead on hot paths, and duplicated runtime library code) while preserving interface-enforced isolation as a statically checkable property.

Idea

After merging two modules (à la wasm-merge), each module retains its own state and still communicates only through declared imports/exports. Rather than paying for cross-module boundaries at runtime, every function exported by module A refers to its indexed state (memories, tables, globals, tags) through static Wasm immediates, which makes it a candidate for full inlining by a post-merge optimizer or JIT. Linear memory with multi-memory enabled is one concrete instance: module A exports accessor functions whose memory index is fixed as an immediate after the merge:

;; A owns a UTF-8 string in its linear memory (index 1 after merge)
;; B calls this without ever receiving a copy of the buffer
(func (export "string_byte") (param $base i32) (param $i i32) (result i32)
  ;; the `1` is the static memory index immediate (multi-memory syntax)
  (i32.load8_u 1 (i32.add (local.get $base) (local.get $i)))
)

Module B calls string_byte without ever holding a raw pointer into A's memory. If the accessor is inlined by a post-merge optimizer (e.g. Binaryen) or the JIT, it reduces to a single i32.load8_u — the sandbox boundary is enforced at the pre-merge interface level and completely erased by inlining. No bulk copy, no lift/lower ABI glue, no call overhead. The same applies to any hot-spot function call across the boundary regardless of the types involved: after merge and inlining, the Canonical ABI overhead disappears entirely.
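To make the index rewriting concrete, here is a minimal Python sketch of how a merge could concatenate memory index spaces and record which module owns each merged index. It is a toy model, not wasm-merge's implementation: instructions are plain `(opcode, memory_index)` tuples, and the names `merge_memories` and `owner_of` are hypothetical.

```python
# Toy model only: instructions are (opcode, memory_index) tuples,
# not real Wasm encodings. All names here are hypothetical.

def merge_memories(modules):
    """Concatenate per-module memory index spaces and rewrite immediates.

    `modules` maps a module name to
        {"memories": <count>, "funcs": {name: [(opcode, mem_index), ...]}}
    Returns (merged_funcs, owner_of), where owner_of[i] names the module
    that owns merged memory index i.
    """
    merged_funcs = {}
    owner_of = []
    for name, mod in modules.items():
        base = len(owner_of)  # first merged index assigned to this module
        owner_of.extend([name] * mod["memories"])
        for fname, body in mod["funcs"].items():
            # Shift every module-local memory index by the module's base.
            merged_funcs[f"{name}.{fname}"] = [(op, base + idx) for op, idx in body]
    return merged_funcs, owner_of


modules = {
    "B": {"memories": 1, "funcs": {"main": [("i32.store", 0)]}},
    "A": {"memories": 1, "funcs": {"string_byte": [("i32.load8_u", 0)]}},
}
funcs, owner_of = merge_memories(modules)
```

With B listed first, A's only memory lands at merged index 1, and `owner_of` is the kind of ownership record a later isolation check would consult.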

Intra-module sandboxing could be a third linking mode alongside shared-nothing and shared-everything, supported by composition toolchains like wac or jco — chosen at link time depending on the trust and performance requirements of a given module pair.

(A related but distinct direction is allowing direct bounded memory region access across module boundaries enforced by static analysis — essentially borrow and move semantics for linear memory regions at the binary level, complementing what resource own and resource borrow already do at the WIT level. This is a bit related to #568 but needs no new instructions.)

Sharing Trusted Libraries Across Sandboxed Modules

For a trusted runtime library like wasi-libc, the current options are unsatisfying:

  • Shared-nothing: each module gets its own copy of libc — safe, but N copies means N times the binary size and no code sharing
  • Shared-everything: one copy of libc shared across all modules — no duplication, but full access to each other's memory, eliminating isolation entirely
  • Intra-module sandboxing with N copies: each sandboxed module could bundle its own libc and merge it — still N copies, but isolation is preserved. No improvement over shared-nothing on code size.

The better path is to parameterize libc over memory index and have the toolchain generate a thin adapter wrapper per call site with the index hardwired, so a single shared copy of libc can operate on any caller's memory partition without duplication and without sacrificing isolation:

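;; Pseudocode: Wasm memory indices are static immediates today, so $mem
;; stands in for the index that every load/store in the body would use
(func $memcpy_generic (param $mem i32) (param $dst i32) (param $src i32) (param $len i32)
  ;; body uses $mem as a dynamic memory index
)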

;; Generated adapter — trivially small, inlines at the call site
(func $memcpy_mem1 (param $dst i32) (param $src i32) (param $len i32)
  (call $memcpy_generic (i32.const 1) (local.get $dst) (local.get $src) (local.get $len))
)

After inlining and constant propagation, i32.const 1 flows into $memcpy_generic and the dynamic index is resolved, recovering a static immediate without duplicating the function body. Code size stays flat regardless of how many modules share the library.
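The constant-propagation step can be illustrated with a small Python sketch. It assumes a toy instruction model in which the generic body names its memory operand symbolically as `"$mem"`; the body contents and the `specialize` helper are hypothetical and only show the substitution, not a real optimizer pass.

```python
# Toy model: each instruction is (opcode, memory_operand). A memory operand
# is either a static index (int) or the symbolic parameter name "$mem".

def specialize(body, param, value):
    """Replace every use of the symbolic memory parameter with a constant index."""
    return [(op, value if mem == param else mem) for op, mem in body]


# Hypothetical body of $memcpy_generic: one load and one store via $mem.
memcpy_generic = [("i32.load8_u", "$mem"), ("i32.store8", "$mem")]

# What the $memcpy_mem1 adapter could reduce to once the constant 1 is propagated.
memcpy_mem1 = specialize(memcpy_generic, "$mem", 1)
memcpy_mem2 = specialize(memcpy_generic, "$mem", 2)
```

The generic body is left untouched, and each specialization differs only in the immediate it carries.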

Importantly, this does not require changes to library source code or compilers. A binary post-processing tool could scan any existing compiled Wasm library — including wasi-libc shipped as a plain .wasm file — identify exported functions that access linear memory, and automatically emit parameterized variants alongside the originals. This makes the adapter wrapper approach applicable to the existing compiled library ecosystem without modifications upstream.
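As a rough illustration of the scanning step, the Python sketch below finds exported functions that reach a memory instruction either directly or through direct calls. It is a toy model under stated assumptions: function bodies are lists of `(opcode, operand)` tuples, `MEMORY_OPS` is a small illustrative subset, indirect calls are not modeled, and the `libc` contents are made up for the example.

```python
# Toy model: bodies are lists of (opcode, operand) tuples. MEMORY_OPS is an
# illustrative subset; a real tool would cover every memory instruction.
MEMORY_OPS = frozenset(
    {"i32.load", "i32.load8_u", "i32.store", "i32.store8", "memory.copy", "memory.fill"}
)


def touches_memory(funcs):
    """Return names of functions that reach a memory instruction,
    directly or through direct calls (indirect calls are not modeled)."""
    touching = {
        name for name, body in funcs.items()
        if any(op in MEMORY_OPS for op, _ in body)
    }
    changed = True
    while changed:  # propagate along direct calls until nothing new is found
        changed = False
        for name, body in funcs.items():
            if name in touching:
                continue
            if any(op == "call" and arg in touching for op, arg in body):
                touching.add(name)
                changed = True
    return touching


def adapter_candidates(funcs, exports):
    """Exported functions that would need a memory-parameterized variant."""
    return sorted(touches_memory(funcs) & set(exports))


# Made-up library contents, for illustration only.
libc = {
    "memcpy": [("i32.load8_u", 0), ("i32.store8", 0)],
    "malloc": [("i32.load", 0), ("i32.store", 0)],
    "strdup": [("call", "malloc"), ("call", "memcpy")],
    "abs": [("i32.sub", None)],
}
candidates = adapter_candidates(libc, ["memcpy", "strdup", "abs"])
```

Here `strdup` is picked up only because of its callees, which suggests that a real tool would need call-graph information and not just a per-function scan.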

Shipping Pre-Merge Modules and Provenance

The Component Model already has a well-defined notion of embedding multiple core modules as nested modules inside a single component binary. If pre-merge modules are shipped this way, the component format itself encodes which module is which — the runtime can read the structure, perform the merge itself, and has full provenance knowledge without any custom section or cryptographic binding. This is the clean path: runtime-owned linking via the component nested module format eliminates the provenance problem entirely, since the runtime never needs to trust an external record of index ownership.

This is not without precedent at the toolchain level. Clang already produces relocatable Wasm object files (.o): each is a core module that carries symbol and relocation metadata in custom sections, which the linker reads and resolves when combining them. The component nested module format is a natural evolution of the same idea at the component level: a structured multi-module container that the runtime can link, validate, and merge with full provenance knowledge.

The harder provenance case arises only for flat binaries produced by offline tools like wasm-merge outside the component ecosystem, because the runtime cannot verify that the security boundaries still hold after the merge. The trust boundary then shifts to the point of merge, since the merged result can be tampered with afterwards. Offline merging is possible today without runtime support, but the validation should live in the component linker and, ultimately, in runtimes once they gain component support.

Other Global State

The same principle extends to all indexed namespaces in Wasm, since all use static integer immediates. After the merge, assign disjoint index ranges per originating module and verify that each function only references indices from its own range — except through explicitly exported/imported accessor functions:

  • Tables: cross-module table.get, table.set, or call_indirect against another module's table index is statically detectable
  • Globals: a validator can confirm no instruction in B references a global index belonging to A
  • Tags: the same partition applies to throw and catch instructions
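The partition check described above could look roughly like the following Python sketch. It assumes a toy representation in which each instruction is an `(opcode, namespace, index)` tuple and ownership is given as explicit maps; `find_violations` and the sample data are hypothetical. Calls through declared imports/exports are the permitted crossing and are deliberately left out of the model.

```python
# Toy model: only instructions with an indexed-state immediate are listed,
# as (opcode, namespace, index). Cross-module calls through declared
# imports/exports are the allowed crossing and are not modeled.

def find_violations(funcs, func_owner, index_owner):
    """List instructions that reference an index owned by another module.

    funcs:       {func_name: [(opcode, namespace, index), ...]}
    func_owner:  {func_name: module_name}
    index_owner: {(namespace, index): module_name}
    """
    violations = []
    for fname, body in funcs.items():
        for op, namespace, index in body:
            owner = index_owner[(namespace, index)]
            if owner != func_owner[fname]:
                violations.append((fname, op, namespace, index, owner))
    return violations


index_owner = {
    ("memory", 0): "B", ("memory", 1): "A",
    ("global", 0): "B", ("global", 1): "A",
}
func_owner = {"A.string_byte": "A", "B.main": "B", "B.bad": "B"}
funcs = {
    "A.string_byte": [("i32.load8_u", "memory", 1)],
    "B.main": [("i32.store", "memory", 0), ("global.get", "global", 0)],
    "B.bad": [("global.set", "global", 1)],  # B writes a global owned by A
}
violations = find_violations(funcs, func_owner, index_owner)
```

In this model the check has to run on the merged module before any cross-module inlining, because inlining an accessor into B would move A's indices into B's function bodies.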

Wasm GC is the harder case. GC allocations land on a shared opaque heap with no index immediate to partition on. One possible direction: validate that B does not instantiate types from A's type section directly, only receiving them as opaque references through A's exports. Whether that is sufficient — and how it interacts with structural type equivalence — is an open question.

Open Questions

  1. Tooling: Should wac, jco, or wit-bindgen expose intra-module sandboxing as a first-class linking mode alongside shared-nothing and shared-everything, generating and verifying accessor patterns and adapter wrappers automatically? Or is this better handled as a dedicated binary post-processing step?
  2. Dynamic memory index instructions: Would a memory.load/memory.store variant accepting a runtime memory index be worth pursuing as a Wasm proposal, to make the adapter wrapper pattern expressible without toolchain-specific conventions?

cc @lukewagner @fitzgen @alexcrichton @kripken @tlively — pointers to prior discussion and obvious holes very welcome.
