…tions dag-pb serializes Links before Data on the wire, but the UnixFS spec did not document this or its impact on implementations.

- note after PBNode schema: field order is stricter than intuitive protobuf convention, decoders MUST accept both, encoders SHOULD use Links-before-Data per IPIP-499 profiles
- warning in dag-pb Types section: streaming parsers cannot determine node type until after all links are read
- test vectors: wire order annotations for directory and HAMT fixtures
- appendix: historical context and Robustness Principle guidance
- dag-pb spec reference updated to Wayback Machine snapshot
achingbrain
left a comment
Protobuf also allows the Data field to be in between Link messages, not sure if that's worth mentioning too.
Clarify that Links and Data must each appear as a contiguous group on the wire. Interleaving (e.g. Links, Data, Links) produces duplicate Links lists and decoders should reject it. Addresses PR feedback from @achingbrain.
Good flag. I checked: the Go parser returns the error "duplicate Links section", so it is better to lock this down at the spec level as "not allowed", I think? @achingbrain clarified in 4a2539b, let me know if this sounds sensible. (I marked it as SHOULD because it is niche and I am unsure whether we can enforce this behavior across implementations.)
It's valid protobuf so if it needs locking down
We validate all our codec implementations against the Links, Data, Links case, and they all strictly disallow it.
It's the encoding that matters here. Off-the-shelf parsers should be mostly fine in any case; it's that we don't want to see encoded forms like this show up anywhere, or we're in for a world of pain. The general rule should be: the fewer variations for the same data, the better. More determinism = more better for many use cases. More slop = more pain. So where we can avoid slop, we should.
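The contiguity rule discussed above can be sketched as a small top-level field scanner. This is an illustration only, not the Go or JS codec: `read_varint` and `scan_pbnode` are hypothetical names, PBLink payloads are kept as raw bytes rather than decoded, and rejecting a repeated Data field is an assumption of this sketch. The field numbers (Data = 1, Links = 2) come from the PBNode schema, and the error text mirrors the Go message quoted in this thread.

```python
LINKS_FIELD = 2        # PBNode.Links (repeated PBLink)
DATA_FIELD = 1         # PBNode.Data (optional bytes)
LENGTH_DELIMITED = 2   # protobuf wire type for bytes and embedded messages


def read_varint(buf, pos):
    """Decode a protobuf varint at buf[pos:]; return (value, next_pos)."""
    value, shift = 0, 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:
            return value, pos
        shift += 7


def scan_pbnode(buf):
    """Split a PBNode into raw PBLink payloads, Data bytes, and group order."""
    links, data, groups = [], None, []
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field, wire_type = tag >> 3, tag & 0x07
        if wire_type != LENGTH_DELIMITED or field not in (DATA_FIELD, LINKS_FIELD):
            raise ValueError(f"unexpected field {field} with wire type {wire_type}")
        length, pos = read_varint(buf, pos)
        if pos + length > len(buf):
            raise ValueError("truncated field")
        payload = bytes(buf[pos:pos + length])
        pos += length

        name = "Links" if field == LINKS_FIELD else "Data"
        if not groups or groups[-1] != name:
            # A new group starts; it must not reopen one that already ended.
            if name in groups:
                raise ValueError(f"duplicate {name} section")
            groups.append(name)
        elif name == "Data":
            # Assumption of this sketch: a second consecutive Data is rejected.
            raise ValueError("duplicate Data section")

        if field == LINKS_FIELD:
            links.append(payload)
        else:
            data = payload
    return links, data, groups
```

Both Links-then-Data and Data-then-Links inputs are accepted and reported through the returned group order, while a Links, Data, Links input raises `ValueError("duplicate Links section")`.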
In both Go and JS and legacy code, `dag-pb` serializes `Links` before `Data` on the wire, but the UnixFS spec did not document this or its impact on implementations.

This PR adds explicit mention of this, along with backward- and forward-looking interop guidance. This is not a spec change; it clarifies something that was already in the spec (the protobuf order and test vectors already had Links first) and states it explicitly for humans and LLMs, to ensure both orders are parsed correctly and to future-proof the interop of our stack.
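As a rough illustration of the wire order described above, the following sketch encodes a PBNode with every Links entry (field 2) written before Data (field 1). The helper names are hypothetical, and the links are assumed to be already-encoded PBLink messages passed in as bytes; this is not the actual Go or JS encoder.

```python
def encode_varint(value):
    """Encode a non-negative integer as a protobuf varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_field(field, payload):
    """Encode one length-delimited field: tag, length, then payload."""
    tag = (field << 3) | 2  # wire type 2 = length-delimited
    return encode_varint(tag) + encode_varint(len(payload)) + payload


def encode_pbnode(links, data=None):
    """Encode a PBNode with all Links (field 2) before Data (field 1)."""
    out = bytearray()
    for link in links:  # links: pre-encoded PBLink bytes (assumption)
        out += encode_field(2, link)
    if data is not None:
        out += encode_field(1, data)
    return bytes(out)
```

With this ordering, an encoded node that has at least one link starts with the tag byte `0x12` (field 2) rather than `0x0a` (field 1), which is why a streaming parser only reaches Data after it has read every link.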
Thanks to @achingbrain for flagging this gap, and pinging @rvagg, who previously did extensive cleanups of the IPLD specs.
Changes
Related / CC