Skip to content

docs(unixfs): document PBNode field order#531

Open
lidel wants to merge 2 commits intomainfrom
feat/unixfs-pbnode-field-order
Open

docs(unixfs): document PBNode field order#531
lidel wants to merge 2 commits intomainfrom
feat/unixfs-pbnode-field-order

Conversation

@lidel
Copy link
Member

@lidel lidel commented Mar 23, 2026

In both Go and JS and legacy code, dag-pb serializes Links before Data on the wire, but the UnixFS spec did not document this or its impact on implementations.

This PR adds explicit mention of this, along with backward and forward related interop guidance. This is not spec change, just clarifying something that was already in the spec (protobuf order and test vectors already had Links first), just stating it explicitly for humans and LLMs to ensure both orders are parsed correctly, future-proofing the interop of our stack.

Thanks to @achingbrain for flagging this gap and ping @rvagg who did extensive cleanups of IPLD specs before

Changes

  • note after PBNode schema: field order is stricter than intuitive protobuf convention, decoders MUST accept both, encoders SHOULD use Links-before-Data per IPIP-499 profiles
  • warning in dag-pb Types section: streaming parsers cannot determine node type until after all links are read
  • test vectors: wire order annotations for directory and HAMT fixtures
  • appendix: historical context and Robustness Principle guidance
  • dag-pb spec reference updated to Wayback Machine snapshot

Related / CC

…tions

dag-pb serializes Links before Data on the wire, but the UnixFS spec
did not document this or its impact on implementations.

- note after PBNode schema: field order is stricter than intuitive
  protobuf convention, decoders MUST accept both, encoders SHOULD
  use Links-before-Data per IPIP-499 profiles
- warning in dag-pb Types section: streaming parsers cannot determine
  node type until after all links are read
- test vectors: wire order annotations for directory and HAMT fixtures
- appendix: historical context and Robustness Principle guidance
- dag-pb spec reference updated to Wayback Machine snapshot
@lidel lidel requested review from achingbrain and rvagg March 23, 2026 16:05
@github-actions
Copy link

github-actions bot commented Mar 23, 2026

🚀 Build Preview on IPFS ready

Copy link
Member

@achingbrain achingbrain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Protobuf also allows the Data field to be in between Link messages, not sure if that's worth mentioning too.

Clarify that Links and Data must each appear as a contiguous
group on the wire. Interleaving (e.g. Links, Data, Links)
produces duplicate Links lists and decoders should reject it.

Addresses PR feedback from @achingbrain.
@lidel
Copy link
Member Author

lidel commented Mar 25, 2026

good flag. i checked Go parser will return error "duplicate Links section", so better to lock this down at spec level as "not allowed", i think?

@achingbrain clarified in 4a2539b, lmk if this sounds sensible (i marked it as SHOULD because its niche + unsure if we can enforce this behavior across implementations)

@achingbrain
Copy link
Member

It's valid protobuf so if it needs locking down SHOULD is probably as far as it needs to go - otherwise it prevents use of off the shelf parsers.

@rvagg
Copy link
Member

rvagg commented Mar 26, 2026

https://github.com/ipld/codec-fixtures/blob/master/negative-fixtures/dag-pb/decode/edges.json#L43-L45

we validate all our codec implementations against the links,data,links case, they all strictly disallow this

@rvagg
Copy link
Member

rvagg commented Mar 26, 2026

prevents use of off the shelf parsers

It's the encoding that matters here, off-the-shelf parsers should be mostly fine in any case, it's that we don't want to see encoded forms like this show up anywhere or we're in for a world of pain. The general rule should be: the fewer variations for the same data the better. More determinism = more better for many use cases. More slop = more pain. So where we can avoid slop we should.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants