Replies: 13 comments 41 replies
-
My thoughts on requirements (not set in stone):
-
My thoughts on potential implementations based on other file formats (an edit of what I posted on r/VoxelGameDev). The format is intended for complete scenes as well as models. There are two main transfer formats in the gamedev space, glTF and Universal Scene Description (USD), and for voxel formats the MagicaVoxel .vox format.

MagicaVoxel .vox
This is a relatively simple format which is extensible; however, in my view it does not meet the requirements I have:

glTF
glTF is fairly simple. It has some issues dealing with very large scenes (the binary .glb format is limited to 32-bit sizes), but it is extensible, so this could be worked around or external assets used. There's a fair amount of tooling around glTF, and hierarchies plus animations should work with this tooling even if it can't load the voxel data extension components. glTF does not have any volume description, but it does have extensions for volumetric parameters applied to the interior of meshes, such as index of refraction and transmission/attenuation. We can reuse these and add extensions for other parameters (scattering etc.).

USD
USD is more complicated, but more powerful. USD has volumes (usdVol), which are usually represented by an OpenVDB asset. Technically, with USD one could likely throw a load of open source code at the problem and, with a 40 MB library (compiled), solve it with no further work than specifying how to handle indexed OpenVDB materials (OpenVDB can handle integer voxels).

Conclusion
I personally favour lightweight code, and thus glTF. There are simple C and C++ libraries for using it, and the base information is very simple JSON. On the other hand, USD is far more powerful and already has volumes. Your thoughts are welcome below.
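To make the glTF route a bit more concrete, here is a rough sketch of how the voxel payload could ride along in a standard glTF buffer, with only a small JSON extension pointing at it. The extension name "VENDOR_voxel_chunks" and every one of its fields are invented for this illustration; the genuine glTF pieces being reused are the standard buffers, bufferViews, nodes and extensionsUsed mechanisms.

```cpp
// Hypothetical illustration only: the extension name "VENDOR_voxel_chunks" and
// all of its fields are invented here. The real glTF mechanisms being reused
// are "buffers", "bufferViews", "nodes" and "extensionsUsed".
#include <cstdint>
#include <iostream>
#include <string>

// A small binary header like this could sit at the start of the bufferView the
// extension points at, keeping the JSON tiny and the voxel payload binary.
struct VoxelChunkHeader {
    uint32_t sizeX, sizeY, sizeZ;  // chunk dimensions in voxels
    uint8_t  bitsPerVoxel;         // e.g. 8 or 16 bit material indices
    uint8_t  encoding;             // 0 = raw, 1 = RLE (made-up values for this sketch)
};

int main() {
    // What the JSON side of such a file might look like (again, hypothetical):
    const std::string manifest = R"({
      "asset": { "version": "2.0" },
      "extensionsUsed": [ "VENDOR_voxel_chunks" ],
      "buffers": [ { "uri": "scene_voxels.bin", "byteLength": 1048576 } ],
      "bufferViews": [ { "buffer": 0, "byteOffset": 0, "byteLength": 1048576 } ],
      "nodes": [ { "name": "chunk_0_0_0",
                   "extensions": { "VENDOR_voxel_chunks": { "bufferView": 0 } } } ]
    })";
    std::cout << manifest << "\n";
    std::cout << "binary chunk header size: " << sizeof(VoxelChunkHeader) << " bytes\n";
}
```

The appeal is that tools which ignore the extension still see a valid glTF scene graph, while voxel-aware tools follow the bufferView to the binary payload.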
-
Neat! Let me just quickly dump a few (very shortened) points from my previous notes on this idea... EDIT: Copying without consistency/duplicate checking 'twas a bad idea; see replies.
-
What's the goal of this new format? Is it:
-
First of all, thank you for initiating this discussion, and I'm all for it! I already agree with many of the points made.
Those are just some quick things that immediately come to mind.
-
An extra note on things I've read here/elsewhere:
-
Just also dumping this here: https://pro.arcgis.com/en/pro-app/latest/help/mapping/layer-properties/supported-voxel-formats.htm The time component on voxel layers is also kind of interesting. This documentation might also give some general ideas: https://eisenwave.github.io/voxel-compression-docs/ The way to store the palette must also be defined: colors could have names, materials and so on.
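On the palette point, here is a tiny sketch of the kind of per-entry data that might need a home; the field choices (name, colour, a few material parameters) are purely illustrative, not a proposal.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Illustrative only: one possible shape for a palette entry carrying a
// human-readable name and basic material parameters alongside the colour.
struct PaletteEntry {
    std::string name;      // e.g. "oak_leaves" - optional, useful for tooling and round-tripping
    uint8_t     rgba[4];   // base colour
    float       roughness; // material parameters an engine could map onto its own shading model
    float       metalness;
    float       emission;  // emissive strength
};

// Voxels then only need to store an index into this table,
// e.g. 8-bit or 16-bit palette indices depending on palette size.
using Palette = std::vector<PaletteEntry>;
```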
-
Hi Doug, thanks for starting this conversation. From your list of initial bullet points, the first two really stand out to me.
I propose that these two requirements could form the basis of a successful format. We've seen open formats dominate in the past: glTF, OpenGL and Vulkan being good examples. Another comment in the thread that jumped out at me was:
I think this is a useful idea, and it makes for an attractive method of writing the reference implementations and the specification piece-wise. My preferred method of development looks something like the following:
I've used this method for writing compilers (with formal specifications) in the past, with good success. In terms of the actual format discussion, there does exist a fourth option, which is to define our own custom format. My read on the options presented so far is as follows:
Of the above options, (2) glTF seems like the best candidate to me. This comes with two large caveats: first, glTF appears to broadly operate under the assumption that it is a wire-transfer format and will be gzip/deflate'd. It does not appear to have any native notion of compression, which we need. Secondly, after briefly reading the specification, I suspect that we would not actually re-use much of the existing glTF spec, and instead would end up stuffing our own specification into glTF-shaped containers. While this is not a deal-breaker, it does sound to me like we may be introducing significant friction into our lives, for yet-unknown benefit. I could be wrong about this; I only read the spec for 30 minutes, but that's my initial read. Briefly, it appears we may be able to reuse the following features to store arbitrary voxel data & scene descriptions.
I also read through the EXT_primitive_voxels proposed spec linked above and... it looks pretty watery to me. It does appear to specify something along the lines of what we're discussing in this thread, but so far it looks pretty half-baked, with some academic-looking frills that make me a bit nervous (the implicit volume stuff). The real boon I see glTF providing is that if we adopt that format, the barrier for applications to bake the voxel representation of the world into traditional triangulated 3D meshes, textures, etc. is much lower. The resulting exports could be loaded into traditional 3D editors (Maya, Blender, etc.). That said, then we're really talking about an export target for applications, and not necessarily a voxel data interchange format. That observation means the value proposition of using something like glTF basically boils down to potentially sharing a parser with another export path, which isn't nothing, but isn't much either. So, that all said, I would propose adding a "novel voxel format" option to the list of candidates, with the following qualities:
Pros
Cons
Thoughts? I'm between jobs right now and could spend a day to write a simple
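To make the "novel voxel format" option concrete, here is the sort of minimal container skeleton I would imagine: a small fixed header plus a table of typed, skippable chunks, in the spirit of RIFF/PNG-style files. Every name, magic value and field below is a hypothetical sketch, not a draft spec.

```cpp
#include <cstdint>

// Hypothetical container layout: a fixed header followed by a table of typed,
// independently skippable chunks. All names, magic values and fields are
// invented for this sketch.

#pragma pack(push, 1)
struct FileHeader {
    char     magic[4];      // e.g. "OVOX" - identifies the format
    uint16_t versionMajor;  // breaking changes bump this
    uint16_t versionMinor;  // additive changes bump this
    uint32_t chunkCount;    // number of entries in the chunk table that follows
};

struct ChunkTableEntry {
    char     fourcc[4];     // chunk type, e.g. "VOXL", "PALT", "NODE"
    uint64_t byteOffset;    // absolute offset of the chunk payload
    uint64_t byteLength;    // payload size (compressed size if compression != 0)
    uint32_t compression;   // 0 = none; other values reserved for e.g. zstd/deflate
};
#pragma pack(pop)

// Readers iterate the table and ignore fourcc codes they don't recognise,
// which is what keeps the format extensible without breaking old parsers.
static_assert(sizeof(FileHeader) == 12, "header is fixed-size and unpadded");
static_assert(sizeof(ChunkTableEntry) == 24, "table entries are fixed-size");
```

The chunk table is what buys extensibility here: old readers skip types they don't know, and compression can be chosen per chunk rather than baked into the whole file.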
-
This is most certainly pointless bikeshedding for now, but: since the format is mainly intended for the 'transfer' of voxel(scene)s, similar to glTF, would it make sense to call the format vxTF?
-
I have modified the Readme.md to add Requirements. For changes to these, add an issue or PR, as this discussion is a bit complex. I've opened an issue for finalizing the implementation decision: modify an existing format or create a custom one. I will then open an issue to discuss the binary voxel data format once I have time to put down some ideas.
-
There's now a poll on the implementation decision (modify an existing format or create a custom one), and I've added an issue to discuss the Voxel Data Format.
-
I'd like to add a suggestion of two options:

Parquet
The Parquet file format already has internal compression, is extensible in terms of the columns that can be added, and is easily converted to the Arrow format for processing, either in memory, streaming from disk, or from a memory-mapped file. In any case, a format organized as a Structure of Arrays (SoA), i.e. where each type of data is consecutive in memory/on disk, usually allows better compression and is also faster to process using SIMD/GPU than an Array of Structures. If the Parquet file cannot be adapted for this use case, at least there are many interesting features of the format that could be used.

TIFF
Multi-image TIFFs, with either subfiles or image file directories, also allow unlimited extra channels; the advantage here is that it is already an image format and is designed to be extensible: https://libtiff.gitlab.io/libtiff/multi_page.html It should be possible to design a file format that can be opened by existing software and has extra metadata for 3D interchange and visualization. TIFF can also be memory mapped to enable random-access reading without consuming gigabytes of RAM.
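To illustrate the SoA point independently of whether Parquet itself is adopted, here is a quick sketch of the two layouts; the attribute names are just examples.

```cpp
#include <cstdint>
#include <vector>

// Array of Structures: each voxel's fields are interleaved. Simple to use,
// but it mixes unrelated byte patterns together, which hurts general-purpose
// compressors and SIMD-style processing.
struct VoxelAoS {
    uint16_t material;
    uint8_t  density;
    uint8_t  flags;
};
using ChunkAoS = std::vector<VoxelAoS>;

// Structure of Arrays: each field is stored contiguously, Parquet-column style.
// Runs of identical materials or densities sit next to each other, so RLE and
// dictionary/entropy coders do much better, and each array can be streamed or
// memory-mapped independently.
struct ChunkSoA {
    std::vector<uint16_t> material;
    std::vector<uint8_t>  density;
    std::vector<uint8_t>  flags;
};
```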
-
This is a horrible idea.
I appreciate the effort in proposing an open standard for voxel formats, as interoperability is a worthy goal. However, I believe this initiative is premature and, in its current form, risks becoming "yet another standard that no one will use." The fundamental issue is that the solution space for efficient, versatile voxel storage is currently too large and too fragmented. We simply have not converged on a set of dominant patterns that effectively balance all competing requirements. A standard defined now, without the backing of a mature, iterated, and proven implementation, will inevitably fail to support critical use cases.

Scene Description
The problem of scene description is already solved by multiple robust, engine-agnostic standards. For engine-agnostic asset and scene loading, why not utilize an existing, widely adopted standard? A glTF extension or an OpenUSD plugin would be vastly superior and would immediately integrate with existing mesh-based tooling pipelines, allowing someone to mix mesh and voxel models. Emerging solutions like Bevy's BSN also exist. The requirements listed for scene description are well covered by the industry; adding a voxel extension to one of those scene description formats would be trivial. And then there's an entirely different camp of people who believe there's no need for a scene description format whatsoever, because they're building a Minecraft-style game where there is no concept of a scene and everything is just chunks.

Voxel Data: The Core Problem Lacks Convergence
This is the "meat" of the problem, but there are fundamental, unsolved trade-offs that require extensive real-world iteration before a standard can be defined.

Geometry & Storage Trade-offs
The geometric representation alone has a vast landscape of options, each with profound implications for performance, memory, and disk usage.

Geometry Options: implicit grids, 3D mipmapped textures, explicit coordinate lists, octrees, OpenVDB, SVDAGs, symmetry-aware SVDAGs, etc. Should the format optimize for supercompression and small disk utilization, or should the on-disk and in-memory formats be identical for optimal loading speeds, which is critical for open-world games? With current VRAM caps (e.g., 16 GB / 32 GB), the in-memory format is often the main bottleneck, which makes a trade-off heavily favoring fast loading more appealing for open-world games than absolute disk size, since disk space is much cheaper and easier to add. A standard created today will inherently make an arbitrary choice that will immediately make it unusable for anyone prioritizing the opposite requirement.

Material Options: Minecraft-style texture mapping, MagicaVoxel-style palette indexing, block-compressed voxel attributes, some attribute mapping scheme that implicitly derives attributes from geometry, or a mix of the above where you store some attribute coordinates in the geometry and derive the rest. Should geometry and material be tightly coupled (e.g., deriving attributes from the voxel's geometric structure) for storage efficiency, even though it makes iteration and modification much harder? No one knows the best compromise yet.

The person or team who defines the authoritative voxel data format standard will be the one who demonstrates a good balance between these competing priorities through extensive, successful iteration and a performant, shipped implementation. If that implementation sees widespread adoption, its standard will be consumed by multiple clients, and it would then be trivial to integrate it into something like glTF. If multiple implementations converge on the same solution, that's when we can start talking about actually standardizing the underlying voxel data storage format. This may never happen if the solution space never converges, and that's also fine; see all the GPU texture formats out there. Once a dominant, proven format emerges, integrating it into existing standards like glTF or OpenVDB will be a trivial, downstream task. Defining the standard before that implementation exists is putting the cart before the horse.
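As a concrete picture of the coupled-vs-decoupled question raised above, here is a small hypothetical sketch of the decoupled variant, where occupancy (geometry) and palette indices (materials) are stored as separate streams; none of this is taken from an existing format.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch of a decoupled chunk: geometry is a 1-bit occupancy mask,
// and materials are palette indices stored only for occupied voxels, in the same
// traversal order as the mask. Coupling them differently (e.g. deriving attributes
// from position in an octree) trades editability for size, as discussed above.
struct DecoupledChunk {
    uint32_t sizeX = 0, sizeY = 0, sizeZ = 0;
    std::vector<uint64_t> occupancy;       // ceil(sizeX*sizeY*sizeZ / 64) bitmask words
    std::vector<uint16_t> paletteIndices;  // one entry per set bit in occupancy
};

// Counting the set bits tells a reader how many material entries to expect,
// so the two streams can be validated against each other on load.
inline std::size_t occupiedCount(const DecoupledChunk& c) {
    std::size_t n = 0;
    for (uint64_t word : c.occupancy) {
        while (word) { word &= word - 1; ++n; }  // Kernighan's bit-count
    }
    return n;
}
```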
-
A short discussion in the Vengi discord led to me mentioning on the VoxelGameDev subreddit that:
I'm hoping to create an open Voxel interchange/transfer format. My thoughts are that I would make this based on gltf's json format for everything but the raw voxel data, which would likely be in RLE encoded binary arrays. It would be great if it could support different bits per voxel (at least 8 and 16) and perhaps other data than just material indices (Avoyd has 8bit density), but that would be a stretch goal.
If anyone's interested in following along with this let me know. I'm thinking of starting by making a spec sheet on Github, and will notify any folk who reply when I get around to putting the first draft up.
Essentially a number of tools such as Vengi and my own Avoyd have outgrown the MagicaVoxel .vox format, and we need something else. This is the place for this discussion and perhaps an initial implementation.
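Since the proposal above leans towards RLE-encoded binary arrays for the raw voxel data, here is a minimal sketch of what such a codec might look like. The run layout (a 1-byte count followed by the value) and the templated 8/16-bit handling are assumptions for illustration only, not the actual format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Minimal illustrative RLE for a flat array of voxel values.
// Encoded stream: repeated pairs of [run length (1 byte, 1..255), value bytes].
// The same scheme handles 8-bit or 16-bit values via the template parameter.
template <typename T>
std::vector<uint8_t> rleEncode(const std::vector<T>& voxels) {
    std::vector<uint8_t> out;
    for (std::size_t i = 0; i < voxels.size();) {
        std::size_t run = 1;
        while (i + run < voxels.size() && voxels[i + run] == voxels[i] && run < 255) ++run;
        out.push_back(static_cast<uint8_t>(run));
        const uint8_t* bytes = reinterpret_cast<const uint8_t*>(&voxels[i]);
        out.insert(out.end(), bytes, bytes + sizeof(T));  // value bytes in host byte order
        i += run;
    }
    return out;
}

template <typename T>
std::vector<T> rleDecode(const std::vector<uint8_t>& data) {
    std::vector<T> out;
    for (std::size_t i = 0; i + 1 + sizeof(T) <= data.size(); i += 1 + sizeof(T)) {
        const uint8_t run = data[i];
        T value;
        std::memcpy(&value, &data[i + 1], sizeof(T));
        out.insert(out.end(), run, value);
    }
    return out;
}

int main() {
    std::vector<uint16_t> voxels(1000, 7);  // e.g. 16-bit material indices
    voxels[500] = 42;
    const auto packed = rleEncode(voxels);
    assert(rleDecode<uint16_t>(packed) == voxels);  // round-trip check
}
```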