Provide tensor-friendly methods for torch and jax through OMEArrow.to_dlpack
#34
Merged
20 commits
ba8af01
add dlpack for zero-copy capability with tensors
d33bs 7c3b504
linting
d33bs ed1d437
address coderabbit review suggestions
d33bs a79a129
address coderabbit review suggestions
d33bs 948993c
address minor review comments
d33bs 39e22d7
address copilot review suggestions
d33bs a83c3c4
address coderabbit review suggestions
d33bs 2a69d24
address coderabbit review comments
d33bs 5f88a86
address coderabbit review suggestions; fix warns
d33bs 54980b7
Merge remote-tracking branch 'upstream/main' into tensors
d33bs 0a576df
linting
d33bs 7cebeb1
docs and hw / 2-dim optionality
d33bs 8117bde
linting
d33bs 1853b0e
docs clarifications on roi and canon dims
d33bs c171f15
accommodate 3d tiling and roi
d33bs 8435d80
optionality through chunk_policy
d33bs 0bfd90e
add docstrings where needed
d33bs dac7188
Update README.md
d33bs bacba37
linting
d33bs a7f55f9
add 3.14 to ci
d33bs
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,118 @@ | ||
| # Exporting OME-Arrow pixel data via DLPack | ||
|
|
||
| OME-Arrow exposes a small tensor view API for pixel data. The returned | ||
| `TensorView` can export DLPack capsules for zero-copy interoperability on CPU | ||
| and (optionally) GPU. | ||
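To make "zero-copy" concrete, here is a minimal sketch that uses only NumPy rather than OME-Arrow itself. `np.from_dlpack` is a real NumPy function (available since NumPy 1.22); the buffer contents and the `(C, H, W)` shape are made up for illustration, and the NumPy array merely stands in for whatever object `TensorView.to_dlpack()` returns.

```python
import numpy as np

# Stand-in for a flat pixel buffer (assumption: 1 channel, 4 x 6 pixels).
# In OME-Arrow this role would be played by the exported DLPack object.
shape = (1, 4, 6)  # (C, H, W)
flat_source = np.arange(24, dtype=np.uint16)

# Any object implementing __dlpack__ can be consumed without copying.
flat = np.from_dlpack(flat_source)

# Reshaping a contiguous buffer returns a view, so no copy happens here either.
tensor = flat.reshape(shape)

# True when both arrays are backed by the same memory.
zero_copy = np.shares_memory(flat_source, tensor)
print(zero_copy)
```

Because the consumer only borrows the memory, the producer has to stay alive for as long as `tensor` is in use.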
|
||
|
|
||
| Key defaults: | ||
|
|
||
| - OME-Arrow tensor layouts always include channels (`C`) as a tensor axis. | ||
| - Default layout is `CHW` when both `T` and `Z` are singleton in the source. | ||
| - Otherwise, default layout is `TZCHW` (with singleton `T`/`Z` retained unless you override layout). | ||
| - You can override with any valid TZCHW permutation/subset, for example `HWC`, `ZCHW`, or `CHW`. | ||
|
|
||
| Layout nomenclature: | ||
|
|
||
| - `T`: time index | ||
| - `Z`: z/depth index | ||
| - `C`: channel index | ||
| - `H`: image height (Y axis) | ||
| - `W`: image width (X axis) | ||
|
|
||
| Practical mapping: | ||
|
|
||
| - 2D image content (`YX`) is typically exposed as `CHW`. | ||
| - 3D z-stack content (`ZYX`) is typically exposed as `ZCHW` or `TZCHW` (with `T=1`). | ||
| - Time-lapse and volumetric content use `TZCHW` by default. | ||
|
|
||
| ## PyTorch | ||
|
|
||
| ```python | ||
| from ome_arrow import OMEArrow | ||
|
|
||
| obj = OMEArrow("example.ome.parquet") | ||
| view = obj.tensor_view(t=0, z=0, c=0) | ||
|
|
||
| # DLPack capsule -> torch.Tensor | ||
| import torch | ||
|
|
||
| capsule = view.to_dlpack(mode="arrow", device="cpu") | ||
| flat = torch.utils.dlpack.from_dlpack(capsule) | ||
| tensor = flat.reshape(view.shape) | ||
| ``` | ||
|
|
||
| ## JAX | ||
|
|
||
| ```python | ||
| from ome_arrow import OMEArrow | ||
|
|
||
| obj = OMEArrow("example.ome.parquet") | ||
| view = obj.tensor_view(t=0, z=0, c=0, layout="CHW") | ||
|
|
||
| import jax.numpy as jnp | ||
|
|
||
| capsule = view.to_dlpack(mode="arrow", device="cpu") | ||
| flat = jnp.from_dlpack(capsule) | ||
| arr = flat.reshape(view.shape) | ||
| ``` | ||
|
|
||
| ## Iteration examples | ||
|
|
||
| ```python | ||
| from ome_arrow import OMEArrow | ||
| import numpy as np | ||
|
|
||
| obj = OMEArrow("example.ome.parquet") | ||
| view = obj.tensor_view() | ||
|
|
||
| # Batch over time (T) dimension. | ||
| for cap in view.iter_dlpack(batch_size=2, shuffle=False, mode="numpy"): | ||
|     batch = np.from_dlpack(cap) | ||
|     # batch shape: (batch, Z, C, H, W) in TZCHW layout | ||
| ``` | ||
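The batching in the example above can be pictured as slicing along the leading `T` axis. The following is a rough sketch of that idea in plain NumPy, not the library's implementation: `iter_time_batches` is a hypothetical helper, the array sizes are invented, and keeping the final short batch is an assumption, since the docs do not say how `iter_dlpack` treats a remainder.

```python
import numpy as np


def iter_time_batches(data, batch_size):
    """Yield consecutive slices along axis 0 (T) of a TZCHW array."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, data.shape[0], batch_size):
        # Basic slicing returns a view, so no pixel data is copied.
        yield data[start : start + batch_size]


# Hypothetical TZCHW stack: T=5, Z=3, C=2, H=W=8.
data = np.zeros((5, 3, 2, 8, 8), dtype=np.uint8)
batch_shapes = [b.shape for b in iter_time_batches(data, batch_size=2)]
print(batch_shapes)
```

With five time points and `batch_size=2`, this sketch produces two full batches and one batch holding the last time point.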
|
||
|
|
||
| ```python | ||
| from ome_arrow import OMEArrow | ||
| import numpy as np | ||
|
|
||
| obj = OMEArrow("example.ome.parquet") | ||
| view = obj.tensor_view(t=0, z=0) | ||
|
|
||
| # Tile over spatial region. | ||
| for cap in view.iter_dlpack( | ||
|     tile_size=(256, 256), shuffle=True, seed=123, mode="numpy" | ||
| ): | ||
|     tile = np.from_dlpack(cap) | ||
|     # tile shape: (C, H, W) in CHW layout | ||
| ``` | ||
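One way to think about seeded, shuffled tiling is as a reproducible permutation of tile origins. The sketch below illustrates that with plain NumPy; `tile_origins` is a hypothetical helper, the frame size is invented, and nothing here is taken from the actual `iter_dlpack` implementation. In particular, yielding smaller tiles at the image edge is an assumption.

```python
import numpy as np


def tile_origins(height, width, tile_size, shuffle=False, seed=None):
    """Return the (y, x) top-left corner of every tile covering the plane."""
    tile_h, tile_w = tile_size
    origins = [
        (y, x)
        for y in range(0, height, tile_h)
        for x in range(0, width, tile_w)
    ]
    if shuffle:
        # A seeded generator makes the shuffled order reproducible.
        rng = np.random.default_rng(seed)
        origins = [origins[i] for i in rng.permutation(len(origins))]
    return origins


image = np.zeros((3, 600, 512), dtype=np.uint8)  # hypothetical (C, H, W) frame
origins = tile_origins(600, 512, (256, 256), shuffle=True, seed=123)

# Slicing past the edge is clipped, so edge tiles come out smaller.
tiles = [image[:, y : y + 256, x : x + 256] for y, x in origins]
print(len(tiles))
```

Reusing the same `seed` gives the same tile order on every run, which is what makes shuffled iteration reproducible across training epochs or debugging sessions.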
|
||
|
|
||
| ## Ownership and lifetime | ||
|
|
||
| `TensorView.to_dlpack()` returns a DLPack-capable object (with `__dlpack__`) | ||
| that references the underlying Arrow values buffer in `mode="arrow"`, or a | ||
| NumPy buffer in `mode="numpy"`. Keep the `TensorView` (or any NumPy array | ||
| returned by `to_numpy`) alive until the consumer finishes using the DLPack | ||
| object. | ||
|
|
||
| `mode="arrow"` currently requires a single `(t, z, c)` selection and a full-frame | ||
| ROI. Use `mode="numpy"` for batches, crops, or layout changes that go beyond a | ||
| simple reshape. | ||
|
|
||
| Zero-copy guarantees depend on the source: Arrow-backed inputs preserve buffers, | ||
| while records built from Python lists or NumPy arrays will materialize once into | ||
| Arrow buffers. The same applies to `StructScalar` inputs, which are normalized | ||
| through Python objects before Arrow-mode export. | ||
| For Parquet/Vortex sources, zero-copy also requires the on-disk struct schema | ||
| to match `OME_ARROW_STRUCT`; non-strict schema normalization materializes via | ||
| Python objects. | ||
|
|
||
| ## Optional dependencies | ||
|
|
||
| CPU DLPack export uses Arrow buffers by default. For framework helpers and GPU | ||
| paths, install only what you need: | ||
|
|
||
| ```bash | ||
| pip install "ome-arrow[dlpack-torch]" # torch only | ||
| pip install "ome-arrow[dlpack-jax]" # jax only | ||
| pip install "ome-arrow[dlpack]" # both | ||
| ``` | ||
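The default-layout rule described under "Key defaults" in the file above can be sketched as a pair of small pure-Python functions. Both `default_layout` and `layout_shape` are hypothetical names used only for illustration, and the validation rules (unique axes drawn from `TZCHW`, with `C` always present) are this sketch's reading of the docs rather than the library's actual checks.

```python
def default_layout(size_t, size_z):
    """Pick CHW when both T and Z are singleton, otherwise TZCHW."""
    return "CHW" if size_t == 1 and size_z == 1 else "TZCHW"


def layout_shape(layout, sizes):
    """Map a layout string such as 'HWC' to a shape tuple.

    `sizes` maps each axis letter to its length in the current selection.
    """
    if len(set(layout)) != len(layout) or not set(layout) <= set("TZCHW"):
        raise ValueError(f"layout must use unique axes from TZCHW: {layout!r}")
    if "C" not in layout:
        # Assumption drawn from the docs: channels are always a tensor axis.
        raise ValueError("layout must include the channel axis 'C'")
    return tuple(sizes[axis] for axis in layout)


# Hypothetical axis lengths for a single 2D multichannel frame.
sizes = {"T": 1, "Z": 1, "C": 3, "H": 512, "W": 640}
layout = default_layout(sizes["T"], sizes["Z"])
print(layout, layout_shape(layout, sizes), layout_shape("HWC", sizes))
```

A z-stack with more than one plane would fall through to `TZCHW`, keeping the singleton `T` axis unless the caller overrides the layout.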
181 changes: 156 additions & 25 deletions
docs/src/examples/learning_to_fly_with_ome-arrow.ipynb
Large diffs are not rendered by default.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -13,4 +13,5 @@ caption: 'Contents:' | |
| maxdepth: 3 | ||
| --- | ||
| python-api | ||
| dlpack | ||
| ``` | ||