Skip to content

refactor: one array class#4034

Open
d-v-b wants to merge 38 commits into
zarr-developers:mainfrom
d-v-b:one-array-class
Open

refactor: one array class#4034
d-v-b wants to merge 38 commits into
zarr-developers:mainfrom
d-v-b:one-array-class

Conversation

@d-v-b
Copy link
Copy Markdown
Contributor

@d-v-b d-v-b commented Jun 4, 2026

This PR makes 2 fundamental changes to our top-level Array class:

  • it adds Array._runner: Runner to the Array. Runner is a protocol that looks like this:

    class SyncRunner:
        """The default `Runner`. Runs coroutines on Zarr's shared background event
        loop via `sync`.
        """
    
        def run(self, coro: Coroutine[Any, Any, T]) -> T:
            return sync(coro)

    the runner parameter allows a caller to provide their own event loop that the array will use when blocking on the execution of a coroutine. If the user doesn't declare a runner, we use a house default, which is just sync. So if you don't request a different runnner, everything is the same.

  • it adds async methods for every sync method. the sync methods use self.runner.run(self.do_thing()) to run

This means the AsyncArray class has no use and can be phased out. NOTE: it is not removed.

The goal here is no breaking changes. Removing the AsyncArray class can happen at its own pace. If you find any breaking changes in this PR, we can fix them.

dependabot Bot and others added 21 commits May 31, 2026 19:28
…#176)

Bumps the actions group with 8 updates in the / directory:

| Package | From | To |
| --- | --- | --- |
| [prefix-dev/setup-pixi](https://github.com/prefix-dev/setup-pixi) | `0.9.5` | `0.9.6` |
| [codecov/codecov-action](https://github.com/codecov/codecov-action) | `6.0.0` | `6.0.1` |
| [github/issue-metrics](https://github.com/github/issue-metrics) | `4.2.2` | `4.2.7` |
| [j178/prek-action](https://github.com/j178/prek-action) | `2.0.3` | `2.0.4` |
| [actions/upload-artifact](https://github.com/actions/upload-artifact) | `7.0.0` | `7.0.1` |
| [actions/download-artifact](https://github.com/actions/download-artifact) | `7.0.0` | `8.0.1` |
| [pypa/gh-action-pypi-publish](https://github.com/pypa/gh-action-pypi-publish) | `1.13.0` | `1.14.0` |
| [zizmorcore/zizmor-action](https://github.com/zizmorcore/zizmor-action) | `0.5.3` | `0.5.6` |



Updates `prefix-dev/setup-pixi` from 0.9.5 to 0.9.6
- [Release notes](https://github.com/prefix-dev/setup-pixi/releases)
- [Commits](prefix-dev/setup-pixi@1b2de7f...5185adf)

Updates `codecov/codecov-action` from 6.0.0 to 6.0.1
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md)
- [Commits](codecov/codecov-action@57e3a13...e79a696)

Updates `github/issue-metrics` from 4.2.2 to 4.2.7
- [Release notes](https://github.com/github/issue-metrics/releases)
- [Commits](github-community-projects/issue-metrics@c9e9838...1e38d5e)

Updates `j178/prek-action` from 2.0.3 to 2.0.4
- [Release notes](https://github.com/j178/prek-action/releases)
- [Commits](j178/prek-action@6ad8027...bdca6f1)

Updates `actions/upload-artifact` from 7.0.0 to 7.0.1
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](actions/upload-artifact@v7...043fb46)

Updates `actions/download-artifact` from 7.0.0 to 8.0.1
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](actions/download-artifact@v7...3e5f45b)

Updates `pypa/gh-action-pypi-publish` from 1.13.0 to 1.14.0
- [Release notes](https://github.com/pypa/gh-action-pypi-publish/releases)
- [Commits](pypa/gh-action-pypi-publish@v1.13.0...cef2210)

Updates `zizmorcore/zizmor-action` from 0.5.3 to 0.5.6
- [Release notes](https://github.com/zizmorcore/zizmor-action/releases)
- [Commits](zizmorcore/zizmor-action@b1d7e1f...5f14fd0)

---
updated-dependencies:
- dependency-name: prefix-dev/setup-pixi
  dependency-version: 0.9.6
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
- dependency-name: codecov/codecov-action
  dependency-version: 6.0.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
- dependency-name: github/issue-metrics
  dependency-version: 4.2.7
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
- dependency-name: j178/prek-action
  dependency-version: 2.0.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
- dependency-name: actions/upload-artifact
  dependency-version: 7.0.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
- dependency-name: actions/download-artifact
  dependency-version: 8.0.1
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions
- dependency-name: pypa/gh-action-pypi-publish
  dependency-version: 1.14.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: actions
- dependency-name: zizmorcore/zizmor-action
  dependency-version: 0.5.6
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ocol

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Array no longer wraps an AsyncArray. It owns metadata, store_path, config,
codec_pipeline, _chunk_grid, and a pluggable _runner (defaulting to
SyncRunner). Adds Array._from_async_array and a deprecated async_array
property. External Array(async_array) construction sites are converted to
Array._from_async_array. Fixes downstream typing fallout from removing the
_async_array attribute.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…roperty

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…runner

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…unner

Routes resize/append/update_attributes/nchunks_initialized/nbytes_stored/
info_complete through self._runner.run(self.*_async(...)), which mutate the
live Array. Fixes resize/append not updating array state. Array no longer
delegates to the deprecated async_array property.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…hods

Adds get/set_{orthogonal,mask,coordinate,block}_selection_async to Array and
migrates tests off the deprecated async_array property where an Array async
equivalent now exists.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Eliminates duplicated indexer construction and coordinate value-validation
by routing each sync selection method through self._runner.run of its
*_async twin. Adds get/set_basic_selection_async for a complete surface.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- update_attributes (sync) returns a fresh Array, preserving the prior contract
- from_array docstring example uses a public construction path
- align SupportsArrayState._iter_shard_keys signature with the real methods
- restore AsyncArray coverage in test_get_shape_chunks
- extract shared sharding-codec helper to dedup Array/AsyncArray properties

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented Jun 4, 2026

Codecov Report

❌ Patch coverage is 99.31034% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.56%. Comparing base (b9d3964) to head (bcbfe51).

Files with missing lines Patch % Lines
src/zarr/api/synchronous.py 93.33% 1 Missing ⚠️
src/zarr/core/group.py 92.30% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4034      +/-   ##
==========================================
+ Coverage   93.53%   93.56%   +0.02%     
==========================================
  Files          88       88              
  Lines       11894    12027     +133     
==========================================
+ Hits        11125    11253     +128     
- Misses        769      774       +5     
Files with missing lines Coverage Δ
src/zarr/core/array.py 97.68% <100.00%> (-0.20%) ⬇️
src/zarr/core/attributes.py 96.15% <100.00%> (ø)
src/zarr/core/sync.py 94.64% <100.00%> (+0.30%) ⬆️
src/zarr/metadata/migrate_v3.py 98.36% <100.00%> (ø)
src/zarr/testing/stateful.py 36.46% <ø> (ø)
src/zarr/api/synchronous.py 92.95% <93.33%> (ø)
src/zarr/core/group.py 95.20% <92.30%> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Softens the constructor break: Array(async_array) still works but emits a
DeprecationWarning, constructing from the async array's metadata/store_path/
config. The new Array(metadata, store_path, ...) form is preferred.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@d-v-b d-v-b changed the title one array class refactor: one array class Jun 4, 2026
d-v-b and others added 4 commits June 4, 2026 13:47
The docs build guards that every python block declares exec/test; the new
custom-runner example was a bare fence. Mark it exec="true" (it constructs
an Array with a custom runner, which runs cleanly) and drop the unused Runner
import.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…compressor/filters)

Closes coverage gaps introduced by the Array unification: the store_path-required
TypeError, __eq__ NotImplemented path, the sharded read_chunk_sizes/_chunk_grid_shape
branch, and the v2/v3 compressor and v2 filters property branches.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…s, iterators, async_array

- legacy Array(async_array) raises if store_path/config also supplied
- update_attributes_async returns a fresh Array, consistent with the sync form
- align Array._iter_shard_coords signature with sibling iterators
- async_array property left uncached: resize/append replace metadata so caching would be stale

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@d-v-b
Copy link
Copy Markdown
Contributor Author

d-v-b commented Jun 4, 2026

the main breaking change I was worried about here is Array.__init__, which previously took a single AsyncArray argument, e.g. Array(my_async_array). This still works on this branch, but it's deprecated. The preferred constructor for Array looks much more user-friendly now -- you pass metadata, a storepath, etc.

@d-v-b d-v-b marked this pull request as ready for review June 4, 2026 12:58
@d-v-b d-v-b requested review from dcherian and mkitti June 4, 2026 12:58
Copy link
Copy Markdown
Member

@maxrjones maxrjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you choose this approach over using AsyncArray as the single source of truth, and making Array a facade holding a reference to it? The regression section details the most critical issue with this PR, and would be prevented by having AsyncArray as the single source of truth. Using AsyncArray as the single source of truth would reduce the amount of code needed by a few hundred lines (see duplication section). A facade Array can still expose every *_async method and the runner-dispatched sync wrappers by delegating to the held AsyncArray, so the user-facing API of this PR is identical either way.

 The regression and the code duplication below are two symptoms of the same root cause: after this PR, Array and AsyncArray each independently own array state and derive behavior from it. Caching the handle would fix the regression and delegating to the shared helpers would fix the method duplication. Even with those changes, the state and the derived properties remain mirrored across both classes, and every future change must land on both sides. The facade removes the second source of truth instead of patching its symptoms one at a time.

Also, what's the use-case for a synchronous Array having a custom runner? I would think that most people who want to bring an event loop would just use the async methods. It seems like a YAGNI case, and not worth exposing publicly in this PR since it's not usable via zarr.open_array, zarr.create_array, etc.

regression

On main, Array.async_array returned the shared backing _async_array, and all Array properties read through it — so arr.async_array.resize((N,)) updated arr.shape too. On this branch the property constructs a fresh, throwaway AsyncArray(self.metadata, self.store_path, self.config) on every access. Three consequences, all verified:

  • arr.async_array.resize((N,)) — a previously working pattern — now runs _resize against the detached temporary: the store metadata is updated (array.py:6683) but arr.metadata/arr.shape stay stale. Subsequent arr[...] reads/writes index against the old shape while the store has the new one.
  • aa = arr.async_array; arr.resize(...) leaves aa holding the pre-resize metadata, since resize rebinds arr.metadata via object.__setattr__ on arr only (array.py:4634, 6686).
  • arr.async_array is arr.async_array is now False, breaking identity-based caching.

The changelog frames async_array as "deprecated but still works for now"; for mutating operations it does not work — it silently desynchronizes the handle from the array.

code duplication avoided by using AsyncArray as the single source of truth

AsyncArray's selection methods are one-line delegations to shared module-level helpers:

# AsyncArray.get_orthogonal_selection (array.py:1544)
return await _get_orthogonal_selection(
    self.store_path, self.metadata, self.codec_pipeline, self.config, self._chunk_grid,
    selection, out=out, fields=fields, prototype=prototype,
)

The new Array.get_orthogonal_selection_async re-inlines the body of that same helper instead of calling it:

# Array.get_orthogonal_selection_async (array.py:2865)
if prototype is None:
    prototype = default_buffer_prototype()
indexer = OrthogonalIndexer(selection, self.shape, self._chunk_grid)
return await self._get_selection(indexer=indexer, out=out, fields=fields, prototype=prototype)
# _get_orthogonal_selection — the existing shared helper (array.py:6336)
if prototype is None:
    prototype = default_buffer_prototype()
indexer = OrthogonalIndexer(selection, metadata.shape, chunk_grid)
return await _get_selection(..., indexer=indexer, out=out, fields=fields, prototype=prototype)

The same pattern repeats for the mask, coordinate, and block selection getters and setters. Beyond the methods:

  • Array._info (array.py:1954) is a byte-for-byte copy of AsyncArray._info (array.py:1825) — a 20-line ArrayInfo construction duplicated verbatim.
  • The __init__ state-derivation block (parse_array_metadata / ChunkGrid.from_metadata / create_codec_pipeline) is duplicated between AsyncArray.__init__ (array.py:415–426) and Array.__init__ (array.py:1892–1902).
  • The derived properties (order, read_only, filters, serializer, compressors, nchunks, …) are mirrored across both classes.

With the docstrings these twins carry, that's the few hundred lines. Each pair is a place where a future fix can land on one side and silently miss the other. Sync and async selection silently returning different results for the same call. A facade Array delegating to its held AsyncArray would have exactly one copy of each.

Comment thread src/zarr/core/array.py
Comment on lines +206 to +210
"""Return the array's sharding codec, or `None` if the array is not sharded.

An array is considered sharded when its metadata declares exactly one codec
and that codec is a `ShardingCodec`.
"""
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Return the array's sharding codec, or `None` if the array is not sharded.
An array is considered sharded when its metadata declares exactly one codec
and that codec is a `ShardingCodec`.
"""
"""Return the array's sole sharding codec, or `None`.
The gate used by the chunk/shard accessors: sharding is reported only
when the sharding codec is the only declared codec, because any other
codec (e.g. ``codecs=[sharding, gzip]``) makes the inner chunks not
independently addressable. Not a general sharded-ness predicate; see #4036.
"""

I think it's important to document that this won't always return the sharding codec to prevent misuse.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the code today actually do the check you are describing?

@maxrjones
Copy link
Copy Markdown
Member

Here's Claude's evaluation of the pros/cons of either class being the facade, which I think is the cleanest way to avoid duplicate state. It seems to come down to how much diff/compatibility risk we accept now in exchange for Array being the real implementation and AsyncArray's eventual removal being trivial (i.e., whether #4027's removal goal is a commitment or an aspiration) My preferred goals are to reduce the duplication between the two classes, make whatever duplication remains mechanically checkable (see "Keeping the mirrored pairs in sync" below), advise users toward the single class, but keep the other around in perpetuity so people's code keeps working. By the criterion at the end, this position lands on Option A, with a parity check making the permanent mirroring affordable.

Note, a third shape exists where both classes hold a shared mutable state object, with methods delegating to the module-level helpers this branch already has, but it adds a third class without changing the analysis below much.

Option A — review's direction: AsyncArray owns implementation, Array facades it

Pros

  • Smallest, safest diff. This is main's architecture (where Array is a dataclass holding a single field, _async_array, with all properties reading through it) plus new *_async delegating methods on Array. The shape is battle-tested; downstream (xarray, etc.) has run against it for years.
  • Regression fixed for free. async_array returns the held object — identity, shared state, mutation coherence all come from there being only one stateful object, with zero new machinery.
  • Async API untouched. AsyncGroup.getitem, zarr.api.asynchronous keep constructing self-contained AsyncArrays; no sync machinery (runner, facade plumbing) enters the async path.
  • Clean layering. Async is the primitive, sync wraps it — the conventional direction; every reviewer's prior matches it.
  • No dataclass-semantics churn. AsyncArray keeps its real fields; anything using dataclasses.replace/fields() on it keeps working.

Cons

  • Fights fold AsyncArray methods into Array #4027's end goal. "Eventually remove an entire class" requires a second migration later — moving every implementation off AsyncArray onto Array, roughly this PR's diff again, plus another deprecation cycle.
  • Array stays second-class. The flagship API of this PR is a pure wrapper: every *_async method is a delegation whose signature + docstring must still be mirrored on Array (the few hundred lines of signatures/docs still exist; only the bodies shrink to one line). The "one array class" goal isn't structurally advanced — it's main with new method names.
  • Two-class mental model persists indefinitely, with the docs burden of explaining both.

Option B — inverted: Array owns implementation (async impls as *_async), AsyncArray facades it

Pros

  • Achieves fold AsyncArray methods into Array #4027's end state now. Sync and async versions of each operation are co-located on one class ("put the two versions as close together as possible"); deleting AsyncArray later is removing a shim, not a migration.
  • The new public API is first-class. Real implementations, real docstrings on Array; no delegation hop for new-API users; deprecation pressure points at the legacy class, which is the thin one.
  • Same regression fix, same deduplication. A cached Array.async_array view sharing state fixes all three symptoms; every duplication item in the review (re-inlined selection bodies, _info, __init__ derivation, mirrored properties) collapses to one copy — the line savings are identical to Option A.
  • One-time cost. The read-through plumbing you write now is the last restructuring; Option A's ease is partly deferred cost.

Cons

  • Larger, riskier diff now. AsyncArray's fields (metadata, store_path, config, codec_pipeline, _chunk_grid) become read-through properties over a held _array; all its construction sites (group.py:791, group.py:3506, asynchronous.py:369, ~77 references in group.py) flow through the facade.
  • Dataclass breakage. AsyncArray is @dataclass(frozen=True) with real fields today; converting fields to properties changes dataclasses.fields(), replace(), generated __eq__/repr. Needs an audit of internal and downstream replace() use — this is the most concrete compat hazard.
  • Layering inversion. Every AsyncArray now wraps an Array carrying a _runner it never uses; the async path depends on the unified class's sync machinery. Mostly aesthetic, but it's the part reviewers will push on.
  • The shim still costs its lines — and dynamic delegation doesn't rescue it yet. One might hope the facade could be generated dynamically (__getattr__/class decorator), but AsyncArray is not deprecated for async users — it's the typed return value of the entire public async API. Until that changes, the facade must be fully typed and documented, so it's hand-written one-line delegations with mirrored signatures, not __getattr__ magic. Dynamic delegation only becomes acceptable in a later phase, once the async API returns Array (or equivalent) and AsyncArray is deprecated end-to-end.
  • Property-hop overhead on AsyncArray.metadata-style access in internal async hot paths. Almost certainly negligible, but it's new indirection where Option A adds none.

The symmetric parts (don't discriminate)

  • Duplication eliminated: identical in both.
  • Single-source-of-truth / "changes land once": identical in both.
  • The mirrored-signatures+docstrings burden: identical in both — it just sits on Array in A and on AsyncArray in B. In B it's temporary (deleted with the shim); in A it's permanent (though see the next section — "permanent" is cheap if it's CI-enforced).
  • Runner YAGNI: orthogonal — keep _runner out of the public constructor signature in either design.

Keeping the mirrored pairs in sync mechanically

Whichever class is the facade, the residual duplication is mirrored signatures + docstrings, and that can be guarded by tooling rather than vigilance. The key observation: once the facade's method bodies are one-line delegations, behavior can't drift between the pairs — signature and docstring drift is the only remaining risk, and it's mechanically checkable. In order of weight:

  • Parity test (cheapest — no new infrastructure). A pytest test that pairs foo/foo_async across the two classes by introspection and asserts inspect.signature equality, with an explicit exceptions table for the pairs that legitimately differ (e.g. resize(new_shape) vs resize_async(new_shape, delete_outside_chunks=True)). Drift becomes a CI failure instead of a silent bug, and everything stays hand-written, fully typed, and documented.
  • Docstring sharing. The "This is the asynchronous variant of …" boilerplate can be generated by a small decorator that copies the source-of-truth docstring and prepends the variant note, so the prose is written once.
  • Full codegen (heaviest). Generate the entire facade as checked-in source from the implementation class, with a CI step that fails if regeneration differs — the unasync pattern (psycopg 3 maintains its whole sync API this way; urllib3/httpcore have used unasync). Zero hand-maintenance with full typing and docs, but the generator needs the same exceptions table and is real infrastructure to own.

To be clear, this is a complement to the facade, not a substitute for it: the parity test only guards signatures. The regression above (shared-state desync) and the re-inlined helper bodies are behavioral problems that only the facade fixes.

Decision criterion

It reduces to one question: is #4027's "eventually remove AsyncArray" a commitment or an aspiration? If there's consensus to actually deprecate and remove it (including changing what the async API returns — a bigger conversation than this PR), B's extra diff is a down payment on work that must happen anyway, and A means doing this restructuring twice. If removal is indefinite, B's costs (dataclass churn, facade plumbing, layering inversion) buy nothing that A doesn't deliver more cheaply — especially with a parity test making A's permanent mirroring a CI-enforced non-problem rather than a vigilance problem — and A is simply the better engineering call. A defensible middle path: take A now (it's the minimal fix for the review's confirmed regression and duplication), and treat B as the mechanical follow-up PR that executes the removal decision once it's actually made — A doesn't foreclose B, it just doesn't prepay for it.

d-v-b and others added 7 commits June 7, 2026 09:39
…tion getters

Array.async_array previously returned a fresh, throwaway AsyncArray on every
access, so mutations through the handle (e.g. resize) silently desynced from
the parent Array, and handle identity was unstable. Introduce _AsyncArrayView,
a cached view whose state reads through the parent Array and whose mutations
land on it via a single _rebind_state seam (replacing the raw object.__setattr__
poke in _resize). Array now fully owns its state and AsyncArray depends on it,
so the eventual AsyncArray removal is deleting a shim rather than a migration.

Also addresses the duplication flagged in review:
- Array's orthogonal/mask/coordinate *_async getters now delegate to the shared
  _get_*_selection helpers instead of re-inlining indexer construction.
- The byte-identical AsyncArray._info / Array._info bodies collapse into one
  module-level _array_info helper.

Add a signature-parity test over Array's foo/foo_async pairs (with a documented
exceptions table for getitem/setitem/resize) so the residual mirrored signatures
are CI-enforced rather than vigilance-dependent.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Use ZarrDeprecationWarning (not plain DeprecationWarning) for the legacy
  Array(async_array) constructor and the Array.async_array property, matching
  the rest of array.py so users filtering on the zarr-specific category catch
  them. (Medium)
- Sync update_attributes now returns the Array built by update_attributes_async
  instead of discarding it and constructing a second one. (Low)
- test_nchunks: parametrize over the real [Array, AsyncArray] classes and make
  the AsyncArray branch actually construct and test an AsyncArray; the previous
  [AnyArray, AnyAsyncArray] aliases never compared equal to Array, so both
  branches ran identical assertions and the async path was never exercised. (Low)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…fork

Address roborev findings (job 255), both about locking in already-correct
behavior with tests rather than restructuring:

- test_sync_async_property_parity: the ~24 scalar/derived properties that are
  reimplemented independently on Array and AsyncArray are now asserted to agree
  by value across a v2/v3 x sharded/unsharded matrix, so a fix landing on one
  class but not the other fails CI (the bodies can no longer drift silently).
- test_async_handle_with_config_returns_detached_async_array: asserts
  Array.async_array.with_config(...) returns a real AsyncArray (not another
  view bound to the parent) and that the result is detached, so later parent
  mutations don't leak into it.

Verified the parity test catches drift by temporarily breaking Array.order and
confirming the v3 cases failed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Address roborev finding (job 257): the property-parity matrix builds both the
Array and the AsyncArray from the same freshly-created array with default
config and a writable store, so `order` (v3 reads self.config.order) and
`read_only` (reads self.store_path.read_only) were always compared against
identical inputs and could never reveal drift.

Add two targeted tests that build the pair from a non-default order ("F", v3)
and a read-only store, so those config/store-derived branches are
cross-checked divergently. Verified both fail if Array.order / Array.read_only
drift from their AsyncArray twins.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Address roborev finding (job 259): the docstrings cross-referenced __getitem__/
__setitem__, but those sync dunders pop_fields and route fancy/orthogonal
selections to vindex/get_orthogonal_selection, whereas getitem_async/
setitem_async call _getitem/_setitem directly and support basic indexing only.
A caller following the old cross-reference with a fancy selection would hit a
BasicIndexer error or get different results than arr[...].

Reword both to state they are the async counterparts of get_basic_selection/
set_basic_selection (basic indexing only) and point to the orthogonal/
coordinate/mask *_async methods for advanced indexers. Doc-only; behavior
unchanged (matches the long-standing AsyncArray.getitem convention).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@d-v-b
Copy link
Copy Markdown
Contributor Author

d-v-b commented Jun 7, 2026

It reduces to one question: is #4027 "eventually remove AsyncArray" a commitment or an aspiration?

Yes, the whole point is to have a single class that has the sync and async apis. That class should be the Array class, not the AsyncArray class, because the Array class is the primary user-facing class. Once the Array class does everything the AsyncArray class used to do, there is no reason for the AsyncArray class to exist, and we can remove it. Consolidating functionality on the Array class and eventually removing the AsyncArray class is the entire goal of this effort.

@d-v-b
Copy link
Copy Markdown
Contributor Author

d-v-b commented Jun 7, 2026

the regressions are fixed and I (via claude) added tests to ensure that the Array and AsyncArray have consistent API surfaces.

Array.setitem_async marked `prototype` keyword-only, but AsyncArray.setitem
(the API consumers are migrating off) accepts it positionally. A caller doing
`await async_array.setitem(sel, value, proto)` would hit a TypeError after
switching to `arr.setitem_async(sel, value, proto)`. Drop the `*` so prototype
is positional-or-keyword, matching AsyncArray.setitem exactly.

Add a kind-aware, cross-class signature test asserting Array.getitem_async /
setitem_async match AsyncArray.getitem / setitem including each parameter's
kind (the existing within-Array parity test compares names only and excludes
getitem/setitem), plus a test that setitem_async accepts a positional
prototype.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants