Skip to content

Make expandable a list; default expands only DynamicTable columns#1439

Open
bendichter wants to merge 6 commits intodevfrom
default-expandable-vectordata
Open

Make expandable a list; default expands only DynamicTable columns#1439
bendichter wants to merge 6 commits intodevfrom
default-expandable-vectordata

Conversation

@bendichter
Copy link
Copy Markdown
Contributor

@bendichter bendichter commented Apr 7, 2026

Summary

  • Change the default expandable behavior in HDF5IO.write / write_builder / write_group / write_dataset so that only DynamicTable columns and id are made expandable out of the box. Datasets of all other types now default to fixed-shape on-disk layout.
  • Breaking change: expandable is now a list (or tuple) of data type names with default ("VectorData", "ElementIdentifiers"), not a boolean. A dataset is made expandable if its data type (or an ancestor) is in the list — so subclasses of VectorData (e.g. DynamicTableRegion) and ElementIdentifiers are also covered. Pass an empty list/tuple to disable automatic expansion entirely. Passing True/False now raises a TypeError from docval.
  • Subclass detection uses BuildManager.is_sub_data_type, with a get_builder_dt short-circuit so untyped dataset builders don't hit the hierarchy lookup.
  • CHANGELOG entry added under a new Breaking changes heading with migration guidance.

Closes #1437

Test plan

  • TestDefaultExpandableVectorData, VectorIndex, indexed VectorData, and ElementIdentifiers are expandable by default
  • TestDefaultExpandableSubclasses — user-defined subclasses of VectorData / ElementIdentifiers are expandable by default (regression test for the is_sub_data_type fix)
  • TestDefaultExpandableWithReferences — reference-column dataset carries a reference dtype, is expandable, and refs resolve round-trip
  • TestDefaultExpandableExplicitOverride — explicit H5DataIO(maxshape=...) is preserved, not overridden
  • TestExpandableArgexpandable=[] disables expansion; custom list expands only listed types; expandable=True/False raises TypeError
  • TestExpand.test_expand_set_shapeexpandable=['QuxData'] still supports the append path on non-table types
  • Existing TestDTRReferencesDynamicTableRegion with references still works
  • Full test suite passes locally

🤖 Generated with Claude Code

…iers

Change the default `expandable` behavior so that only VectorData (and
subclasses like VectorIndex, DynamicTableRegion, EnumData) and
ElementIdentifiers datasets are automatically made expandable. Other
dataset types now default to contiguous (non-expandable) storage.

The `expandable` parameter is changed from default=True to default=None
with three-way semantics: None (default) expands only table column/id
datasets, True expands all datasets, False expands none.

Closes #1437

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.21%. Comparing base (15f1d77) to head (31e453c).

Additional details and impacted files
@@            Coverage Diff             @@
##              dev    #1439      +/-   ##
==========================================
+ Coverage   93.19%   93.21%   +0.01%     
==========================================
  Files          41       41              
  Lines       10062    10062              
  Branches     2069     2069              
==========================================
+ Hits         9377     9379       +2     
  Misses        409      409              
+ Partials      276      274       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@bendichter bendichter requested review from oruebel and rly April 14, 2026 16:01
Comment thread src/hdmf/backends/hdf5/h5tools.py Outdated
rly and others added 3 commits April 14, 2026 23:42
Also update wording in docstrings to clarify that subclasses of both
VectorData and ElementIdentifiers are expanded by default, and add a
roundtrip test that verifies schema-defined subclasses get maxshape set.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix docstring grammar in the four expandable docval blocks.
- Add a CHANGELOG entry describing the default expandable change as a bugfix.
- Assert maxshape=(None,) in test_roundtrip_basic_append to explicitly
  cover the expandable=True path on a non-VectorData dataset.
- Add an MWE test confirming docval accepts default=None with type=bool
  while still rejecting wrongly-typed values at runtime.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Verifies that expandable=False produces fixed-shape datasets for both
VectorData columns and the ElementIdentifiers id, rather than maxshape=(None,).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@rly
Copy link
Copy Markdown
Contributor

rly commented Apr 15, 2026

Looks good. I made some small changes:

Summary

  • Fixed ElementIdentifiers subclass detection in HDF5IO.write_dataset: replaced name equality (builder_dt == 'ElementIdentifiers') with self.manager.is_sub_data_type(builder, 'ElementIdentifiers') so subclasses are also made expandable by default. Updated the matching docstring wording in the four expandable docval blocks.
  • Added CHANGELOG.md entry under Unreleased → Fixed describing the full default-expandable behavior change.

Tests added

  • TestDefaultExpandableSubclasses (in tests/unit/common/test_table.py) — regression test for the is_sub_data_type fix. Uses create_load_namespace_yaml to define CustomElementIdentifiers and CustomVectorData subclasses, writes a DynamicTable using them, and asserts both id and custom_col get maxshape=(None,).
  • TestExpandableFalse — asserts expandable=False yields fixed shapes (2,) for both a VectorData column and the ElementIdentifiers id.
  • test_docval_none_default_with_non_none_type (in tests/unit/utils_test/test_docval.py) — MWE confirming docval accepts default=None with type=bool (default preserved, wrong types still rejected), justifying the narrow 'type': bool in the expandable docval.
  • test_roundtrip_basic_append — now passes expandable=True explicitly and asserts f['.../my_data'].maxshape == (None,) to cover the force-all path on a non-VectorData dataset.

Comment thread src/hdmf/backends/hdf5/h5tools.py Outdated
…names; update related tests for new behavior
@oruebel
Copy link
Copy Markdown
Contributor

oruebel commented Apr 17, 2026

Looks good to me. I'll let @rly do final approval

…Error check

- Move the six default-expandable tests from tests/unit/common/test_table.py
  to tests/unit/test_io_hdf5_h5tools.py; prune now-unused imports and restore
  PEP 8 double blank line before TestVectorIndexDtype.
- Merge TestExpandableEmpty and TestExpandableCustomList into TestExpandableArg
  with a shared fixture, and add test_bool_raises_type_error to pin the new
  TypeError contract on expandable=True/False.
- Strengthen TestDefaultExpandableWithReferences to assert the reference
  dtype and round-trip resolution of stored refs.
- Update CHANGELOG to call out the change as a breaking change and note the
  silent default layout shift for non-DynamicTable datasets.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rly rly changed the title Set expandable=True by default only for VectorData and ElementIdentifiers Make expandable a list; default expands only DynamicTable columns Apr 17, 2026
@rly
Copy link
Copy Markdown
Contributor

rly commented Apr 17, 2026

I updated the changelog and moved the tests to a more appropriate location since expandable is no longer specific to tables. I also updated the PR summary so that the squash commit reflects the current state of changes. This is technically a breaking change and #1440 will be a major change in behavior, so the next release will be 6.0.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Set expandable=True by default only for VectorData datasets

3 participants