Skip to content

fix: entity refs with n saveto/vars in different groups/scopes#827

Merged
lognaturel merged 7 commits intoXLSForm:masterfrom
lindsay-stevens:pyxform-823
Mar 15, 2026
Merged

fix: entity refs with n saveto/vars in different groups/scopes#827
lognaturel merged 7 commits intoXLSForm:masterfrom
lindsay-stevens:pyxform-823

Conversation

@lindsay-stevens
Copy link
Contributor

@lindsay-stevens lindsay-stevens commented Mar 11, 2026

Closes #822
Closes #823
Closes #825
Closes #826

Why is this the best possible solution? Were any other approaches considered?

Satisfies existing tests and new cases from the bug reports.

  • addresses issues where:
    • savetos for one list were in multiple groups
    • variable refs were in multiple valid scopes
    • nested savetos for multiple lists had conflicting meta/entity blocks
    • entity block allocated to group unnecessarily

Something to consider is the new test test_ref_scope_conflict__depth_1_sibling_repeat__var_only__indexed_repeat__error shows a seemingly valid ${} variable usage that is currently blocked by an error. The error assumes that any usage of that pattern is wrong but it is possible to write a form as shown which is correct. This could be considered an limitation, in that the error might help more than it hinders, and any false positives could be worked around by using a literal XPath instead of the offending ${} variable, assuming there is already at least one ${} or save_to to allocate the entity in the desired node/scope. Alternatively the error could be downgraded to a warning (first). It may be complicated to implement an exception for the valid expression case i.e. to parse it such that it can be identified as a probably valid usage.

Also a minor note for 1dfc1d7 / #825 - maybe this is not worth keeping long term? I added it to the PR in case it seemed important. For example I assume #825 came about due to references to/from the entity (or something else like old Central version) breaking, rather than from inspecting the XForm as such.

What are the regression risks?

As seen on the forum some users may have already adapted forms to work around the closed issues. So it's possible that their adapted forms produce a different result (different entity location or different error) after this, depending on what changes were made to the form.

Does this change require updates to documentation? If so, please file an issue here and include the link below.

None other than what was already planned.

Before submitting this PR, please make sure you have:

  • included test cases for core behavior and edge cases in tests
  • run python -m unittest and verified all tests pass
  • run ruff format pyxform tests and ruff check pyxform tests to lint code
  • verified that any code or assets from external sources are properly credited in comments

- addresses issues where:
  - savetos for one list were in multiple groups
  - variable refs were in multiple valid scopes
- when there are multiple properties for an entity in different
  containers, reserve/hold all containers in the path between where the
  save_to's are and where the meta/entity block is placed, so that no
  other meta/entity blocks can go in there and steal the save_to
- prefer entity allocations to the deepest scope, whether that is
  written earlier or later in the form, so that references to outside
  that scope can work properly
- prefer entities in the order they are written in the entities sheet,
  rather than also considering presence of save_to and the reference
  depth. The latter two factors aren't needed and probably make it
  harder to control / understand i.e. users can rearrange the entities
  sheet to more directly get different results.
- instead of just the full path length, find the deepest boundary scope
  first and then among that scope find the deepest container
- tidied up tests to use xpath helper instead of xpath string
- added test cases specific to bug reports and some extra variations
- in get_allocation_request, swap from set to dict to preserve order of
  saveto_lineages so that the conflict_dataset row number is stable
- get_scope_boundary not needed on ReferenceSource since it's on path
- probably clearer to make get_scope_boundary_node_count zero based
  since get_scope_boundary_subpath_node_count is zero based
@lindsay-stevens lindsay-stevens marked this pull request as ready for review March 13, 2026 15:39
@lognaturel
Copy link
Contributor

Phew, thanks for this quick fix! I was really dancing around the core regression with the issues I filed and I'm so sorry I didn't catch it before release. I got really focused on the new functionality and didn't do enough thinking about existing forms which is an important lesson to re-learn.

I assume #825 came about due to references to/from the entity (or something else like old Central version) breaking, rather than from inspecting the XForm as such.

Exactly, @ramaritz confirmed at #825 (comment) that they were using an old Central version. Given they reported it so quickly I think it's likely others are having the same issue. I think the comment makes the code super clear and it feels worth keeping. It's far, far from the most complex part of this work!

any false positives could be worked around by using a literal XPath instead of the offending ${} variable

That's a really interesting one. I can imagine it for something like a household survey where the first repeat instance represents a head of household. I think this workaround is sufficient -- someone writing this kind of form likely prefers raw XPath to indexed-repeat anyway. We can add some notes about cases like this in the docs as well. I agree it's not worth adding an exception to detect it as valid.

different entity location or different error

Good point, I'll make sure to message about that. We're pretty confident different Entity locations wouldn't make a difference as long as the same savetos are recognized and I can't currently imagine how they would get new errors but you're right we should encourage another round of testing.

],
)

def test_unsolvable_meta_topology__depth_1_repeat__conflict_2_group__saveto_only__error(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really like the test sequences of "this is impossible", "these are form design approaches to make it possible".

@lognaturel
Copy link
Contributor

I'm pretty sure repeat-outer-inner-group.xlsx was in fact broken in the released version. Now I have tried it very carefully and seen that there are similar shapes in tests.

for scope_path, requests in scope_paths.items():
scope_path_depth_limit = len(scope_path.nodes) - 1

# Prioritise save_to references but otherwise try to put deepest allocation first.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that's not quite working as intended?

    md = """
    | survey |
    | | type         | name | label | save_to |
    | | begin_repeat | r1   | R1    |         |
    | | begin_group  | g1   | G1    |         |
    | | text         | q1   | Q1    | e2#p1   |
    | | text         | q2   | Q2    |         |
    | | end_group    |      |       |         |
    | | begin_group  | g2   | G2    |         |
    | | text         | q3   | Q3    |         |
    | | end_group    |      |       |         |
    | | end_repeat   |      |       |         |

    | entities |
    | | list_name | label                   |
    | | e1        | concat(${q1}, ${q2}, ${q3}) |
    | | e2        | E2                      |
    """

I would expect e2 to be allocated first to g1 and then e2 to r1.

allocation-order.xlsx

Swapping the declaration order on the entities sheet works as expected.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue for this case filed in #829

@lognaturel lognaturel merged commit bff91ea into XLSForm:master Mar 15, 2026
14 checks passed
@lindsay-stevens lindsay-stevens deleted the pyxform-823 branch March 16, 2026 16:39
lindsay-stevens added a commit to lindsay-stevens/pyxform that referenced this pull request Mar 16, 2026
- test cases follow review feedback from pyxform/XLSForm#827
- entities_parsing.py:
  - L501: remove comment that doesn't add much (covered by docstring)
  - L680: update comment to reflect current code behaviour
lindsay-stevens added a commit to lindsay-stevens/pyxform that referenced this pull request Mar 16, 2026
- test cases follow review feedback from pyxform/XLSForm#827
- entities_parsing.py:
  - L501: remove comment that doesn't add much (covered by docstring)
  - L680: update comment to reflect current code behaviour
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants