feat: Raise on duplicated aliases by gab23r · Pull Request #291 · Quantco/dataframely

gab23r · 2026-03-03T15:46:17Z

Raises an ImplementationError when a schema contains columns with duplicate aliases.

Motivation

I was using an LLM to generate dataframely schemas, and it produced duplicated aliases, as in:

class MySchema(dy.Schema):
    a = dy.Int64(alias="a")
    b = dy.Int64(alias="a")

It would be nice if dataframely raised an error in this case.

Changes

Detect duplicate aliases within a single schema
Detect duplicate aliases when inheriting from a parent schema
Add tests for both cases
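
A minimal sketch of the kind of check described above, written with plain Python stand-ins rather than dataframely's actual classes. The `Column` dataclass and the `find_duplicate_names` helper are hypothetical names introduced only for illustration; the real check is implemented in `dataframely/_base_schema.py` and may look different.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Column:
    """Hypothetical stand-in for a schema column; only the alias matters here."""

    alias: Optional[str] = None


def find_duplicate_names(columns: dict[str, Column]) -> list[str]:
    """Return resolved names (alias, else attribute name) that occur more than once."""
    resolved = [col.alias or attr for attr, col in columns.items()]
    counts = Counter(resolved)
    return sorted(name for name, n in counts.items() if n > 1)


# Mirrors the schema from the motivation: both attributes resolve to "a".
duplicates = find_duplicate_names({"a": Column(alias="a"), "b": Column(alias="a")})

# Distinct resolved names ("a" and "c") produce no duplicates.
no_duplicates = find_duplicate_names({"a": Column(), "b": Column(alias="c")})
```

In this sketch a non-empty result would be the trigger for raising an error at class-definition time.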

codecov · 2026-03-03T15:48:56Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (b3edd6a) to head (a421961).

Additional details and impacted files

@@            Coverage Diff            @@
##              main      #291   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           54        54           
  Lines         3121      3125    +4     
=========================================
+ Hits          3121      3125    +4


Copilot

Pull request overview

This PR adds early validation to schema definition to prevent ambiguous schemas where multiple Columns resolve to the same effective column name (via alias), raising ImplementationError during class creation and adding regression tests.

Changes:

Raise ImplementationError when two columns in the same schema resolve to the same name (alias or attribute name).
Raise ImplementationError when a child schema introduces a column whose resolved name clashes with an inherited one.
Add tests covering both same-schema and inherited duplicate-alias scenarios.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
`dataframely/_base_schema.py`	Adds duplicate-name checks during metadata collection/merging to raise `ImplementationError` early.
`tests/columns/test_alias.py`	Adds tests asserting duplicate aliases are rejected for same-schema and inherited definitions.

Copilot · 2026-03-03T15:50:36Z

dataframely/_base_schema.py

+        if duplicated_column_names := self.columns.keys() & other.columns.keys():
+            raise ImplementationError(
+                f"Columns {duplicated_column_names} are duplicated."
+            )


Metadata.update() now raises on any overlapping column keys between self and other. Because _get_metadata_recursively() traverses __bases__ recursively (not the MRO), diamond / multiple-inheritance patterns can visit the same ancestor more than once (e.g. A->B, A->C, D(B, C)) and would now raise even though the duplicated columns come from the same original definition. Consider de-duplicating bases during metadata collection (e.g. walk a linearized MRO once) or, at minimum, allow overlaps where the column objects are identical and only raise when the overlapping key maps to different Column instances.

Suggested change

-        if duplicated_column_names := self.columns.keys() & other.columns.keys():
-            raise ImplementationError(
-                f"Columns {duplicated_column_names} are duplicated."
-            )
+        """Merge another Metadata instance into this one.
+
+        Overlapping keys are allowed if and only if they refer to the *same*
+        underlying object. This accommodates multiple-inheritance / diamond
+        patterns where the same base schema is visited more than once.
+        """
+        # Detect conflicting column definitions: same name, different Column instance
+        duplicated_column_names = self.columns.keys() & other.columns.keys()
+        conflicting_columns = {
+            name
+            for name in duplicated_column_names
+            if self.columns[name] is not other.columns[name]
+        }
+        if conflicting_columns:
+            raise ImplementationError(
+                f"Columns {conflicting_columns} are duplicated with conflicting definitions."
+            )
+        # Detect conflicting rule factories: same name, different object
+        duplicated_rule_names = self.rules.keys() & other.rules.keys()
+        conflicting_rules = {
+            name
+            for name in duplicated_rule_names
+            if self.rules[name] is not other.rules[name]
+        }
+        if conflicting_rules:
+            raise ImplementationError(
+                f"Rules {conflicting_rules} are duplicated with conflicting definitions."
+            )
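
The diamond-inheritance concern in this comment can be reproduced with plain Python classes. The sketch below is not dataframely's `_get_metadata_recursively()`; `visit_bases` is a hypothetical helper that only shows how a recursive walk over `__bases__` reaches a shared ancestor twice, whereas the MRO lists each class once.

```python
from __future__ import annotations


class A: ...
class B(A): ...
class C(A): ...
class D(B, C): ...


def visit_bases(cls: type) -> list[str]:
    """Walk __bases__ recursively, recording every visit (repeats included)."""
    visited = [cls.__name__]
    for base in cls.__bases__:
        if base is not object:
            visited.extend(visit_bases(base))
    return visited


# Recursive traversal: A is reached once through B and once through C.
recursive_visits = visit_bases(D)

# Linearized MRO: every class appears exactly once.
mro_visits = [c.__name__ for c in D.__mro__ if c is not object]
```

If metadata were merged once per visit, the second visit to `A` would hit the new overlap check even though both visits see the same column objects. That is the case the identity comparison (`is not`) in the suggestion is meant to allow.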

Copilot · 2026-03-03T15:50:36Z

dataframely/_base_schema.py

+            raise ImplementationError(
+                f"Columns {duplicated_column_names} are duplicated."


The exception message uses the raw set representation of duplicated_column_names (derived from key-set intersection). For multiple duplicates this will be non-deterministically ordered, which can make debugging and tests brittle. Consider formatting a stable, sorted list of duplicated names (and optionally quoting each name) instead of interpolating the set directly.

Suggested change

-            raise ImplementationError(
-                f"Columns {duplicated_column_names} are duplicated."
+            formatted_names = ", ".join(
+                sorted(repr(name) for name in duplicated_column_names)
+            )
+            raise ImplementationError(
+                f"Columns {formatted_names} are duplicated."
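
To illustrate the point about stable messages, here is a small standalone sketch. `format_names` is a hypothetical helper, not part of dataframely. Iteration order of a set of strings can differ between interpreter runs, while a sorted, quoted rendering is always the same.

```python
from __future__ import annotations


def format_names(names: set[str]) -> str:
    """Render names as a sorted, comma-separated list of quoted strings."""
    return ", ".join(sorted(repr(name) for name in names))


duplicated = {"b", "a", "c"}
message = f"Columns {format_names(duplicated)} are duplicated."
```

A test can then compare against the full message text instead of matching on a set's repr.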

raise on duplicated alias

a421961

Copilot AI review requested due to automatic review settings March 3, 2026 15:46

gab23r requested review from AndreasAlbertQC, borchero and delsner as code owners March 3, 2026 15:46

Copilot started reviewing on behalf of gab23r March 3, 2026 15:46

gab23r changed the title ~~Raise on duplicated aliases~~ feat: Raise on duplicated aliases Mar 3, 2026

github-actions bot added the enhancement New feature or request label Mar 3, 2026

Copilot AI reviewed Mar 3, 2026

gab23r wants to merge 1 commit into Quantco:main from gab23r:raise-duplicated-alias
