feat: safe type expansion, NVARCHAR/NCHAR catalog fix, seed empty cell fix #702

Merged
axellpadilla merged 11 commits into master from fix/missing-type-handling
Jun 25, 2026

@axellpadilla axellpadilla commented Jun 9, 2026

Collaborator

Summary

This PR addresses several long-standing issues with SQL Server native type handling during column expansion, catalog generation, and seed ingestion. The issues are closely related, so they were reproduced and fixed together in a single pass.

Closes: #701, #637, #425, #446
Supersedes: #606, thanks @Cogito

What Changed

1. SQL Server Native String Type Recognition (sqlserver_column.py)

  • is_string() now includes nvarchar and nchar in addition to varchar and char
  • string_type_instance() — new instance method that preserves the original type family:
    • nvarchar(n) emits nvarchar(n) (not varchar(n))
    • nchar(n) emits nchar(n) (not char(n))
    • Falls back to varchar(n) / char(n) for non-Unicode types
  • data_type property now uses string_type_instance() instead of string_type()
  • is_numeric() — includes is_fixed_numeric and is_decimal_numeric
  • is_fixed_numeric() — new method for money/smallmoney
  • is_decimal_numeric() — new method for decimal/numeric
  • is_integer() now includes tinyint and bit
  • can_expand_to() — stricter: only allows same-family string size increases (e.g., varchar(10) → varchar(25))
  • max — string can now be expanded to max
  • can_expand_safe() — new method for flag-gated safe expansions:

| Source | Target | Allowed? |
| --- | --- | --- |
| varchar(n) | nvarchar(m) where m >= n | With flag |
| char(n) | nchar(m) where m >= n | With flag |
| bit → tinyint → smallint → int → bigint | Higher in family | With flag |
| int | numeric(p,s) where p >= 10 | With flag |
| numeric(p,s) | numeric(p2,s2) where p2 >= p and s2 >= s | With flag |
| smallmoney | money | With flag |
| money | numeric(p,s) where p >= 19 | With flag |
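
The rules in the table can be sketched in a few lines of Python. This is an illustration of the table exactly as written in this description, not the adapter's `can_expand_safe()` code: `parse_type`, `precision_scale`, and `can_expand_safe_sketch` are hypothetical names, `decimal` is treated as an alias of `numeric`, and `max` sizes are deliberately not handled. The review discussion further down points out that the precision-only rule for numeric targets is too permissive.

```python
import re
from typing import Tuple

# Integer family ordered from narrowest to widest, as listed in the table.
INTEGER_FAMILY = ["bit", "tinyint", "smallint", "int", "bigint"]


def parse_type(type_str: str) -> Tuple[str, Tuple[int, ...]]:
    """Split 'numeric(18,2)' into ('numeric', (18, 2)) and 'int' into ('int', ()).

    Sizes such as varchar(max) are out of scope here and raise ValueError.
    """
    match = re.fullmatch(r"\s*(\w+)\s*(?:\(([\d\s,]+)\))?\s*", type_str.lower())
    if match is None:
        raise ValueError(f"unsupported type string: {type_str!r}")
    args = tuple(int(a) for a in match.group(2).split(",")) if match.group(2) else ()
    name = "numeric" if match.group(1) == "decimal" else match.group(1)
    return name, args


def precision_scale(args: Tuple[int, ...]) -> Tuple[int, int]:
    """Return (precision, scale), using SQL Server's defaults of 18 and 0."""
    if len(args) == 0:
        return 18, 0
    if len(args) == 1:
        return args[0], 0
    return args[0], args[1]


def can_expand_safe_sketch(source: str, target: str) -> bool:
    """Hypothetical stand-in that mirrors the table row by row."""
    src, src_args = parse_type(source)
    dst, dst_args = parse_type(target)

    # varchar(n) -> nvarchar(m) and char(n) -> nchar(m), where m >= n
    if (src, dst) in (("varchar", "nvarchar"), ("char", "nchar")):
        return len(src_args) == 1 and len(dst_args) == 1 and dst_args[0] >= src_args[0]

    # bit -> tinyint -> smallint -> int -> bigint: only upwards
    if src in INTEGER_FAMILY and dst in INTEGER_FAMILY:
        return INTEGER_FAMILY.index(dst) > INTEGER_FAMILY.index(src)

    if dst == "numeric":
        dst_p, dst_s = precision_scale(dst_args)
        if src == "int":
            return dst_p >= 10
        if src == "numeric":
            src_p, src_s = precision_scale(src_args)
            return dst_p >= src_p and dst_s >= src_s
        if src == "money":
            return dst_p >= 19

    return src == "smallmoney" and dst == "money"
```

For example, `can_expand_safe_sketch("varchar(10)", "nvarchar(25)")` is allowed, while the reverse direction `("nvarchar(10)", "varchar(25)")` is not.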

2. Safe Type Expansion Feature Flag (sqlserver_adapter.py)

New dbt_sqlserver_enable_safe_type_expansion behaviour flag (default: false):

# dbt_project.yml
flags:
  dbt_sqlserver_enable_safe_type_expansion: true

When enabled, the adapter's expand_column_types() override performs:

  1. Same-family string resizes — always proceed (e.g., varchar(10) → varchar(25))
  2. Safe type expansions — only when flag is enabled AND column_type_expansion_max_rows is not exceeded:
    • Cross-family string: varchar/char → nvarchar/nchar
    • Integer family promotions
    • Integer → numeric with sufficient precision
    • numeric/decimal precision/scale upgrades
    • Fixed-money promotions (smallmoney → money → numeric)

expand_target_column_types() — new public API that forwards the max_rows parameter, called from incremental and snapshot materializations.

alter_column_type() — new method that dispatches to the sqlserver__alter_column_type macro, replacing the base adapter's implementation.

3. Row-Count Guardrail (column_type_expansion_max_rows)

New per-model config (default: 1,000,000):

{{ config(materialized='incremental', unique_key='id',
           column_type_expansion_max_rows=500000) }}
  • Safe type expansion is skipped when the table exceeds this row count
  • Set to -1 to disable the check
  • Set to 0 to always skip safe expansion (only same-family string resizes proceed)
  • Skipped expansions emit a warning log with the row count and limit
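
A minimal sketch of how the flag and the row-count guardrail could combine, assuming "exceeds" means strictly greater than the limit and that an unset config falls back to the documented default of 1,000,000. The function name `safe_expansion_allowed` and the log wording are hypothetical, not taken from the adapter.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)

# Documented default for column_type_expansion_max_rows.
DEFAULT_MAX_ROWS = 1_000_000


def safe_expansion_allowed(
    flag_enabled: bool, row_count: int, max_rows: Optional[int] = None
) -> bool:
    """Decide whether a flag-gated safe expansion may run (hypothetical helper).

    Same-family string resizes are not covered here; per the description
    they proceed regardless of the flag and the row count.
    """
    if not flag_enabled:
        return False

    limit = DEFAULT_MAX_ROWS if max_rows is None else max_rows

    # -1 disables the row-count check entirely.
    if limit == -1:
        return True

    # 0 always skips; otherwise skip only when the table exceeds the limit.
    if limit == 0 or row_count > limit:
        logger.warning(
            "Skipping safe type expansion: table has %d rows, limit is %d",
            row_count,
            limit,
        )
        return False

    return True
```

With the config shown above (`column_type_expansion_max_rows=500000`), a table of exactly 500,000 rows would still be expanded under this reading, and a table of 500,001 rows would be skipped with a warning.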

4. Single ALTER COLUMN Mode (prefer_single_alter_column)

New per-model config (default: false):

{{ config(materialized='incremental', unique_key='id',
           prefer_single_alter_column=true) }}

When true, the sqlserver__alter_column_type macro uses a single ALTER COLUMN statement instead of the safer add+update+drop+rename pattern. This is faster for small/medium tables and instant for safe type expansions, but may fail for types that cannot be implicitly converted.
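
To make the difference between the two strategies concrete, here is a rough sketch of the T-SQL each one would issue. It is written in Python purely for illustration; the real logic lives in the `sqlserver__alter_column_type` Jinja macro and may differ. The function name, the temporary column suffix `__dbt_alter`, and the lack of identifier quoting are all assumptions.

```python
from typing import List


def alter_column_statements(
    table: str, column: str, new_type: str, single_alter: bool = False
) -> List[str]:
    """Return the statements for one column type change (illustrative only)."""
    if single_alter:
        # One in-place statement; relies on SQL Server converting the
        # existing values implicitly, so it can fail for some type pairs.
        return [f"ALTER TABLE {table} ALTER COLUMN {column} {new_type};"]

    # Hypothetical temporary column name used while copying the data over.
    tmp = f"{column}__dbt_alter"
    return [
        f"ALTER TABLE {table} ADD {tmp} {new_type};",
        f"UPDATE {table} SET {tmp} = {column};",
        f"ALTER TABLE {table} DROP COLUMN {column};",
        f"EXEC sp_rename '{table}.{tmp}', '{column}', 'COLUMN';",
    ]
```

The four-statement path rewrites every row through the `UPDATE`, which is why the single-statement mode is attractive for widenings that SQL Server can apply in place.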

5. Catalog Fix (catalog.sql)

Changed sys.types join from system_type_id to user_type_id in both catalog queries. This prevents NVARCHAR/NCHAR columns from appearing as SYSNAME in dbt docs generate output. Fixes #637.
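
The effect of the join key can be reproduced with a tiny in-memory stand-in for `sys.types`. The sketch below assumes the usual SQL Server IDs, where `sysname` is an alias type that shares `system_type_id` 231 with `nvarchar` but has its own `user_type_id` of 256; the column dictionary and the `join_types` helper are made up for the example.

```python
# Simplified rows from sys.types (IDs as commonly seen in SQL Server).
SYS_TYPES = [
    {"name": "varchar", "system_type_id": 167, "user_type_id": 167},
    {"name": "nvarchar", "system_type_id": 231, "user_type_id": 231},
    {"name": "sysname", "system_type_id": 231, "user_type_id": 256},
]

# A hypothetical nvarchar column as it would appear in sys.columns.
COLUMN = {"name": "title", "system_type_id": 231, "user_type_id": 231}


def join_types(column: dict, key: str) -> list:
    """Return the type names that match the column on the given join key."""
    return [t["name"] for t in SYS_TYPES if t[key] == column[key]]


# Joining on system_type_id matches both nvarchar and sysname.
print(join_types(COLUMN, "system_type_id"))
# Joining on user_type_id matches nvarchar only.
print(join_types(COLUMN, "user_type_id"))
```

Because the old join produced two candidate rows for one column, the catalog could end up reporting the alias name instead of the declared type.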

6. Seed Empty Cell Fix (helpers.sql)

Changed seed CSV ingestion to inline NULL literals instead of binding empty cells as SQL parameters. Previously, an empty cell in a numeric(18,0) column would be bound as an empty string parameter, causing arithmetic overflow error 8115. Now empty cells emit null directly in the VALUES clause. Fixes #425.
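
The idea can be sketched as follows. The actual change is in the Jinja helpers in helpers.sql, so this Python version is only an analogy: `build_values_row` is a hypothetical helper, and the `?` placeholder style is an assumption about the driver.

```python
from typing import Any, List, Sequence, Tuple


def build_values_row(cells: Sequence[Any]) -> Tuple[str, List[Any]]:
    """Build one VALUES tuple, inlining null for empty cells.

    Non-empty cells become bound parameters; empty cells never reach the
    driver, so they cannot be sent as an empty string to a numeric column.
    """
    fragments: List[str] = []
    params: List[Any] = []
    for cell in cells:
        if cell is None or cell == "":
            fragments.append("null")
        else:
            fragments.append("?")
            params.append(cell)
    return "(" + ", ".join(fragments) + ")", params
```

For a seed row `[1, "", "abc"]` this yields the fragment `(?, null, ?)` with parameters `[1, "abc"]`, so only two values are bound.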

7. Adapter Configs (sqlserver_configs.py)

Added two new optional config fields:

  • prefer_single_alter_column: Optional[bool] = False
  • column_type_expansion_max_rows: Optional[int] = None

8. Unit Tests

  • test_sqlserver_column.py — Tests for is_string(), string_type_instance(), data_type, is_fixed_numeric(), is_numeric(), string_size() across all string/numeric type families
  • test_can_expand_to.py — Parameterized tests for can_expand_to() and can_expand_safe() covering same-family resizes, cross-family promotions, integer family promotions, numeric precision/scale upgrades, fixed-money promotions, and prevented shrinking conversions
  • test_expand_column_types.py — Tests for the adapter's expand_column_types() method: row-count skip, max-rows=0 blocking, warning emission, max_rows forwarding through expand_target_column_types()

9. Functional Tests


Note Changes / Migration Notes

  • Use is_number() (covers all numeric types including money)
  • Use is_fixed_numeric() for money types specifically
  • Use is_decimal_numeric() only for numeric/decimal types

Related PRs & History

@Benjamin-Knight

Collaborator

Reviewing now, big change with lots of potential for issues so may take a bit longer.

@Benjamin-Knight

Collaborator

When safe type expansion is enabled, it allows some column changes that don't actually fit the existing values. For example, it will widen an int to numeric(10,5) or numeric(10,2) to numeric(12,5), but in both cases the integer portion can overflow. We can't just check the overall precision and scale; we need to check that the integer part is wide enough.
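
One way to express the check described in this comment, as a sketch rather than the fix that was eventually merged: compare the number of digits left of the decimal point (precision minus scale) instead of precision alone. The helper names and the `INT_DIGITS` table are hypothetical; the digit counts come from the maximum values of each SQL Server integer type.

```python
# Decimal digits needed to hold the largest value of each integer type,
# e.g. int tops out at 2,147,483,647 (10 digits).
INT_DIGITS = {"bit": 1, "tinyint": 3, "smallint": 5, "int": 10, "bigint": 19}


def integer_digits(precision: int, scale: int) -> int:
    """Digits available to the left of the decimal point in numeric(p, s)."""
    return precision - scale


def int_to_numeric_fits(int_type: str, dst_p: int, dst_s: int) -> bool:
    """True if every value of the integer type fits in numeric(dst_p, dst_s)."""
    return integer_digits(dst_p, dst_s) >= INT_DIGITS[int_type]


def numeric_widening_fits(src_p: int, src_s: int, dst_p: int, dst_s: int) -> bool:
    """True if numeric(src_p, src_s) values fit in numeric(dst_p, dst_s)
    without losing integer digits or fractional digits."""
    return dst_s >= src_s and integer_digits(dst_p, dst_s) >= integer_digits(src_p, src_s)
```

Applied to the two examples above: int to numeric(10,5) leaves only 5 integer digits where 10 are needed, and numeric(10,2) to numeric(12,5) shrinks the integer part from 8 digits to 7, so both are rejected. numeric(10,2) to numeric(13,5) keeps 8 integer digits and passes.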

@Benjamin-Knight Benjamin-Knight left a comment

Collaborator

I think there is an issue with how integer expansion handles precision and scale. Otherwise, my only open question is the one shadowed function, whose purpose I don't understand.

Comment thread dbt/adapters/sqlserver/sqlserver_adapter.py
…l fix

- Add dbt_sqlserver_enable_safe_type_expansion flag for safe column type widening
  (varchar->nvarchar, integer family promotions, numeric precision/scale upgrades)
- Add column_type_expansion_max_rows config (default 1,000,000 rows)
- Add prefer_single_alter_column config for single ALTER COLUMN statement
- Add string_type_instance() to preserve NVARCHAR/NCHAR type family
- Fix catalog generation (user_type_id) so NVARCHAR/NCHAR no longer appear as SYSNAME
- Fix is_numeric() to exclude money/smallmoney (now is_fixed_numeric())
- Fix seed table ingestion of empty numeric cells
- Add tinyint/bit to is_integer() type list

Adds handling for SQL Server MAX string types during `expand_target_column_types()`. MAX columns (varchar(max)/nvarchar(max)) are now correctly emitted when expanding from bounded strings, and are protected from being inadvertently shrunk to bounded sizes. Also removes the unused `alter_column_type` adapter method.

Introduce `is_decimal_type()` to distinguish true arbitrary-precision decimal types from fixed-scale money types. Update type expansion logic to use the new method for precision/scale calculations while keeping `is_numeric()` inclusive of money types for backward compatibility. Also clean up redundant test mocks and fix type hints for the column expansion method.

@axellpadilla axellpadilla force-pushed the fix/missing-type-handling branch from 5874717 to 4871a6b Compare June 19, 2026 07:21
@axellpadilla axellpadilla enabled auto-merge June 19, 2026 07:27
@axellpadilla

Collaborator Author

@Benjamin-Knight I’ve addressed the review comments and also added the max expansion and tests. Your approval would allow us to auto merge this.

@joshmarkovic

joshmarkovic commented Jun 22, 2026

Contributor

Heads-up from #716: I dropped my is_integer() cleanup so this PR can own it. Two small things on the is_integer() change here:

  1. It adds tinyint/bit but keeps the Postgres aliases smallserial/serial/bigserial/int2/int4/int8/serial2/serial4/serial8. SQL Server never returns those, so they're dead. Could be reduced to the actual SQL Server set (tinyint, smallint, int, integer, bigint, plus bit per your expansion design).
  2. Adding bit makes is_number() and the integer-promotion family treat bit as an integer. Looks intentional given the bit / tinyint / smallint expansion chain, just flagging it's a behavior change since bit is SQL Server's boolean type.

@Benjamin-Knight

Collaborator

@axellpadilla There are 2 open comments, otherwise this looks good to me.

@axellpadilla

Collaborator Author

@joshmarkovic fixed 1, and yes, 2 is intentional: per the official docs, bit is an integer type. Thanks.

@axellpadilla axellpadilla merged commit 352d0a9 into master Jun 25, 2026
16 checks passed
@axellpadilla axellpadilla deleted the fix/missing-type-handling branch June 25, 2026 02:21