Summary
The reader's GDAL_NODATA parsing in _geotags.extract_geo_info (and the related helpers _reader._resolve_masked_fill / _reader._sparse_fill_value) parses the nodata string via float() unconditionally. For 64-bit integer rasters whose declared sentinel is uint64 max (2**64 - 1) or int64 max (2**63 - 1), the float64 round-trip rounds the sentinel to a value that exceeds the dtype's representable range. The downstream integer-mask gate info.min <= int(nodata) <= info.max then rejects the cast and the sentinel pixel survives as a literal valid integer rather than being masked to NaN.
Repro:
import os
import tempfile

import numpy as np
import xarray as xr
import xrspatial.geotiff as g

with tempfile.TemporaryDirectory() as d:
    arr = np.full((16, 16), 100, dtype=np.uint64)
    arr[0, 0] = 2**64 - 1  # uint64 max as sentinel
    da = xr.DataArray(arr, dims=['y', 'x'])
    p = os.path.join(d, 't.tif')
    g.to_geotiff(da, p, nodata=2**64 - 1)
    out = g.open_geotiff(p)
    print(out.dtype, out.values[0, 0])  # uint64 18446744073709551615
    # expected: float64 NaN, after sentinel-to-NaN promotion
The same failure occurs for np.int64 with nodata = 2**63 - 1. Neither value is exactly representable in float64: each rounds up to the next power of two (2**64 and 2**63 respectively), which is one past the dtype's maximum and therefore fails the dtype-range check.
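The rounding can be checked without any GeoTIFF involved. The snippet below is a minimal sketch that only mimics the float-first parse and the range gate described in this report; `float_round_trip` is a hypothetical name used for illustration, not a function in xrspatial.

```python
import numpy as np


def float_round_trip(dtype):
    """Parse the dtype's max via float() and report whether it passes the range gate."""
    info = np.iinfo(dtype)
    text = str(info.max)        # the string a GDAL_NODATA tag would carry
    parsed = int(float(text))   # float-first parse, as described in this report
    return parsed, info.min <= parsed <= info.max


for dtype in (np.uint64, np.int64):
    parsed, in_range = float_round_trip(dtype)
    # Offset from the dtype max, and whether the range gate would accept it.
    print(dtype.__name__, parsed - np.iinfo(dtype).max, in_range)

# The float-parsed uint64 sentinel also stringifies in scientific notation.
print(repr(float(2**64 - 1)))
```

Both dtypes come back one above their maximum, so the gate rejects them, and the last line prints the same scientific-notation string that ends up in the VRT XML.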
Affected sites:
xrspatial/geotiff/_geotags.py:619 (extract_geo_info — main GDAL_NODATA parse)
xrspatial/geotiff/_reader.py:1166 (_resolve_masked_fill — LERC fill)
xrspatial/geotiff/_reader.py:1299 (_sparse_fill_value — SPARSE_OK fill)
This is the same class of bug _vrt._parse_band_nodata (PR #1833) fixed for the VRT XML parse path: try int() first, fall back to float() only for NaN/Inf/scientific/fractional. The VRT-side fix did not extend to the TIFF source-of-truth parser, so:
- Direct open_geotiff(uint64.tif) with nodata=uint64 max → sentinel lost.
- write_vrt([uint64.tif]) stringifies the float-parsed sentinel into XML as 1.8446744073709552e+19, which the VRT reader subsequently rejects as out-of-range.
Categories: Cat 2 (NaN propagation: sentinel pixel survives unmasked) + Cat 5 (backend inconsistency: VRT XML path handles 64-bit sentinels via _parse_band_nodata but the TIFF path does not).
Fix sketch
Lift the int-first parse from _vrt._parse_band_nodata into a shared helper and reuse it across the three TIFF-side sites. The helper takes the raw string and the target dtype; it tries int(text), falls back to float(text) for NaN/Inf/fractional/scientific, and returns the parsed value at full Python int precision when possible.
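A rough sketch of what such a shared helper might look like follows. The name `parse_nodata` and its signature are hypothetical, and this is not the code in _vrt._parse_band_nodata. How the target dtype is used is also an assumption here: for integer dtypes, a float fallback that turns out to be finite and integral (for example "1e3") is converted back to a Python int.

```python
import math

import numpy as np


def parse_nodata(text, dtype):
    """Hypothetical int-first nodata parser (illustrative sketch only)."""
    text = text.strip()
    try:
        # Exact for arbitrarily large integers, including 2**64 - 1.
        return int(text)
    except ValueError:
        pass
    # Fallback for NaN/Inf/fractional/scientific; raises ValueError on garbage.
    value = float(text)
    # Assumption: integral floats are normalised to int for integer rasters.
    if (np.issubdtype(np.dtype(dtype), np.integer)
            and math.isfinite(value) and value.is_integer()):
        return int(value)
    return value


sentinel = parse_nodata("18446744073709551615", np.uint64)
info = np.iinfo(np.uint64)
print(sentinel, info.min <= sentinel <= info.max)
```

With the int-first path the uint64 sentinel stays at full precision and passes the existing range gate, while strings such as "nan" or "-9999.5" still take the float path.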
Test plan