geotiff: honour window and max_pixels for stripped HTTP reads (#1842) #1851

Merged: brendancol merged 2 commits into main from issue-1842 on May 14, 2026

@brendancol
Contributor

Summary

Fixes #1842.

  • _fetch_decode_cog_http_tiles had a branch for not ifd.is_tiled that called source.read_all() and sliced the result. That dropped two safety contracts the tiled branch gained in #1664 (Apply tile byte cap to local files + SSRF hardening on HTTP geotiff reads) and #1823 (VRT source reads with small max_pixels reject normal-tile sources): (1) the comment a few lines below promises that windowed HTTP reads "only allocate the window", but stripped COGs were downloading the whole raster anyway; (2) the caller's max_pixels was silently swapped for MAX_PIXELS_DEFAULT (1B), so a 50x50 window read with max_pixels=2500 against a huge stripped file decoded the full raster without ever checking it against the caller's cap.
  • Move the work into a new _fetch_decode_cog_http_strips helper. For windowed reads it computes which strips intersect [r0:r1], fetches only those byte ranges via read_ranges_coalesced (same coalescing logic as the tiled path), decodes each via _decode_strip_or_tile, and places each strip's intersection with the window into the output. The per-strip byte cap (#1664), chunky vs planar layouts, and sparse strips all follow _read_strips. For full-image reads the path stays on read_all + _read_strips, but max_pixels is now threaded through.
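
The strip selection described above can be sketched as follows. This is a minimal illustration, not the code in _reader.py: the function name strips_for_window and its return shape are hypothetical, and it assumes a half-open row window [r0, r1) and a fixed RowsPerStrip with a possibly shorter last strip.

```python
def strips_for_window(height, rows_per_strip, r0, r1):
    """Return (strip_idx, src_rows, dst_rows) for strips overlapping rows [r0, r1).

    src_rows indexes rows inside the decoded strip; dst_rows indexes rows
    inside the window-sized output. Hypothetical helper for illustration.
    """
    if not (0 <= r0 < r1 <= height):
        raise ValueError("window rows must satisfy 0 <= r0 < r1 <= height")
    first = r0 // rows_per_strip
    last = (r1 - 1) // rows_per_strip
    out = []
    for idx in range(first, last + 1):
        strip_top = idx * rows_per_strip
        # The last strip may hold fewer than rows_per_strip rows.
        strip_bottom = min(strip_top + rows_per_strip, height)
        top = max(r0, strip_top)
        bottom = min(r1, strip_bottom)
        out.append((
            idx,
            slice(top - strip_top, bottom - strip_top),
            slice(top - r0, bottom - r0),
        ))
    return out


# A 10-row window straddling the boundary between strips 0 and 1.
plan = strips_for_window(height=256, rows_per_strip=64, r0=60, r1=70)
```

Only the strips in the returned plan need a range request; every other strip's bytes are never fetched.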

Test plan

  • new regression tests in xrspatial/geotiff/tests/test_http_stripped_window_max_pixels_issue_A_1842.py:
    • byte-range coverage: a window covering one strip issues exactly one strip GET at that strip's offset, and read_all is never called.
    • windowed max_pixels honoured: a 50x50 window with max_pixels=2500 succeeds against a 65,536-pixel source; the same window with max_pixels=2499 raises PixelSafetyLimitError.
    • full-image read still capped: window=None with max_pixels=100 against a 65,536-pixel source raises PixelSafetyLimitError.
    • parametrised round-trip parity across six window shapes (strip-boundary spans, tiny straddling windows, last strip, full image).
  • full xrspatial/geotiff/tests/ suite passes apart from 11 pre-existing failures on main (matplotlib + Python 3.14 deepcopy recursion, GPU tests without a GPU, an unrelated tile-size validation test); none of them touch the HTTP path.
  • HTTP-adjacent suites (test_http_*, test_cog_http_*, test_reader.py, test_local_tile_byte_cap_1664.py, test_ssrf_hardening_1664.py) all pass: 106/106.
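
The max_pixels cases in the test plan boil down to a budget check on the materialised region. A rough sketch, assuming a 256x256 source (65,536 pixels) and a (r0, c0, r1, c1) window tuple; the check_pixel_budget helper and the local PixelSafetyLimitError class are stand-ins written for this example, not the xrspatial definitions.

```python
class PixelSafetyLimitError(ValueError):
    """Stand-in for the library's exception, defined here for illustration."""


def check_pixel_budget(height, width, max_pixels, window=None):
    """Return the pixel count that would be materialised, or raise if over budget.

    Assumes window is (r0, c0, r1, c1) with half-open bounds.
    """
    if window is not None:
        r0, c0, r1, c1 = window
        rows, cols = r1 - r0, c1 - c0
    else:
        rows, cols = height, width
    n_pixels = rows * cols
    if n_pixels > max_pixels:
        raise PixelSafetyLimitError(
            f"{n_pixels:,} pixels requested, cap is {max_pixels:,}"
        )
    return n_pixels


# 50x50 window: exactly 2,500 pixels, so a cap of 2500 passes.
ok = check_pixel_budget(256, 256, max_pixels=2500, window=(0, 0, 50, 50))
```

With max_pixels=2499 the same window raises, and with window=None the whole 65,536-pixel image is counted against the cap.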

The stripped branch of _fetch_decode_cog_http_tiles called source.read_all() and sliced the decoded array, so a small windowed read of a stripped multi-billion-pixel COG still downloaded the whole raster and the caller's max_pixels was dropped in favour of MAX_PIXELS_DEFAULT (1B). That broke the safety contract the tiled branch got from #1664 and #1823.

Move the work into a new _fetch_decode_cog_http_strips helper that builds the list of strip byte ranges intersecting the requested window, coalesces them via read_ranges_coalesced (same as the tiled path), and decodes each strip into a windowed output. max_pixels now bounds the materialised pixel count: the window for windowed reads, the full image otherwise.

When window is None the path stays on read_all + _read_strips, but max_pixels is threaded through so the full-image dim check uses the caller's cap.
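
For readers unfamiliar with range coalescing, the idea can be sketched as below. This is not the implementation of read_ranges_coalesced; the coalesce_ranges name, the max_gap parameter, and the treatment of zero-length entries as sparse strips are assumptions made for the example.

```python
def coalesce_ranges(ranges, max_gap=0):
    """Merge (offset, length) byte ranges whose gaps are at most max_gap bytes.

    Hypothetical sketch: fewer, larger ranges mean fewer HTTP Range requests.
    Zero-length entries (assumed to mark sparse strips) are skipped.
    """
    merged = []
    for offset, length in sorted(ranges):
        if length <= 0:
            continue
        end = offset + length
        if merged and offset - merged[-1][1] <= max_gap:
            # Close enough to the previous range: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([offset, end])
    return [(start, end - start) for start, end in merged]


# Two adjacent strips collapse into one request; the distant one stays separate.
requests = coalesce_ranges([(100, 50), (150, 50), (400, 10)])
```

A larger max_gap trades some wasted bytes for fewer round trips, which is usually the right trade over HTTP.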
github-actions bot added the performance label (PR touches performance-sensitive code) on May 14, 2026
brendancol requested a review from Copilot on May 14, 2026 17:00
Contributor

Copilot AI left a comment

Pull request overview

This PR fixes stripped GeoTIFF HTTP reads so windowed requests avoid full-file reads and apply the caller’s max_pixels budget consistently with the tiled HTTP path.

Changes:

  • Adds _fetch_decode_cog_http_strips for HTTP range-based stripped TIFF reads.
  • Threads max_pixels into stripped full-image HTTP reads.
  • Adds regression tests for byte-range behavior, window/full-image pixel caps, and window parity.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| xrspatial/geotiff/_reader.py | Adds the stripped HTTP fetch/decode helper and routes non-tiled HTTP reads through it. |
| xrspatial/geotiff/tests/test_http_stripped_window_max_pixels_issue_A_1842.py | Adds regression coverage for stripped HTTP windowing and pixel safety behavior. |

Comment thread xrspatial/geotiff/_reader.py Outdated
Comment on lines +2032 to +2046
# Per-strip compressed-byte cap (#1664). Mirrors the local strip path
# and the tiled HTTP path: a crafted ``StripByteCounts`` entry can
# request an unbounded HTTP Range GET or decompress a few KiB into
# gigabytes. Apply the cap before any fetch list is built.
max_tile_bytes = _max_tile_bytes_from_env()
for _strip_idx, _bc in enumerate(byte_counts):
    if _bc > max_tile_bytes:
        raise ValueError(
            f"TIFF strip {_strip_idx} declares "
            f"StripByteCount={_bc:,} bytes, which exceeds the "
            f"per-strip safety cap of {max_tile_bytes:,} bytes. "
            f"The file is malformed or attempting denial-of-service. "
            f"Override via XRSPATIAL_COG_MAX_TILE_BYTES if this file "
            f"is legitimate."
        )
strip_rows = min(rps, height - strip_row)
if strip_rows <= 0:
    continue

…code-size guard

Address two copilot reviews on #1851. The per-strip byte cap previously
ran over every strip before the windowed path picked which to fetch, so
a small window could be rejected because of an unrelated oversized
strip; move the check into the fetch-range loop so it mirrors the
tiled HTTP path. Separately, the windowed path passed
width * strip_rows * strip_samples to the decoder without an absolute
dimension cap, so a small window touching a very large strip could
force an outsized decompression allocation. Add a per-strip
MAX_PIXELS_DEFAULT guard before _decode_strip_or_tile, matching the
per-tile guard in the tiled HTTP path.
@brendancol
Contributor Author

Implemented both: per-strip byte cap now lives inside the windowed fetch-range loop (mirrors the tiled path), and decoded strip size goes through a MAX_PIXELS_DEFAULT guard before _decode_strip_or_tile. Regression tests added for both cases.
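
The two follow-up guards can be pictured roughly like this. The sketch is illustrative only: plan_strip_fetches is a hypothetical name, the 256 MiB byte cap is an assumed placeholder rather than the value returned by _max_tile_bytes_from_env(), and only the 1B figure for MAX_PIXELS_DEFAULT comes from the PR description.

```python
ASSUMED_MAX_TILE_BYTES = 256 * 1024 * 1024  # placeholder, not the library default
MAX_PIXELS_DEFAULT = 1_000_000_000          # 1B, as stated in the PR summary


def plan_strip_fetches(offsets, byte_counts, strip_indices, width,
                       rows_per_strip, height, samples=1,
                       max_tile_bytes=ASSUMED_MAX_TILE_BYTES,
                       max_strip_pixels=MAX_PIXELS_DEFAULT):
    """Build (offset, length) fetches for the strips a window touches.

    Both guards run only for strips in strip_indices, so an oversized strip
    outside the window cannot reject the read.
    """
    plan = []
    for idx in strip_indices:
        byte_count = byte_counts[idx]
        if byte_count == 0:
            continue  # assumed sparse strip: nothing to fetch
        if byte_count > max_tile_bytes:
            raise ValueError(
                f"strip {idx} declares {byte_count:,} bytes, over the cap"
            )
        strip_rows = min(rows_per_strip, height - idx * rows_per_strip)
        if width * strip_rows * samples > max_strip_pixels:
            raise ValueError(f"strip {idx} would decode past the pixel cap")
        plan.append((offsets[idx], byte_count))
    return plan


# Strip 1 is absurdly large, but a window touching only strip 0 still succeeds.
fetches = plan_strip_fetches([0, 100, 200], [10, 10**12, 10], [0],
                             width=256, rows_per_strip=64, height=192)
```

Checking inside the loop keeps the behaviour aligned with what the tiled path is described as doing: only data that will actually be fetched and decoded is validated.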

@brendancol brendancol merged commit 797e3e2 into main May 14, 2026
10 of 11 checks passed
Labels

performance PR touches performance-sensitive code

Development

Successfully merging this pull request may close these issues.

Stripped HTTP TIFF reads ignore window and caller max_pixels

2 participants