geotiff: honour window and max_pixels for stripped HTTP reads (#1842)#1851
Merged
Conversation
The stripped branch of _fetch_decode_cog_http_tiles called source.read_all() and sliced the decoded array, so a small windowed read of a stripped multi-billion-pixel COG still downloaded the whole raster and the caller's max_pixels was dropped in favour of MAX_PIXELS_DEFAULT (1B). That broke the safety contract the tiled branch got from #1664 and #1823. Move the work into a new _fetch_decode_cog_http_strips helper that builds the list of strip byte ranges intersecting the requested window, coalesces them via read_ranges_coalesced (same as the tiled path), and decodes each strip into a windowed output. max_pixels now bounds the materialised pixel count: the window for windowed reads, the full image otherwise. When window is None the path stays on read_all + _read_strips, but max_pixels is threaded through so the full-image dim check uses the caller's cap.
Contributor
There was a problem hiding this comment.
Pull request overview
This PR fixes stripped GeoTIFF HTTP reads so windowed requests avoid full-file reads and apply the caller’s max_pixels budget consistently with the tiled HTTP path.
Changes:
- Adds
_fetch_decode_cog_http_stripsfor HTTP range-based stripped TIFF reads. - Threads
max_pixelsinto stripped full-image HTTP reads. - Adds regression tests for byte-range behavior, window/full-image pixel caps, and window parity.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
xrspatial/geotiff/_reader.py |
Adds the stripped HTTP fetch/decode helper and routes non-tiled HTTP reads through it. |
xrspatial/geotiff/tests/test_http_stripped_window_max_pixels_issue_A_1842.py |
Adds regression coverage for stripped HTTP windowing and pixel safety behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+2032
to
+2046
| # Per-strip compressed-byte cap (#1664). Mirrors the local strip path | ||
| # and the tiled HTTP path: a crafted ``StripByteCounts`` entry can | ||
| # request an unbounded HTTP Range GET or decompress a few KiB into | ||
| # gigabytes. Apply the cap before any fetch list is built. | ||
| max_tile_bytes = _max_tile_bytes_from_env() | ||
| for _strip_idx, _bc in enumerate(byte_counts): | ||
| if _bc > max_tile_bytes: | ||
| raise ValueError( | ||
| f"TIFF strip {_strip_idx} declares " | ||
| f"StripByteCount={_bc:,} bytes, which exceeds the " | ||
| f"per-strip safety cap of {max_tile_bytes:,} bytes. " | ||
| f"The file is malformed or attempting denial-of-service. " | ||
| f"Override via XRSPATIAL_COG_MAX_TILE_BYTES if this file " | ||
| f"is legitimate." | ||
| ) |
| strip_rows = min(rps, height - strip_row) | ||
| if strip_rows <= 0: | ||
| continue | ||
|
|
…code-size guard Address two copilot reviews on #1851. The per-strip byte cap previously ran over every strip before the windowed path picked which to fetch, so a small window could be rejected because of an unrelated oversized strip; move the check into the fetch-range loop so it mirrors the tiled HTTP path. Separately, the windowed path passed width * strip_rows * strip_samples to the decoder without an absolute dimension cap, so a small window touching a very large strip could force an outsized decompression allocation. Add a per-strip MAX_PIXELS_DEFAULT guard before _decode_strip_or_tile, matching the per-tile guard in the tiled HTTP path.
Contributor
Author
|
Implemented both: per-strip byte cap now lives inside the windowed fetch-range loop (mirrors the tiled path), and decoded strip size goes through a MAX_PIXELS_DEFAULT guard before _decode_strip_or_tile. Regression tests added for both cases. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #1842.
_fetch_decode_cog_http_tileshad anot ifd.is_tiledbranch that calledsource.read_all()and sliced the result. That dropped two safety contracts the tiled branch had from Apply tile byte cap to local files + SSRF hardening on HTTP geotiff reads #1664 and VRT source reads with small max_pixels reject normal-tile sources #1823: (1) the comment a few lines below promises windowed HTTP reads "only allocate the window", but stripped COGs were downloading the whole raster anyway; (2) the caller'smax_pixelswas silently swapped forMAX_PIXELS_DEFAULT(1B), so a 50x50 window read withmax_pixels=2500against a huge stripped file went through unchecked._fetch_decode_cog_http_stripshelper. For windowed reads it computes which strips intersect[r0:r1], fetches only those byte ranges viaread_ranges_coalesced(same coalescing logic as the tiled path), decodes each via_decode_strip_or_tile, and places each strip's intersection with the window into the output. Per-strip byte cap (Apply tile byte cap to local files + SSRF hardening on HTTP geotiff reads #1664), chunky vs planar layouts, and sparse strips all follow_read_strips. For full-image reads the path stays onread_all+_read_strips, butmax_pixelsis now threaded through.Test plan
xrspatial/geotiff/tests/test_http_stripped_window_max_pixels_issue_A_1842.py:read_allis never called.max_pixelshonoured: a 50x50 window withmax_pixels=2500succeeds against a 65,536-pixel source; the same window withmax_pixels=2499raisesPixelSafetyLimitError.window=Nonewithmax_pixels=100against a 65,536-pixel source raisesPixelSafetyLimitError.xrspatial/geotiff/tests/suite passes apart from 11 pre-existing failures on main (matplotlib + Python 3.14 deepcopy recursion, GPU tests without a GPU, an unrelated tile-size validation test); none of them touch the HTTP path.test_http_*,test_cog_http_*,test_reader.py,test_local_tile_byte_cap_1664.py,test_ssrf_hardening_1664.py) all pass: 106/106.