
[https://nvbugs/5911304][fix] Add URL validation for media input loading#13680

Draft
yibinl-nvidia wants to merge 2 commits into NVIDIA:main from yibinl-nvidia:fix/inputs-url-validation-5911304-minimal

Conversation

@yibinl-nvidia
Collaborator

@yibinl-nvidia yibinl-nvidia commented Apr 30, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Added URL validation and security protections for remote media loading, including validation of HTTP(S) schemes and resolvable hostnames
    • Implemented controlled redirect handling with enforced maximum response size limits for image, video, and audio retrieval
  • Tests

    • Added comprehensive tests validating URL validation and secure media fetching

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

@yibinl-nvidia yibinl-nvidia self-assigned this Apr 30, 2026
@yibinl-nvidia yibinl-nvidia force-pushed the fix/inputs-url-validation-5911304-minimal branch from 2d95873 to 3001dbf on April 30, 2026 at 23:03
@yibinl-nvidia
Collaborator Author

/bot run

yibinl-nvidia and others added 2 commits May 1, 2026 01:28
…l fetches

User-supplied image_url/video_url/audio_url values were fetched via aiohttp
without scheme/IP validation, response size limits, or redirect-chain
validation. This allowed SSRF to private/loopback/cloud-metadata addresses,
DoS via oversized responses, and SSRF via redirect.

Add _validate_public_url() and _safe_aiohttp_get() helpers that:
- reject non-http(s) schemes and URLs resolving to non-global addresses
- validate every redirect hop before following
- cap response size at 200 MB and redirect chain at 5 hops

Apply to async_load_image, async_load_video, and async_load_audio (the
serving API surface). Sync paths and the file/data branches are unchanged.

Addresses NVBugs 5911304.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Yibin Li <109242046+yibinl-nvidia@users.noreply.github.com>
Signed-off-by: Yibin Li <109242046+yibinl-nvidia@users.noreply.github.com>
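The two helpers this commit message describes can be sketched roughly as below. This is an illustrative sketch only, assuming standard-library resolution via socket.getaddrinfo and the ipaddress is_global check; the actual helpers in tensorrt_llm/inputs/utils.py may differ in names and details.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def validate_public_url(url: str) -> None:
    """Reject URLs that are not http(s) or that resolve to non-global addresses.

    Sketch of the validation described in the commit message, not the PR's
    actual code.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"Unsupported URL scheme: {parsed.scheme!r}")
    if not parsed.hostname:
        raise ValueError("URL has no hostname")
    try:
        infos = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror as exc:
        raise ValueError(f"Cannot resolve hostname {parsed.hostname!r}") from exc
    for info in infos:
        ip = ipaddress.ip_address(info[4][0])
        # is_global is False for loopback, private, and link-local ranges,
        # which covers the 169.254.169.254 cloud-metadata address.
        if not ip.is_global:
            raise ValueError(f"URL resolves to non-global address {ip}")
```

A fetch loop would then call this validator on the original URL and again on every redirect target before following it, as the commit message outlines.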
@yibinl-nvidia yibinl-nvidia force-pushed the fix/inputs-url-validation-5911304-minimal branch from 3001dbf to 1b36856 on May 1, 2026 at 01:28
@yibinl-nvidia
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #46459 [ run ] triggered by Bot. Commit: 1b36856 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #46459 [ run ] completed with state SUCCESS. Commit: 1b36856
/LLM/main/L0_MergeRequest_PR pipeline #36528 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@yibinl-nvidia yibinl-nvidia marked this pull request as ready for review May 1, 2026 15:32
@yibinl-nvidia yibinl-nvidia requested a review from a team as a code owner May 1, 2026 15:32
@yibinl-nvidia yibinl-nvidia requested review from moraxu and tijyojwad May 1, 2026 15:32
@yibinl-nvidia yibinl-nvidia changed the title [https://nvbugs/1234567][fix] Validate URLs and bound response size in async multimoda… [https://nvbugs/5911304][fix] Add URL validation for media input loading May 1, 2026
@coderabbitai
Contributor

coderabbitai Bot commented May 1, 2026

📝 Walkthrough

Walkthrough

Introduces SSRF and DoS protections for multimodal HTTP(S) fetching by implementing URL validation that rejects non-HTTP(S) schemes, requires resolvable hostnames, and blocks non-global IPs. Replaces direct downloads with a safe HTTP client that validates URLs at each hop, disables redirects by default, manually follows a limited set of redirect status codes, enforces maximum response size, and caps redirect loops. Adds comprehensive test coverage and CI integration.

Changes

  • SSRF/DoS Protection Implementation (tensorrt_llm/inputs/utils.py): Adds URL validation with scheme, hostname, and IP address checks; introduces _safe_aiohttp_get for secure HTTP fetching with manual redirect handling (max 5 redirects), response size enforcement (max 100MB), and validation at each hop. Updates imports for networking utilities.
  • Test Coverage (tests/unittest/inputs/test_url_validation.py): New comprehensive pytest module testing URL validation (scheme, hostname, IP resolution) and safe HTTP fetching (pre-request validation, redirect handling, response streaming, size limits, and redirect loop protection) using mocked aiohttp objects.
  • CI Configuration (tests/integration/test_lists/test-db/l0_a10.yml): Extends the PyTorch pre-merge CI test selection to include the new unittest/inputs/test_url_validation.py test module.

Sequence Diagram

sequenceDiagram
    actor Client
    participant Validator as URL Validator
    participant Resolver as DNS/IP Resolver
    participant HTTPClient as HTTP Client
    participant Response as Response Handler

    Client->>Validator: Request URL (validate_public_url)
    Validator->>Validator: Check scheme (http/https only)
    Validator->>Validator: Extract hostname
    alt Invalid scheme or missing hostname
        Validator-->>Client: Raise error
    end
    
    Validator->>Resolver: Resolve hostname to IP
    alt Cannot resolve
        Resolver-->>Validator: Raise error
        Validator-->>Client: Raise error
    end
    
    Validator->>Validator: Check IP is public/global
    alt Private, loopback, or IMDS IP
        Validator-->>Client: Raise error
    end
    
    Validator-->>Client: URL valid ✓
    
    Client->>HTTPClient: _safe_aiohttp_get (redirects disabled)
    HTTPClient->>HTTPClient: Validate resolved URL
    alt URL invalid
        HTTPClient-->>Client: Raise error
    end
    
    HTTPClient->>Response: Fetch with no redirects
    alt Redirect response (301-308)
        Response->>Validator: Validate redirect target
        alt Target invalid
            Validator-->>Client: Raise error
        end
        alt Redirect count exceeds max
            Response-->>Client: Raise "Too many redirects"
        end
        Response->>HTTPClient: Follow redirect (manual)
    end
    
    Response->>Response: Stream response body
    alt Response exceeds max size
        Response-->>Client: Raise error
    end
    
    Response-->>Client: Return response bytes

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

  • Description check ⚠️ Warning: The PR description is largely a blank template with no substantive content filled in; the Description, Test Coverage, and detailed checklist items are missing or empty. Resolution: fill in the Description section explaining the issue and solution, populate Test Coverage with specific tests (e.g., test_url_validation.py), and provide clear checklist details confirming compliance with coding guidelines and dependencies.
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 42.86%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3 passed)

  • Title check ✅ Passed: The PR title clearly and concisely summarizes the main change (adding URL validation for media input loading) and follows the required format [NVBugs ID][type] Summary.
  • Linked Issues check ✅ Passed: Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check ✅ Passed: Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (2)
tests/unittest/inputs/test_url_validation.py (1)

1-15: ⚡ Quick win

Add the required NVIDIA file header.

This new test module is missing the standard NVIDIA copyright/license header. As per coding guidelines, "All source files (.cpp, .h, .cu, .py) should contain an NVIDIA copyright header with the year of latest modification and Apache 2.0 license notice."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unittest/inputs/test_url_validation.py` around lines 1 - 15, Add the
standard NVIDIA copyright/license header to the top of this new test module
(tests/unittest/inputs/test_url_validation.py); include the year of latest
modification and the full Apache 2.0 notice required for all source files,
placing it above the module docstring so the file begins with the NVIDIA header
followed by the existing docstring and imports; no code changes to
_MAX_RESPONSE_BYTES, _safe_aiohttp_get, or _validate_public_url are needed—only
add the required header text.
tensorrt_llm/inputs/utils.py (1)

1-13: ⚡ Quick win

Add the required NVIDIA file header.

This modified Python file still lacks the repository-required copyright/license header at the top. As per coding guidelines, "All source files (.cpp, .h, .cu, .py) should contain an NVIDIA copyright header with the year of latest modification and Apache 2.0 license notice."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tensorrt_llm/inputs/utils.py` around lines 1 - 13, Add the required NVIDIA
copyright/license header at the very top of the Python source (above the first
import statement "import asyncio") so the file contains the repo-standard NVIDIA
header with the latest modification year and Apache-2.0 notice; insert the exact
header text used across the repo (copy from another source file) and ensure it
precedes the module-level imports and any module-level docstring or code in this
file (symbols to look for: the file's top-level imports such as "import
asyncio", "import base64", etc.).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 49c68b50-2fcd-4d82-a99c-098b8e0b275c

📥 Commits

Reviewing files that changed from the base of the PR and between 24bd7a2 and 1b36856.

📒 Files selected for processing (3)
  • tensorrt_llm/inputs/utils.py
  • tests/integration/test_lists/test-db/l0_a10.yml
  • tests/unittest/inputs/test_url_validation.py

Comment on lines +143 to +148
async def _safe_aiohttp_get(url: str, session: aiohttp.ClientSession) -> bytes:
    """Fetch *url*, validating each redirect hop and capping response size."""
    current = url
    for _ in range(_MAX_REDIRECTS + 1):
        await asyncio.to_thread(_validate_public_url, current)
        async with session.get(current, allow_redirects=False) as response:
Copy link
Copy Markdown
Contributor


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

cat -n tensorrt_llm/inputs/utils.py | head -200

Repository: NVIDIA/TensorRT-LLM

Length of output: 9041


🏁 Script executed:

python3 << 'EOF'
# Research aiohttp's DNS handling behavior
import aiohttp
import inspect

# Check what connector aiohttp uses by default
session = aiohttp.ClientSession()
print("Default connector:", session.connector)
print("Connector type:", type(session.connector))

# Check if there's DNS caching
try:
    print("\nConnector attributes:")
    print(dir(session.connector))
except Exception as e:
    print(f"Error: {e}")

EOF

Repository: NVIDIA/TensorRT-LLM

Length of output: 354


🏁 Script executed:

# Check if there are any DNS-related configurations or mitigations in the codebase
rg -i "dns|resolver|connector|resolve" tensorrt_llm/inputs/utils.py

Repository: NVIDIA/TensorRT-LLM

Length of output: 702


🏁 Script executed:

# Search for aiohttp usage patterns in the codebase
rg -A 5 -B 5 "aiohttp.ClientSession\|session.get\|session.post" tensorrt_llm/inputs/utils.py | head -100

Repository: NVIDIA/TensorRT-LLM

Length of output: 45


🏁 Script executed:

# Search for aiohttp documentation or comments about DNS handling
rg -i "resolver|connector" tensorrt_llm/ | grep -E "\.py:" | head -20

Repository: NVIDIA/TensorRT-LLM

Length of output: 2032


🏁 Script executed:

# Check if there are any custom connectors or resolver configurations
fd -e py -x grep -l "TCPConnector\|resolver\|Resolver" {} \; | head -10

Repository: NVIDIA/TensorRT-LLM

Length of output: 537


🏁 Script executed:

# Look at aiohttp's connector behavior - check what version is used
grep -r "aiohttp" tensorrt_llm/requirements.txt pyproject.toml setup.py 2>/dev/null | head -20

Repository: NVIDIA/TensorRT-LLM

Length of output: 45


🏁 Script executed:

# Check if there's a requirements file with aiohttp version
find . -name "requirements*.txt" -o -name "pyproject.toml" -o -name "setup.py" | head -5 | xargs grep -h "aiohttp" 2>/dev/null | head -10

Repository: NVIDIA/TensorRT-LLM

Length of output: 45


🏁 Script executed:

# Search for any DNS rebinding or SSRF-related security discussions in the repo
rg -i "dns.rebind|rebind|ssrf" tensorrt_llm/ --type py | head -20

Repository: NVIDIA/TensorRT-LLM

Length of output: 647


🏁 Script executed:

# Look at the full context around the _safe_aiohttp_get function to see if there are other mitigations
sed -n '140,165p' tensorrt_llm/inputs/utils.py

Repository: NVIDIA/TensorRT-LLM

Length of output: 1165


🏁 Script executed:

# Check if there's any DNS resolver configuration for the aiohttp session creation
rg -B 10 -A 10 "_get_aiohttp_session" tensorrt_llm/inputs/utils.py

Repository: NVIDIA/TensorRT-LLM

Length of output: 3308


🏁 Script executed:

# Research if aiohttp has built-in DNS caching that would prevent rebinding
python3 << 'EOF'
# Check aiohttp documentation for DNS behavior
# By default, aiohttp.TCPConnector uses asyncio's getaddrinfo with no explicit DNS caching
# This means each connection attempt will perform a new DNS lookup

# The vulnerability window:
# 1. Line 147: socket.getaddrinfo() validates the hostname
# 2. Between lines 147-148: DNS TTL expires or cache is cleared
# 3. Line 148: aiohttp.session.get() performs another getaddrinfo() which gets a different IP

# This is a classic DNS rebinding attack vector:
# Attacker controls dns.example.com
# Step 1 (validation): dns.example.com → 8.8.8.8 (public IP) ✓ passes validation
# Step 2 (connection): dns.example.com → 127.0.0.1 (private IP) ✗ bypasses validation

print("DNS Rebinding Vulnerability Analysis:")
print("=" * 50)
print("The vulnerability is technically valid because:")
print("1. socket.getaddrinfo() and aiohttp's resolver are separate")
print("2. No DNS result pinning between validation and connection")
print("3. Attacker can control DNS responses with low TTL")
print("4. TOCTOU (Time-of-check to time-of-use) vulnerability")
EOF

Repository: NVIDIA/TensorRT-LLM

Length of output: 425


🏁 Script executed:

# Check if there are any mitigations or discussions about this in comments or issues
rg -B 5 -A 5 "NVBugs 5911304" tensorrt_llm/inputs/utils.py

Repository: NVIDIA/TensorRT-LLM

Length of output: 344


🏁 Script executed:

# Verify the actual implementation to confirm the vulnerability is real
sed -n '122,162p' tensorrt_llm/inputs/utils.py

Repository: NVIDIA/TensorRT-LLM

Length of output: 1899


DNS rebinding allows bypassing the SSRF validation.

Line 147 validates the hostname resolves to a public IP, but line 148's session.get() performs a separate DNS resolution that could return a different address. An attacker controlling DNS can return a public address during validation and a private/loopback address during connect, defeating the SSRF protection. The validated IP must be pinned to the actual socket connection via a custom connector, or the request must use the resolved address directly instead of the hostname.
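One way to close this TOCTOU gap is to resolve the hostname exactly once, then connect to that validated address directly while sending the original hostname in the Host header. The helper below is a hypothetical sketch (the name pin_resolved_ip is invented, it is IPv4-only for brevity, and the returned IP would still need the is_global check before connecting); for HTTPS, a custom aiohttp resolver that returns the pre-validated address is preferable so certificate verification still sees the hostname.

```python
import socket
from urllib.parse import urlparse, urlunparse


def pin_resolved_ip(url: str) -> tuple[str, str]:
    """Resolve the URL's hostname once and pin the connection to that IP.

    Returns (pinned_url, original_host). The caller requests pinned_url and
    sets the Host header to original_host, so the HTTP client never performs
    a second DNS lookup. Hypothetical helper for illustration only.
    """
    parsed = urlparse(url)
    if parsed.hostname is None:
        raise ValueError("URL has no hostname")
    # Single resolution; the result is reused for the actual connection.
    ip = socket.getaddrinfo(parsed.hostname, None,
                            family=socket.AF_INET,
                            proto=socket.IPPROTO_TCP)[0][4][0]
    netloc = ip if parsed.port is None else f"{ip}:{parsed.port}"
    return urlunparse(parsed._replace(netloc=netloc)), parsed.hostname
```

A request would then look like session.get(pinned, headers={"Host": host}, allow_redirects=False); the alternative named in the review, an aiohttp.TCPConnector with a resolver that returns the pre-validated address, achieves the same pinning without rewriting the URL.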

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tensorrt_llm/inputs/utils.py` around lines 143 - 148, The SSRF validation in
_safe_aiohttp_get is vulnerable to DNS rebinding because
_validate_public_url(current) checks the hostname but session.get(current)
performs its own DNS resolution; fix by resolving the hostname once via
_validate_public_url (or a new resolver helper) to obtain the concrete IP/port
and then ensure the aiohttp request uses that same IP for the TCP connection
(e.g., create a custom aiohttp.TCPConnector/Resolver that pins the resolved IP
or call session.get() with the resolved IP and set the Host header to the
original hostname). Update _safe_aiohttp_get to pass the pinned IP into the
connector/request so DNS cannot be re-resolved, and keep redirect validation
logic (including calls to _validate_public_url and _MAX_REDIRECTS) intact.

Comment on lines +148 to +160
        async with session.get(current, allow_redirects=False) as response:
            if response.status in _REDIRECT_STATUSES:
                current = urljoin(current, response.headers.get("Location", ""))
                continue
            response.raise_for_status()
            buf = BytesIO()
            total = 0
            async for chunk in response.content.iter_chunked(1 << 20):
                total += len(chunk)
                if total > _MAX_RESPONSE_BYTES:
                    raise RuntimeError("Response exceeds maximum allowed size")
                buf.write(chunk)
            return buf.getvalue()
Contributor


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

fd -t f "utils.py" | grep inputs

Repository: NVIDIA/TensorRT-LLM

Length of output: 92


🏁 Script executed:

cat tensorrt_llm/inputs/utils.py

Repository: NVIDIA/TensorRT-LLM

Length of output: 45677


Add timeout parameters to prevent DoS from stalled or trickling remote servers.

The code caps response bytes but lacks connect/read timeouts on the aiohttp session. A hostile server can indefinitely stall the handshake or trickle chunks to keep the shared session occupied, making the DoS mitigation incomplete.

Suggested fix
+# Keep slow or stalled remote media fetches from hanging forever.
+_AIOHTTP_TIMEOUT = aiohttp.ClientTimeout(
+    total=30,
+    connect=10,
+    sock_connect=10,
+    sock_read=10,
+)
+
 async def _safe_aiohttp_get(url: str, session: aiohttp.ClientSession) -> bytes:
     """Fetch *url*, validating each redirect hop and capping response size."""
     current = url
     for _ in range(_MAX_REDIRECTS + 1):
         await asyncio.to_thread(_validate_public_url, current)
-        async with session.get(current, allow_redirects=False) as response:
+        async with session.get(
+                current,
+                allow_redirects=False,
+                timeout=_AIOHTTP_TIMEOUT) as response:
             if response.status in _REDIRECT_STATUSES:
                 current = urljoin(current, response.headers.get("Location", ""))
                 continue
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tensorrt_llm/inputs/utils.py` around lines 148 - 160, Add explicit connect
and read timeouts to the HTTP request to prevent stalled or trickling responses
from exhausting the session: when calling session.get(...) (the request around
the async with session.get(current, allow_redirects=False) block) pass an
aiohttp timeout (ClientTimeout) that sets reasonable sock_connect and sock_read
(and optionally total) values, and ensure those timeouts apply to redirects
handling; keep the existing response size guard around
response.content.iter_chunked and raise as before if exceeded.

Comment on lines +88 to +93
class _FakeSession:
    def __init__(self, responses):
        self._responses = list(responses)

    def get(self, url, **kwargs):
        return self._responses.pop(0)
Contributor


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, find the file
find . -name "test_url_validation.py" -type f

Repository: NVIDIA/TensorRT-LLM

Length of output: 110


🏁 Script executed:

# Read the file to understand the context
cat -n tests/unittest/inputs/test_url_validation.py | head -120

Repository: NVIDIA/TensorRT-LLM

Length of output: 4831


🏁 Script executed:

# Search for _safe_aiohttp_get to understand how it's being tested
rg -n "_safe_aiohttp_get" tests/unittest/inputs/test_url_validation.py -A 5 -B 2

Repository: NVIDIA/TensorRT-LLM

Length of output: 2184


🏁 Script executed:

# Find the utils file containing _safe_aiohttp_get
find . -name "utils.py" -path "*/inputs/*" -type f

Repository: NVIDIA/TensorRT-LLM

Length of output: 94


🏁 Script executed:

# Read the implementation
cat -n tensorrt_llm/inputs/utils.py | grep -A 30 "def _safe_aiohttp_get"

Repository: NVIDIA/TensorRT-LLM

Length of output: 1524


Add assertion to test double enforcing allow_redirects=False.

_FakeSession.get() currently ignores kwargs, so tests won't catch a regression if _safe_aiohttp_get() stops passing allow_redirects=False. This parameter is critical to the security contract: it ensures per-hop URL validation instead of allowing aiohttp to auto-follow redirects unsafely, defeating SSRF protections.

Tighten the test double
 class _FakeSession:
     def __init__(self, responses):
         self._responses = list(responses)
 
     def get(self, url, **kwargs):
+        assert kwargs.get("allow_redirects") is False
         return self._responses.pop(0)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

class _FakeSession:
    def __init__(self, responses):
        self._responses = list(responses)

    def get(self, url, **kwargs):
        return self._responses.pop(0)

class _FakeSession:
    def __init__(self, responses):
        self._responses = list(responses)

    def get(self, url, **kwargs):
        assert kwargs.get("allow_redirects") is False
        return self._responses.pop(0)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unittest/inputs/test_url_validation.py` around lines 88 - 93, The test
double _FakeSession.get currently ignores kwargs so regressions where
_safe_aiohttp_get stops passing allow_redirects=False won’t be caught; update
_FakeSession.get to assert that kwargs contains allow_redirects set to False (or
raise/record an error if missing/True) before returning the next response so the
unit test enforces the security contract and fails if _safe_aiohttp_get no
longer passes allow_redirects=False.

@yibinl-nvidia yibinl-nvidia marked this pull request as draft May 1, 2026 20:55