Conversation
```js
if (req.url === '/intercept-https') {
  const targetHost = req.headers['x-host'] || 'example.com';
  fetch(`https://${targetHost}`)
```
Check failure
Code scanning / CodeQL
Server-side request forgery (Critical)
Copilot Autofix
AI 2 days ago
In general, to fix SSRF here we must avoid using arbitrary user input as the hostname for outgoing requests. Instead, we should map user-controlled identifiers to a small, server-controlled allow-list of hostnames (or disallow internal IPs/hostnames explicitly). This ensures that even if the client supplies a malicious x-host, the actual hostname used in fetch is one of a limited set that we trust.
The least intrusive fix that preserves existing functionality is to treat x-host not as a raw hostname, but as a key selecting from a predefined set of allowed HTTPS hosts. For example, we can define a small mapping such as { default: 'example.com', docs: 'developer.mozilla.org' }. The header can specify which logical target to use (e.g., x-host: docs), and the code looks up the corresponding hostname. If the provided key is missing or invalid, we fall back to the default host. This removes direct control of the hostname from the client while still allowing selection among known test targets. Concretely, in /intercept-https we will:
- Introduce a constant `ALLOWED_HTTPS_HOSTS` near the top of the file.
- Replace `const targetHost = req.headers['x-host'] || 'example.com';` with logic that:
  - Reads the header as a key (e.g., `targetKey`).
  - Looks up `const targetHost = ALLOWED_HTTPS_HOSTS[targetKey] || ALLOWED_HTTPS_HOSTS.default;`.
- Leave the ``fetch(`https://${targetHost}`)`` call unchanged, now that `targetHost` is server-controlled.
- Keep the error handling and response behavior the same, except that the potentially attacker-controlled key is no longer directly used as the hostname.
This change is entirely within images/container-client-test/app.js, adds only a small constant, and does not require external libraries.
```diff
@@ -4,6 +4,13 @@
 const wsProxyTarget = process.env.WS_PROXY_TARGET || null;
 const wsProxySecure = process.env.WS_PROXY_SECURE === 'true';

+// Allow-list of HTTPS hosts that can be contacted via the /intercept-https endpoint.
+// The client supplies a logical key in the x-host header, not a raw hostname.
+const ALLOWED_HTTPS_HOSTS = {
+  default: 'example.com',
+  // Additional logical targets can be added here if needed, for example:
+  // docs: 'developer.mozilla.org',
+};
 const server = createServer(function (req, res) {
   if (req.url === '/ws') {
     return;
@@ -57,7 +64,10 @@
   }

   if (req.url === '/intercept-https') {
-    const targetHost = req.headers['x-host'] || 'example.com';
+    const targetKey = req.headers['x-host'];
+    const targetHost =
+      (typeof targetKey === 'string' && ALLOWED_HTTPS_HOSTS[targetKey]) ||
+      ALLOWED_HTTPS_HOSTS.default;
     fetch(`https://${targetHost}`)
       .then((result) => result.text())
       .then((body) => {
```
I'm Bonk, and I've done a quick review of your PR.
This PR adds interceptOutboundHttps — SNI-based TLS interception for container egress, with CA cert injection into user containers. Overall the design is sound, and the protocol between sidecar and workerd (200 for MITM, 202 for passthrough) is clean.
Issues found (highest severity first):
1. `--tls-intercept` unconditionally passed to sidecar — even when only HTTP mappings exist. May cause the sidecar to generate an unnecessary CA cert and set up MITM infrastructure when only `interceptOutboundHttp` was used.
2. `KJ_ON_SCOPE_FAILURE` in a coroutine won't fire on cancellation — if `injectCACert()`'s promise is dropped mid-`co_await`, the flag stays `true` forever, blocking retries. Should use `KJ_DEFER` instead.
3. `matchSniGlob` matches bare hostname against `*.example.com` — `example.com` is correctly rejected but `xexample.com` would incorrectly match since the suffix match only checks `.example.com` (the `x` fills the "hostname longer than suffix" check). The suffix-only check doesn't verify a dot boundary in the hostname.
4. Extra blank line in `start()` — trivial nit, but mentioned since it may indicate an accidental edit.
```cpp
if (!ipv6Enabled) {
  cmd.set(idx++, "--disable-ipv6");
}
cmd.set(idx++, "--tls-intercept");
```
Medium: --tls-intercept is unconditionally passed to the sidecar, even when the user only called interceptOutboundHttp() (no HTTPS mappings). This means the sidecar always generates a CA cert and sets up TLS MITM infrastructure, even if TLS interception was never requested.
Consider making this conditional:
```diff
-cmd.set(idx++, "--tls-intercept");
+if (!containerClient.egressHttpsMappings.empty()) {
+  cmd.set(idx++, "--tls-intercept");
+}
```
(This would also require making cmdSize dynamic based on whether HTTPS mappings exist, or using a kj::Vector approach instead of a fixed-size list.)
Alternatively, if the sidecar's --tls-intercept is cheap when no SNI mappings are actually matched, this could be intentional — but a comment explaining that would be helpful.
```cpp
  co_return;
}

KJ_ON_SCOPE_FAILURE(caCertInjected.store(false, std::memory_order_release));
```
Medium: KJ_ON_SCOPE_FAILURE does not fire on coroutine cancellation. If the kj::Promise<void> returned by injectCACert() is dropped while suspended at one of the co_await points below (lines 801, 804, 814, 826), the coroutine frame is destroyed without an active exception. The scope guard never fires, so caCertInjected stays true permanently, blocking all future injection attempts.
Consider using KJ_DEFER with an explicit success flag instead:
```diff
-KJ_ON_SCOPE_FAILURE(caCertInjected.store(false, std::memory_order_release));
+bool succeeded = false;
+KJ_DEFER(if (!succeeded) caCertInjected.store(false, std::memory_order_release));
```
Then set succeeded = true at the end of the function (before the final co_return or after the last co_await).
```cpp
// "*.example.com" should match "foo.example.com" but not "example.com"
auto suffix = glob.slice(1); // ".example.com"
if (hostname.size() <= suffix.size()) return false;
return asciiCaseInsensitiveEquals(
```
Medium: suffix match lacks dot-boundary check in hostname. With glob `*.example.com`, the suffix is `.example.com`. The check `hostname.size() > suffix.size()` ensures at least one character before the suffix, but doesn't appear to verify that the matching portion of `hostname` ends at a label boundary.

On re-checking, though, the concern doesn't hold. `notexample.com` is 14 characters and `.example.com` is 12, so `hostname.slice(14 - 12)` yields `texample.com`, whose first character `t` differs from `.`, and `asciiCaseInsensitiveEquals` returns false. Because the retained suffix begins with `.`, a match is only possible when the hostname has a dot at exactly that position — the suffix itself enforces the dot boundary. The `<=` size check at line 187 likewise correctly rejects exact matches like `example.com` (and same-length hostnames like `xexample.com`) against `*.example.com`.

I apologize for the false alarm — this logic is correct.
Nit: extra blank line.
```cpp
headers.setPtr(kj::HttpHeaderId::CONTENT_TYPE, "application/x-tar");
headers.set(kj::HttpHeaderId::CONTENT_LENGTH, kj::str(tar.size()));

auto endpoint = kj::str("/containers/", container, "/archive?path=", dir);
```
Low: dir is not URL-encoded in the query string. Currently all call sites pass hardcoded safe paths (/etc, /etc/ssl/certs, etc.), so there is no bug today. But writeFileToContainer is a general helper — if it's ever called with a path containing #, &, =, or spaces, the Docker API request will be malformed.
Consider using kj::encodeUriComponent on dir:
```diff
-auto endpoint = kj::str("/containers/", container, "/archive?path=", dir);
+auto endpoint = kj::str("/containers/", container, "/archive?path=", kj::encodeUriComponent(dir));
```
Review submitted on PR #6263. I noted the self-correction on issue #3 (the `matchSniGlob` dot-boundary concern turned out to be a false alarm), plus one low-severity defensive improvement (URL-encoding `dir`).
`interceptOutboundHttps` is a way for users to intercept their own TLS traffic.

The way this works is different from `interceptOutboundHttp`. With `interceptOutboundHttp`, we can decide which IP and port combinations should intercept HTTP traffic, but HTTPS is more idiomatic to handle at the SNI level: customers who want to intercept TLS may want to trigger only on certain SNIs. We currently support ports other than 443, but that might change in the future by extending the method to accept a port alongside the SNI; for now the use case is clearly SNI-based only.

The glob format of the SNI we accept is really simple: only `*` and the domain (to support cases like `*google.com` and all its subdomains). No plans on supporting regex here whatsoever.

The way local dev works is that we generate the certificates in the networking sidecar, read them via exec, and write them to the container at a known path (`/etc/cloudflare/certs/cloudflare-containers-ca.crt`). We could try to append the certificate to known distro paths, but that might be a more controversial move; we can discuss in the MR whether it's worth doing.

The flow of the connection is:

```
[container] --> [proxy-everything] (tls) --> [workerd container-client.c++] (processes configured egress policies) -> [workerd subrequest channel]
```

The only way to write files consistently across distros is to use the Docker /archive API. It only accepts a tar right now, so we had to add a method that creates a simple tar file containing the single file we want to add to the container (the CA).