fix: WAF frontend-to-backend routing and internal backend access #414

Ashwal-Microsoft wants to merge 4 commits into dev from
Conversation
Pull request overview
This PR aims to support the WAF deployment model by preventing browsers from calling the backend Container App endpoint directly, instead routing /api/* calls through the frontend server as a reverse proxy.
Changes:
- Updated frontend runtime config handling so the browser uses relative `/api/*` paths (falling back to `window.location.origin`) rather than a configured backend URL.
- Added a FastAPI-based reverse proxy in the frontend server for both HTTP `/api/*` and WebSocket `/api/socket/{batch_id}` traffic to the internal backend URL.
- Adjusted local dev proxying and infra parameter defaults (and introduced an additional WAF parameters "copy" file).
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| src/frontend/vite.config.js | Adjusts dev-server proxy targets for /api and /config. |
| src/frontend/src/api/config.js | Changes API base URL resolution to fall back to same-origin /api. |
| src/frontend/requirements.txt | Adds Python deps needed for proxying (httpx/websockets). |
| src/frontend/frontend_server.py | Implements /api HTTP + WebSocket reverse proxy; changes /config API_URL behavior. |
| infra/main.waf.parameters copy.json | Adds a duplicate WAF parameters file (with a “copy” filename). |
| infra/main.parameters.json | Enables monitoring/private networking/scalability by default. |
```json
        "value": true
      },
      "enablePrivateNetworking": {
        "value": true
      },
      "enableScalability": {
        "value": true
```
Setting enableMonitoring/enablePrivateNetworking/enableScalability to true in infra/main.parameters.json changes the documented "Development/Testing (Default)" sandbox configuration to a more production-like setup (higher cost and stricter networking). If the intent is to keep sandbox minimal, consider leaving these parameters out (so main.bicep defaults apply) or setting them to false, and rely on main.waf.parameters.json for the production defaults per docs/DeploymentGuide.md.
Suggested change:

```diff
-        "value": true
+        "value": false
       },
       "enablePrivateNetworking": {
-        "value": true
+        "value": false
       },
       "enableScalability": {
-        "value": true
+        "value": false
```
```json
      "enablePrivateNetworking": {
        "value": true
      },
```
PR description/expected behavior says backend Container App endpoints should be internal-only (not directly accessible from the public internet). This PR adjusts frontend routing/proxying, but the infrastructure still appears to deploy the backend Container App with external ingress enabled (ingressExternal: true in infra/main.bicep), which would keep the backend publicly reachable by its FQDN. If the goal is to prevent direct access, the infra likely also needs to disable external ingress (or add Private Endpoint/internal ingress) for the backend when WAF/private networking is enabled.
```python
            # Return empty string so the browser uses relative /api/* paths
            # which are proxied server-side to BACKEND_API_URL. This ensures
            # backend Container Apps with internal-only ingress are never
            # contacted directly from the browser.
            "API_URL": "",
```
The /config response now always sets "API_URL" to an empty string. The React app also uses config.API_URL as a base URL (e.g., for its /health connectivity check), so returning "" can cause those calls to hit the frontend server instead of the backend/proxy. Consider returning a safe public base (e.g., derived from the request host) or updating the frontend config contract so consumers don't treat API_URL as a backend base URL.
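One way to keep the config contract intact is to derive a same-origin base from the incoming request instead of returning an empty string. A minimal sketch (the `public_api_base` helper is hypothetical, not part of this PR; in FastAPI you would call it with `str(request.base_url)`):

```python
from urllib.parse import urlsplit


def public_api_base(request_base_url: str) -> str:
    # Build a same-origin /api base from the request's own origin, so the
    # browser still receives a usable base URL but never learns the
    # internal backend FQDN.
    parts = urlsplit(request_base_url)
    return f"{parts.scheme}://{parts.netloc}/api"


# e.g. in the /config handler:
#   {"API_URL": public_api_base(str(request.base_url))}
```

With this shape, consumers such as the React `/health` connectivity check would still hit the proxy rather than the frontend's own routes.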
```python
        except (WebSocketDisconnect, Exception):
            pass

        await asyncio.gather(forward_to_backend(), forward_to_client())
```
The WebSocket proxy uses asyncio.gather(forward_to_backend(), forward_to_client()). If either side exits (e.g., backend closes), the other coroutine can remain blocked (e.g., waiting on websocket.receive_text()), leaving the client connection open even though the backend is gone. Consider creating tasks and cancelling the remaining task when one completes, and ensure both websockets are closed with appropriate close codes.
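One pattern for this, sketched here with plain asyncio (the names and the demo coroutines are illustrative, not taken from the PR):

```python
import asyncio


async def pump_both_directions(forward_to_backend, forward_to_client):
    # Run both pumps; as soon as one direction finishes (its peer closed),
    # cancel the other so it is not left blocked on a dead connection.
    tasks = [
        asyncio.create_task(forward_to_backend()),
        asyncio.create_task(forward_to_client()),
    ]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    # return_exceptions=True swallows the CancelledError from the
    # cancelled side instead of re-raising it here.
    await asyncio.gather(*tasks, return_exceptions=True)
```

After this returns, the proxy can close both websockets in its `finally` block with an appropriate close code.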
```python
        except (WebSocketDisconnect, Exception):
            pass

        async def forward_to_client():
            try:
                async for message in backend_ws:
                    await websocket.send_text(message)
            except (WebSocketDisconnect, Exception):
                pass

        await asyncio.gather(forward_to_backend(), forward_to_client())
    except Exception:
        pass
    finally:
```
The WebSocket proxy broadly catches Exception and silently passes (both in the per-direction loops and around the whole connection). This will mask backend connection failures and make diagnosing WAF/proxy issues difficult in production. Consider logging exceptions (at least at debug/info) and handling expected disconnect exceptions explicitly (e.g., ConnectionClosed) rather than swallowing everything.
```python
# Reverse proxy: HTTP (all /api/* routes proxied to the internal backend)
# ---------------------------------------------------------------------------

_PROXY_CLIENT = httpx.AsyncClient(timeout=300.0)
```
_PROXY_CLIENT is created as a module-level httpx.AsyncClient and never closed. Over time/reloads this can leak connections/file descriptors. Consider creating the client in a FastAPI lifespan/startup event and closing it on shutdown (or using a context manager per request if simplicity is preferred).
```diff
     '/config': {
-      target: 'http://localhost:8000',
+      target: 'http://localhost:3000',
       changeOrigin: true
     }
```
Vite dev-server now proxies /config to http://localhost:3000. If the Python frontend_server isn't running locally, this proxy can cause /config to fail differently than before (proxy error vs. backend 404), and may prevent the app from falling back to defaultConfig depending on how the proxy error surfaces. Consider leaving /config unproxied (so it 404s and the frontend uses defaults), or making the /config proxy conditional via an env flag.
```json
{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "solutionName": {
      "value": "${AZURE_ENV_NAME}"
    },
    "location": {
      "value": "${AZURE_LOCATION}"
    },
    "deploymentType": {
      "value": "${AZURE_ENV_MODEL_DEPLOYMENT_TYPE}"
    },
    "gptModelName": {
      "value": "${AZURE_ENV_GPT_MODEL_NAME}"
    },
    "gptDeploymentCapacity": {
      "value": "${AZURE_ENV_GPT_MODEL_CAPACITY}"
    },
    "gptModelVersion": {
      "value": "${AZURE_ENV_GPT_MODEL_VERSION}"
    },
    "imageTag": {
      "value": "${AZURE_ENV_IMAGE_TAG=latest}"
    },
    "containerRegistryEndpoint": {
      "value": "${AZURE_ENV_CONTAINER_REGISTRY_ENDPOINT=cmsacontainerreg.azurecr.io}"
    },
    "existingLogAnalyticsWorkspaceId": {
      "value": "${AZURE_ENV_EXISTING_LOG_ANALYTICS_WORKSPACE_RID}"
    },
    "existingFoundryProjectResourceId": {
      "value": "${AZURE_EXISTING_AIPROJECT_RESOURCE_ID}"
    },
    "secondaryLocation": {
      "value": "${AZURE_ENV_SECONDARY_LOCATION}"
    },
    "azureAiServiceLocation": {
      "value": "${AZURE_ENV_AI_SERVICE_LOCATION}"
    },
    "vmSize": {
      "value": "${AZURE_ENV_VM_SIZE}"
    },
    "vmAdminUsername": {
      "value": "${AZURE_ENV_VM_ADMIN_USERNAME}"
    },
    "vmAdminPassword": {
      "value": "${AZURE_ENV_VM_ADMIN_PASSWORD}"
    },
    "enableMonitoring": {
      "value": true
    },
    "enablePrivateNetworking": {
      "value": true
    },
    "enableScalability": {
      "value": true
    },
    "aiModelDeployments": {
      "value": [
        {
          "name": "${AZURE_ENV_GPT_MODEL_NAME}",
          "model": {
            "name": "${AZURE_ENV_GPT_MODEL_NAME}",
            "version": "${AZURE_ENV_GPT_MODEL_VERSION}"
          },
          "sku": {
            "name": "${AZURE_ENV_MODEL_DEPLOYMENT_TYPE}",
            "capacity": "${AZURE_ENV_GPT_MODEL_CAPACITY}"
          }
        }
      ]
    }
  }
}
```
This new parameters file appears to be a duplicate of infra/main.waf.parameters.json but with a space/"copy" in the filename. Keeping this in-repo is likely accidental and can confuse users/tooling that discover parameter files. Consider removing it or renaming it to a meaningful, documented variant name.
Purpose
When deploying these accelerators using the WAF deployment option, the expectation is that the backend APIs hosted on Container Apps are secured behind the WAF and are not directly accessible from the public internet. However, the current implementation leaves the Container App API endpoints publicly exposed, which undermines the security posture of the WAF deployment model.
Expected Behavior
The Web App (frontend) should remain publicly accessible through the WAF/Application Gateway.
All backend Container App API endpoints should be private and only accessible internally (e.g., via Container Apps Environment VNet integration, internal ingress, or Private Endpoints).
External users should not be able to directly call the backend Container App API endpoints from outside the network boundary.
Acceptance Criteria
When the WAF deployment option is selected, all backend Container App endpoints are configured with internal-only ingress (no external ingress).
The frontend remains publicly accessible through the WAF/Application Gateway.
No backend Container App API endpoint is reachable from the public internet when using the WAF deployment.
Existing functionality of all 4 accelerators (Deploy your AI Application, Container Migration, Content Processing, Code Modernization) is not impacted by the networking changes.
Container Apps Environment is configured with VNet integration and internal ingress for backend services.
Network Security Groups (NSGs) and traffic rules are validated to ensure only internal traffic reaches backend Container Apps.
Documentation is updated to reflect the WAF deployment architecture and network topology for Container App-based accelerators.
Does this introduce a breaking change?
Golden Path Validation
Deployment Validation
What to Check
Verify that the following are valid
Other Information