2 changes: 1 addition & 1 deletion infra/main.bicep
@@ -1112,7 +1112,7 @@ module containerAppBackend 'br/public:avm/res/app/container-app:0.19.0' = {
}
]
ingressTargetPort: 8000
ingressExternal: true
ingressExternal: !enablePrivateNetworking
scaleSettings: {
// maxReplicas: enableScalability ? 3 : 1
maxReplicas: 1 // maxReplicas set to 1 (not 3) due to multiple agents created per type during WAF deployment
9 changes: 9 additions & 0 deletions infra/main.parameters.json
@@ -47,6 +47,15 @@
"vmAdminPassword": {
"value": "${AZURE_ENV_VM_ADMIN_PASSWORD}"
},
"enableMonitoring": {
"value": true
},
"enablePrivateNetworking": {
"value": true
},
Comment on lines +53 to +55
Copilot AI Apr 20, 2026

The PR description (expected behavior) says backend Container App endpoints should be internal-only, that is, not directly reachable from the public internet. This PR adjusts frontend routing and proxying, but the infrastructure still appears to deploy the backend Container App with external ingress enabled (ingressExternal: true in infra/main.bicep), which keeps the backend publicly reachable through its FQDN. If the goal is to prevent direct access, the infra likely also needs to disable external ingress (or add a Private Endpoint or internal ingress) for the backend when WAF/private networking is enabled.

Copilot uses AI. Check for mistakes.
"enableScalability": {
"value": true
Comment on lines +51 to +57
Copilot AI Apr 20, 2026

Setting enableMonitoring/enablePrivateNetworking/enableScalability to true in infra/main.parameters.json changes the documented "Development/Testing (Default)" sandbox configuration to a more production-like setup (higher cost and stricter networking). If the intent is to keep sandbox minimal, consider leaving these parameters out (so main.bicep defaults apply) or setting them to false, and rely on main.waf.parameters.json for the production defaults per docs/DeploymentGuide.md.

Suggested change
"value": true
},
"enablePrivateNetworking": {
"value": true
},
"enableScalability": {
"value": true
"value": false
},
"enablePrivateNetworking": {
"value": false
},
"enableScalability": {
"value": false

},
"aiModelDeployments": {
"value": [
{
2 changes: 1 addition & 1 deletion infra/main_custom.bicep
@@ -1063,7 +1063,7 @@ module containerAppBackend 'br/public:avm/res/app/container-app:0.19.0' = {
}
]
ingressTargetPort: 8000
ingressExternal: true
ingressExternal: !enablePrivateNetworking
scaleSettings: {
// maxReplicas: enableScalability ? 3 : 1
maxReplicas: 1 // maxReplicas set to 1 (not 3) due to multiple agents created per type during WAF deployment
115 changes: 112 additions & 3 deletions src/frontend/frontend_server.py
@@ -1,15 +1,22 @@
import asyncio
import os

import httpx
import uvicorn
import websockets
from dotenv import load_dotenv
from fastapi import FastAPI
from fastapi import FastAPI, Request, WebSocket, WebSocketDisconnect
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import FileResponse
from fastapi.responses import FileResponse, Response
from fastapi.staticfiles import StaticFiles

# Load environment variables from .env file
load_dotenv()

# Internal backend URL used by the server-side proxy.
# The browser never contacts this URL directly.
BACKEND_API_URL = os.getenv("API_URL", "http://localhost:8000").rstrip("/")

app = FastAPI()

app.add_middleware(
@@ -38,7 +45,11 @@ async def serve_index():
@app.get("/config")
async def get_config():
config = {
"API_URL": os.getenv("API_URL", "API_URL not set"),
# Return empty string so the browser uses relative /api/* paths
# which are proxied server-side to BACKEND_API_URL. This ensures
# backend Container Apps with internal-only ingress are never
# contacted directly from the browser.
"API_URL": "",
Comment on lines +48 to +52
Copilot AI Apr 20, 2026

The /config response now always sets "API_URL" to an empty string. The React app also uses config.API_URL as a base URL (e.g., for its /health connectivity check), so returning "" can cause those calls to hit the frontend server instead of the backend/proxy. Consider returning a safe public base (e.g., derived from the request host) or updating the frontend config contract so consumers don't treat API_URL as a backend base URL.
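
One way to read the "safe public base" option is to build the value from the URL the browser actually requested. The sketch below is illustrative only: `public_api_base` is a hypothetical helper that is not part of this PR, and it assumes the reverse proxy stays mounted at `/api`. In the FastAPI handler the input would presumably be `str(request.url)`.

```python
from urllib.parse import urlsplit


def public_api_base(request_url: str, prefix: str = "/api") -> str:
    """Return '<scheme>://<host><prefix>' for the URL the browser requested.

    Hypothetical helper: the name and the '/api' default are assumptions.
    """
    parts = urlsplit(request_url)
    if not parts.scheme or not parts.netloc:
        # A relative URL carries no origin, so there is nothing safe to return.
        raise ValueError(f"not an absolute URL: {request_url!r}")
    # Path and query of the incoming request are dropped on purpose.
    return f"{parts.scheme}://{parts.netloc}{prefix}"
```

Behind a TLS-terminating ingress the server may see `http` as the scheme unless forwarded headers are honored, so the derived value should be checked in the WAF deployment before relying on it.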

"REACT_APP_MSAL_AUTH_CLIENTID": os.getenv(
"REACT_APP_MSAL_AUTH_CLIENTID", "Client ID not set"
),
@@ -56,6 +67,104 @@ async def get_config():
return config


# ---------------------------------------------------------------------------
# Reverse proxy: WebSocket (must be declared before the HTTP catch-all below)
# ---------------------------------------------------------------------------

@app.websocket("/api/socket/{batch_id}")
async def proxy_websocket(websocket: WebSocket, batch_id: str):
"""Proxy WebSocket connections from the browser to the internal backend."""
await websocket.accept()

backend_ws_url = (
BACKEND_API_URL
.replace("https://", "wss://")
.replace("http://", "ws://")
)
backend_ws_url = f"{backend_ws_url}/api/socket/{batch_id}"

try:
async with websockets.connect(backend_ws_url) as backend_ws:

async def forward_to_backend():
try:
while True:
data = await websocket.receive_text()
await backend_ws.send(data)
except (WebSocketDisconnect, Exception):
pass

async def forward_to_client():
try:
async for message in backend_ws:
await websocket.send_text(message)
except (WebSocketDisconnect, Exception):
pass

await asyncio.gather(forward_to_backend(), forward_to_client())
Copilot AI Apr 20, 2026

The WebSocket proxy uses asyncio.gather(forward_to_backend(), forward_to_client()). If either side exits (e.g., backend closes), the other coroutine can remain blocked (e.g., waiting on websocket.receive_text()), leaving the client connection open even though the backend is gone. Consider creating tasks and cancelling the remaining task when one completes, and ensure both websockets are closed with appropriate close codes.
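
A minimal sketch of the "cancel the remaining task" idea, using only the standard library. `run_until_first_exit` is a hypothetical helper name, and the two forwarding coroutines from the diff are assumed to be passed in unchanged. Closing both sockets with suitable close codes is left out here.

```python
import asyncio
from typing import Any, Coroutine


async def run_until_first_exit(*directions: Coroutine[Any, Any, None]) -> None:
    """Run the forwarding coroutines concurrently; stop all when one exits."""
    tasks = [asyncio.create_task(d) for d in directions]
    try:
        # Returns as soon as either direction finishes or raises.
        await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    finally:
        # Cancel whichever direction is still blocked on a receive call.
        for task in tasks:
            if not task.done():
                task.cancel()
        # Wait for the cancellations to settle. Exceptions are collected
        # here rather than re-raised; a real proxy would inspect and log them.
        await asyncio.gather(*tasks, return_exceptions=True)
```

With this shape, `await run_until_first_exit(forward_to_backend(), forward_to_client())` would replace the `asyncio.gather(...)` call, so a backend close no longer leaves the browser side waiting indefinitely.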

except Exception:
pass
finally:
Comment on lines +94 to +107
Copilot AI Apr 20, 2026

The WebSocket proxy broadly catches Exception and silently passes (both in the per-direction loops and around the whole connection). This will mask backend connection failures and make diagnosing WAF/proxy issues difficult in production. Consider logging exceptions (at least at debug/info) and handling expected disconnect exceptions explicitly (e.g., ConnectionClosed) rather than swallowing everything.
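
As a rough illustration of handling expected disconnects explicitly while logging everything else, the sketch below wraps one forwarding direction. `guarded`, the logger name, and `PeerClosed` are all hypothetical; `PeerClosed` merely stands in for the real disconnect exceptions (`WebSocketDisconnect`, `ConnectionClosed`) so the example runs without third-party packages.

```python
import asyncio
import logging
from typing import Any, Coroutine

logger = logging.getLogger("frontend_proxy")  # hypothetical logger name


class PeerClosed(Exception):
    """Stand-in for WebSocketDisconnect / ConnectionClosed (assumption)."""


async def guarded(direction: str, coro: Coroutine[Any, Any, None]) -> str:
    """Await one forwarding direction and report how it ended."""
    try:
        await coro
        return "finished"
    except PeerClosed:
        # Expected during normal teardown, so record it at a low level.
        logger.info("%s: peer closed the connection", direction)
        return "closed"
    except Exception:
        # Unexpected failure: keep the traceback instead of discarding it.
        logger.exception("%s: unexpected proxy failure", direction)
        return "failed"
```

Because `asyncio.CancelledError` derives from `BaseException` on Python 3.8 and later, the broad `except Exception` here does not intercept task cancellation.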

try:
await websocket.close()
except Exception:
pass


# ---------------------------------------------------------------------------
# Reverse proxy: HTTP (all /api/* routes proxied to the internal backend)
# ---------------------------------------------------------------------------

_PROXY_CLIENT = httpx.AsyncClient(timeout=300.0)
Copilot AI Apr 20, 2026

_PROXY_CLIENT is created as a module-level httpx.AsyncClient and never closed. Over time/reloads this can leak connections/file descriptors. Consider creating the client in a FastAPI lifespan/startup event and closing it on shutdown (or using a context manager per request if simplicity is preferred).
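
The lifespan approach could look roughly like the sketch below. To keep it runnable without dependencies, `StubClient` and `StubApp` are stand-ins (assumptions) for `httpx.AsyncClient` and the FastAPI app; in the real server the function would presumably be passed as `FastAPI(lifespan=lifespan)` and the client created with `httpx.AsyncClient(timeout=300.0)`.

```python
import asyncio
from contextlib import asynccontextmanager
from types import SimpleNamespace
from typing import AsyncIterator


class StubClient:
    """Stand-in for httpx.AsyncClient; only aclose() is modelled."""

    def __init__(self) -> None:
        self.closed = False

    async def aclose(self) -> None:
        self.closed = True


class StubApp:
    """Stand-in for the FastAPI app; only the `state` attribute is used."""

    def __init__(self) -> None:
        self.state = SimpleNamespace()


@asynccontextmanager
async def lifespan(app: StubApp) -> AsyncIterator[None]:
    # Startup: create one shared client for the whole process.
    app.state.proxy_client = StubClient()
    try:
        yield
    finally:
        # Shutdown: release pooled connections even if startup work failed later.
        await app.state.proxy_client.aclose()
```

Route handlers would then read the client from `request.app.state.proxy_client` instead of the module-level `_PROXY_CLIENT`.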



@app.api_route(
"/api/{path:path}",
methods=["GET", "POST", "PUT", "DELETE", "PATCH", "OPTIONS", "HEAD"],
)
async def proxy_api(request: Request, path: str):
"""Proxy HTTP API requests from the browser to the internal backend."""
target_url = f"{BACKEND_API_URL}/api/{path}"
if request.url.query:
target_url = f"{target_url}?{request.url.query}"

# Forward all headers except 'host' (would confuse the backend)
headers = {
k: v for k, v in request.headers.items()
if k.lower() != "host"
}

body = await request.body()

response = await _PROXY_CLIENT.request(
method=request.method,
url=target_url,
headers=headers,
content=body,
)

# Strip hop-by-hop headers that must not be forwarded
excluded_headers = {
"content-encoding", "transfer-encoding", "connection",
"keep-alive", "proxy-authenticate", "proxy-authorization",
"te", "trailers", "upgrade",
}
forwarded_headers = {
k: v for k, v in response.headers.items()
if k.lower() not in excluded_headers
}

return Response(
content=response.content,
status_code=response.status_code,
headers=forwarded_headers,
)


# ---------------------------------------------------------------------------
# SPA catch-all (must be last)
# ---------------------------------------------------------------------------

@app.get("/{full_path:path}")
async def serve_app(full_path: str):
# Remediation: normalize and check containment before serving
4 changes: 3 additions & 1 deletion src/frontend/requirements.txt
@@ -4,4 +4,6 @@ uvicorn[standard]
jinja2
azure-identity
python-dotenv
python-multipart
python-multipart
httpx
websockets
129 changes: 86 additions & 43 deletions src/frontend/src/api/WebSocketService.tsx
@@ -1,64 +1,107 @@
import { getApiUrl } from '../api/config';
import { getApiUrl, headerBuilder } from '../api/config';

// WebSocketService.ts
// Polling-based status stream service that preserves the existing event interface.
type EventHandler = (data: any) => void;

class WebSocketService {
private socket: WebSocket | null = null;
private pollInterval: ReturnType<typeof setInterval> | null = null;
private isConnected = false;
private activeBatchId: string | null = null;
private lastKnownStatus: Record<string, string> = {};
private eventHandlers: Record<string, EventHandler[]> = {};

connect(batch_id: string): void {
let apiUrl = getApiUrl();
console.log('API URL: websocket', apiUrl);
if (apiUrl) {
apiUrl = apiUrl.replace(/^https?/, match => match === "https" ? "wss" : "ws");
} else {
throw new Error('API URL is null');
private async pollBatchSummary(batchId: string): Promise<void> {
const apiUrl = getApiUrl();
if (!apiUrl) {
this._emit('error', new Error('API URL is null'));
return;
}
console.log('Connecting to WebSocket:', apiUrl);
if (this.socket) return; // Prevent duplicate connections
this.socket = new WebSocket(`${apiUrl}/socket/${batch_id}`);

this.socket.onopen = () => {
console.log('WebSocket connection opened.');
this._emit('open', undefined);
};

this.socket.onmessage = (event: MessageEvent) => {
try {
const data = JSON.parse(event.data);
this._emit('message', data);
} catch (err) {
console.error('Error parsing message:', err);

try {
const response = await fetch(`${apiUrl}/batch-summary/${batchId}`, {
headers: headerBuilder({}),
});

if (!response.ok) {
throw new Error(`Failed to fetch batch status: ${response.status}`);
}
};

this.socket.onerror = (error: Event) => {
console.error('WebSocket error:', error);
const payload = await response.json();
const files = payload?.files || [];
let allFilesTerminal = files.length > 0;

for (const file of files) {
const fileId = file?.file_id;
const status = (file?.status || '').toLowerCase();
if (!fileId || !status) {
continue;
}

if (!['completed', 'failed', 'error'].includes(status)) {
allFilesTerminal = false;
}

const previousStatus = this.lastKnownStatus[fileId];
if (previousStatus !== status) {
this.lastKnownStatus[fileId] = status;

this._emit('message', {
batch_id: batchId,
file_id: fileId,
agent_type: 'Polling agent',
agent_message: `Status changed to ${status}`,
process_status: status,
file_result: file?.file_result || null,
});
}
}

if (allFilesTerminal) {
this.disconnect();
}
} catch (error) {
this._emit('error', error);
};
}
}

connect(batch_id: string): void {
if (this.isConnected && this.activeBatchId === batch_id) return;

this.disconnect();

this.isConnected = true;
this.activeBatchId = batch_id;
this.lastKnownStatus = {};
this._emit('open', undefined);

this.socket.onclose = (event: CloseEvent) => {
console.log('WebSocket closed:', event);
this._emit('close', event);
this.socket = null;
};
// Poll once immediately, then at a fixed interval.
void this.pollBatchSummary(batch_id);
this.pollInterval = setInterval(() => {
if (this.isConnected && this.activeBatchId) {
void this.pollBatchSummary(this.activeBatchId);
}
}, 3000);
}

disconnect(): void {
if (this.socket) {
this.socket.close();
this.socket = null;
console.log('WebSocket connection closed manually.');
if (this.pollInterval) {
clearInterval(this.pollInterval);
this.pollInterval = null;
}

const wasConnected = this.isConnected;
this.isConnected = false;
this.activeBatchId = null;
this.lastKnownStatus = {};

if (wasConnected) {
this._emit('close', { reason: 'polling_stopped' });
}
}

send(data: any): void {
if (this.socket && this.socket.readyState === WebSocket.OPEN) {
this.socket.send(JSON.stringify(data));
} else {
console.error('WebSocket is not open. Cannot send:', data);
}
// Polling transport is read-only from client perspective.
console.debug('send() is ignored in polling mode:', data);
}

on(event: string, handler: EventHandler): void {
7 changes: 5 additions & 2 deletions src/frontend/src/api/config.js
@@ -49,8 +49,11 @@ export function getApiUrl() {
}

if (!API_URL) {
console.warn('API URL not yet configured');
return null;
// API_URL is not configured (e.g. WAF deployment where the backend is
// internal-only). Fall back to the browser's own origin so that all
// /api/* requests are routed through the frontend server's reverse proxy
// instead of attempting to reach the internal backend URL directly.
return `${window.location.origin}/api`;
}

return API_URL;
3 changes: 1 addition & 2 deletions src/frontend/vite.config.js
@@ -10,10 +10,9 @@ export default defineConfig({
'/api': {
target: 'http://localhost:8000',
changeOrigin: true,
rewrite: (path) => path.replace(/^\/api/, '')
},
'/config': {
target: 'http://localhost:8000',
target: 'http://localhost:3000',
changeOrigin: true
}
Comment on lines 14 to 17
Copilot AI Apr 20, 2026

Vite dev-server now proxies /config to http://localhost:3000. If the Python frontend_server isn't running locally, this proxy can cause /config to fail differently than before (proxy error vs. backend 404), and may prevent the app from falling back to defaultConfig depending on how the proxy error surfaces. Consider leaving /config unproxied (so it 404s and the frontend uses defaults), or making the /config proxy conditional via an env flag.

}