feat: add first-class sandbox-to-sandbox network policy primitive

## Problem Statement

Multi-agent systems that run each agent component (reasoning, orchestration, action execution) in its own isolated sandbox need to communicate directly between sandboxes. Today, OpenShell's default-deny network policy blocks all inter-sandbox egress, and there is no first-class policy primitive to express sandbox-to-sandbox intent. Operators must manually determine ephemeral pod IPs and write raw CIDR rules — an impractical requirement given that pod IPs change on rescheduling.

## Technical Context

The SSRF enforcement layer in the proxy already supports RFC1918 destinations via `allowed_ips` on a `NetworkEndpointDef`. The gap is not in enforcement — it is in the policy authoring surface and the gateway's failure to automatically resolve and inject a peer sandbox's current pod IP at policy-load time. An operator who manually writes the correct `allowed_ips` CIDR today can already reach another sandbox pod through the proxy. What does not exist is the abstraction that makes this declarative and durable across pod rescheduling.

## Affected Components

| Component | Key Files | Role |
|-----------|-----------|------|
| Policy engine | `crates/openshell-policy/src/lib.rs` | `NetworkEndpointDef` schema, where the `target_sandbox_id` field would be added |
| Proxy SSRF enforcement | `crates/openshell-supervisor-network/src/proxy.rs` | Evaluates OPA decision + resolves IPs; already handles `allowed_ips` path |
| Policy composition | `crates/openshell-policy/src/compose.rs` | Where gateway resolves sandbox pod IP and injects into policy before sending to supervisor |
| OPA input construction | `crates/openshell-supervisor-network/src/opa.rs` | Builds the input document for per-request evaluation; `allowed_ips` is set here |
| nftables namespace rules | `crates/openshell-supervisor-process/src/netns/nft_ruleset.rs` | Applies inside sandbox network namespace — does not need to change |

## Technical Investigation

### Architecture Overview

Sandbox egress is enforced at two independent layers:

**Layer 1 — OPA + SSRF (primary, per-request, in the proxy process):**
`handle_tcp_connection` in `proxy.rs` (line ~700) evaluates each CONNECT request through OPA, then resolves the destination through one of four SSRF paths depending on what the matching policy rule declares. The relevant path for RFC1918 is `resolve_and_check_allowed_ips` (line 2574): if the policy rule has `allowed_ips` populated, the resolved IP must fall within those CIDRs. Without `allowed_ips`, the fallback `resolve_and_reject_internal` (line 2556) calls `is_internal_ip`, which blocks all RFC1918 space — this is where sandbox-to-sandbox traffic dies today.
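
The two SSRF paths described above can be sketched as follows. This is a simplified illustration, not the code in `proxy.rs`: `Cidr`, `Verdict`, `is_internal`, and `check_destination` are hypothetical names, only IPv4 is handled, and the always-blocked checks are left out.

```rust
use std::net::Ipv4Addr;

/// Hypothetical IPv4 CIDR type used only for this sketch.
#[derive(Debug, Clone, Copy)]
struct Cidr {
    base: Ipv4Addr,
    prefix_len: u8,
}

impl Cidr {
    fn contains(&self, ip: Ipv4Addr) -> bool {
        let len = u32::from(self.prefix_len.min(32));
        // A /0 prefix would require a 32-bit shift, which overflows, so handle it explicitly.
        let mask = if len == 0 { 0 } else { u32::MAX << (32 - len) };
        (u32::from(ip) & mask) == (u32::from(self.base) & mask)
    }
}

#[derive(Debug, PartialEq)]
enum Verdict {
    Allow,
    Deny,
}

/// Stand-in for `is_internal_ip`; the real function may cover more ranges.
fn is_internal(ip: Ipv4Addr) -> bool {
    ip.is_private() || ip.is_loopback() || ip.is_link_local()
}

/// Decide whether the proxy may dial `resolved`, given the matching rule's `allowed_ips`.
fn check_destination(resolved: Ipv4Addr, allowed_ips: &[Cidr]) -> Verdict {
    if allowed_ips.is_empty() {
        // Default path (compare `resolve_and_reject_internal`): internal space is rejected.
        if is_internal(resolved) {
            Verdict::Deny
        } else {
            Verdict::Allow
        }
    } else if allowed_ips.iter().any(|cidr| cidr.contains(resolved)) {
        // Declared path (compare `resolve_and_check_allowed_ips`): IP must be inside a CIDR.
        Verdict::Allow
    } else {
        Verdict::Deny
    }
}
```

With an empty `allowed_ips`, a pod address such as `10.42.3.7` is denied; the same address is allowed once a rule lists `10.42.3.7/32`.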

**Layer 2 — nftables (defence-in-depth, namespace-level):**
`nft_ruleset.rs` installs an `output` chain inside the sandbox's private network namespace that accepts only traffic destined for the proxy's veth IP and rejects everything else. This means sandbox processes cannot bypass the proxy to dial other pods directly. This layer does not need to change — inter-sandbox traffic still flows through the proxy, which then dials the remote sandbox pod from the host network namespace.

**Why nftables does not block the fix:** The proxy runs in the host network namespace and is not subject to the sandbox's nftables rules. Once OPA+SSRF clears the connection, the proxy dials the destination directly and has full access to cluster routing.

### Code References

| Location | Description |
|----------|-------------|
| `crates/openshell-policy/src/lib.rs:89` | `NetworkEndpointDef` struct, where the `target_sandbox_id` field would be added |
| `crates/openshell-supervisor-network/src/proxy.rs:700` | `handle_tcp_connection` — main CONNECT decision tree |
| `crates/openshell-supervisor-network/src/proxy.rs:2556` | `resolve_and_reject_internal` — rejects RFC1918 on the default path |
| `crates/openshell-supervisor-network/src/proxy.rs:2574` | `resolve_and_check_allowed_ips` — permits RFC1918 when `allowed_ips` is declared |
| `crates/openshell-supervisor-network/src/opa.rs:1060` | OPA input construction — where `ep["allowed_ips"]` is set |
| `crates/openshell-supervisor-process/src/netns/nft_ruleset.rs` | Namespace-level nftables rules (no change needed) |
| `crates/openshell-core/src/net.rs` | `is_always_blocked_ip`, `is_internal_ip`, `is_always_blocked_net` |

### Current Behavior

When sandbox A issues a CONNECT to sandbox B's pod IP (an RFC1918 address), OPA may allow the host, but the SSRF fallback path `resolve_and_reject_internal` classifies the resolved IP as internal and returns a 403. The operator has no policy YAML primitive to express "allow traffic to sandbox B". They must know the current pod IP/CIDR and write a raw `allowed_ips` rule, which becomes stale when sandbox B is rescheduled.

### What Would Need to Change

1. **Policy schema** (`crates/openshell-policy/src/lib.rs:89`): Add optional `target_sandbox_id: Option<String>` to `NetworkEndpointDef`. This is the new declarative primitive.

2. **Proto** (`proto/sandbox_policy.proto`): Add `target_sandbox_id` field to `NetworkEndpoint`. Wire format change — backwards compatible (optional field).

3. **Gateway policy composition** (`crates/openshell-policy/src/compose.rs`): When composing policy to send to a supervisor, resolve the current pod IP of any `target_sandbox_id` endpoint via the compute driver / K8s API and inject it as `allowed_ips`. The supervisor's existing config polling loop already re-fetches policy on changes, so rescheduled sandboxes will get updated IPs within one poll interval.

4. **OPA input construction** (`crates/openshell-supervisor-network/src/opa.rs:1060`): No change needed — `allowed_ips` already flows through this path.

5. **CLI/SDK**: Add UX for authoring `target_sandbox_id` rules (policy YAML authoring, `generate-sandbox-policy` skill support).
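
Steps 1 and 3 above could look roughly like the sketch below. It is an assumption-heavy illustration: the real `NetworkEndpointDef` has a different field set, the type of `allowed_ips` is guessed to be a list of CIDR strings, and `SandboxIpResolver`, `StaticResolver`, and `inject_target_ips` are hypothetical names standing in for the compute driver / K8s lookup and the composition step.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// Trimmed-down stand-in for the real struct; only the fields relevant here.
#[derive(Debug, Clone, Default)]
struct NetworkEndpointDef {
    host: String,
    port: u16,
    allowed_ips: Vec<String>,
    /// Proposed new optional field.
    target_sandbox_id: Option<String>,
}

/// Hypothetical abstraction over the compute driver / K8s API lookup.
trait SandboxIpResolver {
    fn pod_ip(&self, sandbox_id: &str) -> Option<IpAddr>;
}

/// In-memory resolver, useful for unit tests.
struct StaticResolver(HashMap<String, IpAddr>);

impl SandboxIpResolver for StaticResolver {
    fn pod_ip(&self, sandbox_id: &str) -> Option<IpAddr> {
        self.0.get(sandbox_id).copied()
    }
}

/// Gateway-side enrichment: turn each resolvable `target_sandbox_id` into a host-route CIDR.
fn inject_target_ips(endpoints: &mut [NetworkEndpointDef], resolver: &dyn SandboxIpResolver) {
    for ep in endpoints.iter_mut() {
        if let Some(id) = ep.target_sandbox_id.as_deref() {
            if let Some(ip) = resolver.pod_ip(id) {
                let cidr = match ip {
                    IpAddr::V4(v4) => format!("{}/32", v4),
                    IpAddr::V6(v6) => format!("{}/128", v6),
                };
                // Keep the operation idempotent across repeated compositions.
                if !ep.allowed_ips.contains(&cidr) {
                    ep.allowed_ips.push(cidr);
                }
            }
            // If the sandbox cannot be resolved, `allowed_ips` is left untouched,
            // so the default SSRF path keeps rejecting internal addresses.
        }
    }
}
```

Injecting a single-address CIDR rather than the pod subnet keeps the grant as narrow as the declared intent; whether an unresolved target should fail closed silently or surface an error is left open here.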

### Alternative Approaches Considered

- **Manual `allowed_ips` today**: Already works. Unblocks single-pod setups immediately but is not operationally durable across rescheduling. Document it as a workaround while this is built.
- **DNS-based resolution (no pod IP injection)**: Use the sandbox's Kubernetes Service DNS name as the `host` and rely on the proxy's DNS resolution. This avoids pod IP tracking but requires each sandbox to have a stable Service. Viable for long-lived sandboxes; less useful for ephemeral ones.

### Patterns to Follow

- `NetworkEndpointDef` field additions follow the existing optional-field pattern (`allowed_ips`, `protocols`).
- Policy composition already has a pattern for gateway-side enrichment before sending to supervisor — follow the same pipeline.
- The `BLOCKED_CONTROL_PLANE_PORTS` list (line 2608 in proxy.rs) must remain enforced even on the `target_sandbox_id` path.

## Proposed Approach

Add `target_sandbox_id` as an optional field on `NetworkEndpointDef` in policy YAML and proto. At policy composition time, the gateway resolves the target sandbox's current pod IP via the compute driver and injects it as `allowed_ips`. The proxy's existing `resolve_and_check_allowed_ips` SSRF path handles enforcement with no changes. The supervisor's config polling loop ensures pod IP updates propagate when a sandbox is rescheduled.

## Scope Assessment

- **Complexity:** Medium
- **Confidence:** High — the enforcement layer already works; this is plumbing and schema work
- **Estimated files to change:** ~6–8
- **Issue type:** `feat`

## Risks & Open Questions

- **Pod IP staleness window**: Between a sandbox rescheduling and the next config poll, the injected `allowed_ips` is stale. The poll interval determines the outage window. Should the gateway proactively push a policy update when it detects a sandbox pod IP change?
- **Circular dependency**: If sandbox A's policy depends on sandbox B's IP, and sandbox B depends on sandbox A's IP, both need to be resolved before either can start. Is this a real scenario and does it need a resolution order?
- **`is_always_blocked_ip` enforcement**: Loopback and link-local remain blocked even via `allowed_ips`. Confirm this is correct for inter-sandbox traffic (it should be — those addresses are never a sandbox pod IP).
- **`BLOCKED_CONTROL_PLANE_PORTS`**: The control-plane port blocklist must remain enforced on the `target_sandbox_id` path. Confirm no sandbox-to-sandbox use case requires etcd/kubelet ports.
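
The last two items amount to checks that run regardless of what `allowed_ips` contains. A minimal sketch of that ordering is shown below, with loud assumptions: `BLOCKED_PORTS_EXAMPLE` is an illustrative list (the conventional etcd and kubelet ports), not the actual contents of `BLOCKED_CONTROL_PLANE_PORTS`, and `is_always_blocked` and `precheck` are hypothetical names covering IPv4 only.

```rust
use std::net::Ipv4Addr;

/// Illustrative only: conventional etcd (2379, 2380) and kubelet (10250) ports.
/// The real `BLOCKED_CONTROL_PLANE_PORTS` list is not reproduced in this issue.
const BLOCKED_PORTS_EXAMPLE: &[u16] = &[2379, 2380, 10250];

/// Stand-in for `is_always_blocked_ip`: loopback and link-local are never dialable.
fn is_always_blocked(ip: Ipv4Addr) -> bool {
    ip.is_loopback() || ip.is_link_local()
}

/// Checks that apply before, and independently of, any `allowed_ips` match.
fn precheck(ip: Ipv4Addr, port: u16) -> Result<(), &'static str> {
    if is_always_blocked(ip) {
        return Err("always-blocked address");
    }
    if BLOCKED_PORTS_EXAMPLE.contains(&port) {
        return Err("control-plane port");
    }
    Ok(())
}
```

Under these assumptions a sandbox pod address on an application port passes, while the same address on port 10250, or any loopback or link-local address, is rejected even if a rule's `allowed_ips` would otherwise match.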

## Test Considerations

- Unit tests for the new `target_sandbox_id` → `allowed_ips` injection in policy composition
- Unit tests in `proxy.rs` mirroring the existing `resolve_and_check_allowed_ips` coverage (lines 3471–3600) for the sandbox-targeted path
- E2e test: sandbox A reaches a service on sandbox B after policy with `target_sandbox_id` is applied (requires `test:e2e` coverage)
- Test the staleness scenario: verify that after sandbox B is rescheduled, the next config poll restores connectivity
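
The staleness test could be modelled without a cluster along the lines below. Everything here is hypothetical scaffolding: `Cluster` is an in-memory map of sandbox ID to pod IP, `compose_allowed_ips` stands in for gateway composition, and `poll_once` stands in for one iteration of the supervisor's config polling loop.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory view of sandbox ID -> current pod IP.
struct Cluster {
    pod_ips: HashMap<String, String>,
}

/// Stand-in for gateway composition: the `allowed_ips` injected for one target sandbox.
fn compose_allowed_ips(cluster: &Cluster, target: &str) -> Vec<String> {
    cluster
        .pod_ips
        .get(target)
        .map(|ip| vec![format!("{}/32", ip)])
        .unwrap_or_default()
}

/// One supervisor poll: fetch freshly composed policy and swap it in if it changed.
/// Returns true when the active policy was replaced.
fn poll_once(active: &mut Vec<String>, cluster: &Cluster, target: &str) -> bool {
    let fresh = compose_allowed_ips(cluster, target);
    if *active != fresh {
        *active = fresh;
        true
    } else {
        false
    }
}
```

A test built on this would poll once, change the target's entry in `pod_ips` to simulate rescheduling, assert that the active policy still holds the old CIDR (the staleness window), then poll again and assert that the new CIDR is in place.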

---
*Created by spike investigation. Use `build-from-issue` to plan and implement.*
