Add security threat model and wire AGENTS.md -> SECURITY.md -> THREAT_MODEL.md #5268
…_MODEL.md

Adds a v0 security threat model (THREAT_MODEL.md), fills the previously-empty SECURITY.md with a disclosure pointer to it, and adds a Security section to AGENTS.md so the AGENTS.md -> SECURITY.md -> THREAT_MODEL.md discoverability chain resolves for automated tooling. The threat model is a provenance-tagged v0 draft for the PMC to review (see the open questions in its section 14). No existing developer guidance in AGENTS.md is changed.

Generated-by: Claude Code (Claude Opus 4.8)
jongyoul
left a comment
Thanks — the framing is right (RBAC is the boundary, not a sandbox; a %sh from a run-capable user is the product working, not RCE). Please keep it.
I left inline comments answering the open questions in §14.
| 1. **Anonymous default.** Proposed: anonymous/no-`shiro.ini` is a *dev- | ||
| convenience*; the supported production posture requires Shiro **or** a | ||
| trusted isolated network. So reports against an internet-exposed anonymous | ||
| instance are `OUT-OF-MODEL: non-default-build`. Correct? (→ §5a, §3, §11a) |
| 2. **`notebook.public=true` default.** Proposed: public-by-default is intended | ||
| convenience; operators needing isolation set it false. A "any user can read | ||
| an empty-ACL note" report is by-design, not a bug. Correct? (→ §5a, §2) |
Yes, that's correct. It's by design.
| 3. **Impersonation off by default.** Proposed: without impersonation, all | ||
| interpreter code legitimately runs as the **server** OS user; this is the | ||
| documented default and not a vulnerability; multi-tenant OS isolation | ||
| requires enabling impersonation. Correct? (→ §3, §5a, §9, §11a) |
| 4. **Binding mode as boundary.** Proposed: shared/scoped/isolated are | ||
| stability/resource controls, **not** security sandboxes; we should state | ||
| that explicitly in §9. Agree? Which is the default? (→ §5a, §9) |
The default is shared mode, and it's not a security sandbox.
| **Wave 2 — properties & enforcement:** | ||
| 5. **URL ACL default.** Are `/interpreter`, `/credential`, `/configurations` | ||
| open to any authenticated role unless `[urls]` restricts them, or is there a | ||
| built-in admin gate? (→ §5a, §8) |
No built-in admin gate. It relies on shiro.ini.
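Since the gate lives entirely in `shiro.ini`, an operator can check whether the `[urls]` section restricts the three sensitive paths named in the question. The sketch below is a rough heuristic under stated assumptions: the function names and the sample rule lines are hypothetical, and the check is a plain substring match, not Shiro's own first-match path evaluation. It only reports paths that have no rule carrying a `roles[...]` filter.

```python
# Paths taken from question 5; the exact URL prefix in a real deployment
# (for example an /api/ prefix) is an assumption to verify locally.
SENSITIVE_PATHS = ("/interpreter", "/credential", "/configurations")


def parse_urls_section(ini_text):
    """Return (pattern, filter_chain) pairs from the [urls] section only."""
    rules = []
    in_urls = False
    for raw in ini_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            in_urls = line.lower() == "[urls]"
            continue
        if in_urls and "=" in line:
            pattern, filters = line.split("=", 1)
            rules.append((pattern.strip(), filters.strip()))
    return rules


def unguarded_paths(ini_text, sensitive=SENSITIVE_PATHS):
    """List sensitive paths with no [urls] rule that requires a role.

    Heuristic only: a path counts as guarded when some rule's pattern
    contains it and that rule's filter chain includes a roles[...] filter.
    """
    rules = parse_urls_section(ini_text)
    missing = []
    for path in sensitive:
        guarded = any(
            path in pattern and "roles[" in filters
            for pattern, filters in rules
        )
        if not guarded:
            missing.append(path)
    return missing
```

A file with no `[urls]` section at all reports all three paths, which matches the answer above: without explicit rules there is nothing restricting them to an admin role.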
| ops that check only client-side? (→ §6, §8) | ||
| 7. **Credential isolation.** Does the credential store guarantee one user | ||
| cannot read another user's injected credentials, including via a shared | ||
| interpreter process? (→ §8, §9) |
| **Wave 3 — surfaces & limits:** | ||
| 8. **First-class interpreters.** Which interpreters/modules are supported for | ||
| security purposes vs. community/unsupported (→ §2/§3 carve-out)? |
All interpreters are supported for security purposes.
| 9. **Resource limits.** Any limits on paragraph/result size, websocket rate, or | ||
| concurrent interpreter launches? Where's the line between in-model pre-auth | ||
| exhaustion and by-design expensive queries? (→ §6, §8) |
VALID-HARDENING. Please surface. There is no rate limit or concurrent-launch cap today. This is an area we'd genuinely like the scan to flag and recommend improvements for.
| 10. **Web-UI hardening.** Does enabling `http_security_headers` give CSRF + XSS | ||
| + clickjacking coverage, or are those partly the operator's job? (→ §9) |
VALID-HARDENING. Please surface. There is no CSP, and CSRF protection is Origin-based only. We'd welcome concrete improvements from the scan.
| 11. **Coexistence.** This is a new `THREAT_MODEL.md`; `SECURITY.md` (currently a | ||
| stub) should point at it as canonical, and the website security pages stay | ||
| the operator how-to. Agree? (→ meta) |
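The maintainer replies recorded in this review could be captured as a small lookup table for triage. The sketch below is illustrative only: the `Answer` class, the `ANSWERS` table, and the `UNCLASSIFIED` fallback label are hypothetical names, and only replies visible in this conversation are included. A disposition is set only where the reply states one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Answer:
    topic: str
    reply: str
    disposition: Optional[str]  # None when the reply gives no disposition


# Keys are the question numbers from section 14 of the threat model.
ANSWERS = {
    2: Answer("notebook.public=true default", "by-design", "by-design"),
    4: Answer("binding mode", "default is shared; not a security sandbox", None),
    5: Answer("URL ACL default", "no built-in admin gate; relies on shiro.ini", None),
    8: Answer("first-class interpreters", "all interpreters are supported", None),
    9: Answer("resource limits", "no rate limit or launch cap today", "VALID-HARDENING"),
    10: Answer("web-UI hardening", "no CSP; CSRF is Origin-based only", "VALID-HARDENING"),
}


def disposition_for(question):
    """Return the recorded disposition, or a hypothetical fallback label."""
    answer = ANSWERS.get(question)
    if answer is None or answer.disposition is None:
        return "UNCLASSIFIED"
    return answer.disposition
```

Questions without a visible reply, or with a reply that states a fact but no disposition, fall through to the fallback so they are not silently treated as findings or as by-design.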
Friendly nudge: this threat-model / discoverability PR is approved and green, so it's ready to merge whenever the PMC has a moment. Merging completes the pre-flight discoverability step (AGENTS.md → SECURITY.md → THREAT_MODEL.md) so an automated security-scan agent can mechanically find the model. No rush, and thanks!

Merged into master (78255fd).

@potiuk Thank you, Jarek. I merged this PR.
This is a v0 draft proposal for the Zeppelin PMC to review — please correct, reject, or discuss as needed. The maintainer is the decision-maker; nothing here is a requirement. The threat model does not need to be "finished" for anything downstream — it just makes automated security review (and triage of inbound reports) far less noisy.
Context. The ASF Security team is preparing the project for an automated agentic security scan we're piloting. Those scans run against a threat model that tells the scanner what's in scope, what's by-design, and what counts as a real finding — without one, the output buries maintainers in noise. This PR proposes the discoverable model plus the wiring the scanner needs.
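To make the "discoverable" part concrete, here is a minimal sketch of how a tool might verify that the AGENTS.md → SECURITY.md → THREAT_MODEL.md chain resolves. It assumes a link counts as present when a file's text mentions the next file's name; the actual scanner's lookup rules are not described in this PR, and the function names are hypothetical.

```python
from pathlib import Path

CHAIN = ("AGENTS.md", "SECURITY.md", "THREAT_MODEL.md")


def load_repo_files(root, names=CHAIN):
    """Read each chain file that exists under root into a dict."""
    root = Path(root)
    return {
        name: (root / name).read_text(encoding="utf-8")
        for name in names
        if (root / name).is_file()
    }


def first_broken_link(files, chain=CHAIN):
    """Return None if the chain resolves, else a short reason string.

    files maps a file name to its text. Each file must exist and mention
    the next file in the chain by name (an assumed, simplified rule).
    """
    for current, nxt in zip(chain, chain[1:]):
        if current not in files:
            return f"{current} is missing"
        if nxt not in files[current]:
            return f"{current} does not mention {nxt}"
    if chain[-1] not in files:
        return f"{chain[-1]} is missing"
    return None
```

Under this rule an empty `SECURITY.md`, as it was before this PR, breaks the chain at the second hop, which is the gap the wiring is meant to close. Typical use would be `first_broken_link(load_repo_files("."))` from a repository checkout.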
What's in this PR:
- `THREAT_MODEL.md` (new): a v0 security threat model written from Zeppelin's public docs + codebase, following the threat-model-producer rubric. Every claim carries a provenance tag: (documented) (from your docs/site) or (inferred) (our guess from code/docs, for you to confirm / correct / strike). Draft confidence ~18 documented / 24 inferred.
- `SECURITY.md` (was an empty file): disclosure pointer + link to the threat model.
- `AGENTS.md`: a `## Security` section so the `AGENTS.md → SECURITY.md → THREAT_MODEL.md` chain resolves for automated tooling. The existing developer guidance is unchanged.

The framing to sanity-check first: Apache Zeppelin runs user notebook code by design, so RBAC (Shiro + notebook ACL + URL ACL + impersonation) is the boundary, not a sandbox; a `%sh` command from a run-capable user is the product working, not RCE. The model treats interpreter execution as in-scope only when it crosses an authn/authz or tenant boundary.

What we'd need from the PMC:

VALID), or a dev-convenience operators are expected to change (OUT-OF-MODEL: non-default-build)? This reshapes the whole model.

If you'd rather own the drafting yourselves, close the PR and we'll wait; it's entirely your call.