Add security threat model and wire AGENTS.md -> SECURITY.md -> THREAT_MODEL.md #5268

Merged: jongyoul merged 1 commit into apache:master from potiuk:asf-security/threat-model-2026-06-05 on Jun 12, 2026

potiuk (Member) commented Jun 5, 2026

This is a v0 draft proposal for the Zeppelin PMC to review — please correct, reject, or discuss as needed. The maintainer is the decision-maker; nothing here is a requirement. The threat model does not need to be "finished" for anything downstream — it just makes automated security review (and triage of inbound reports) far less noisy.

Context. The ASF Security team is preparing the project for an automated agentic security scan we're piloting. Those scans run against a threat model that tells the scanner what's in scope, what's by-design, and what counts as a real finding — without one, the output buries maintainers in noise. This PR proposes the discoverable model plus the wiring the scanner needs.

What's in this PR:

  • THREAT_MODEL.md (new) — a v0 security threat model written from Zeppelin's public docs + codebase, following the threat-model-producer rubric. Every claim carries a provenance tag: (documented) (from your docs/site) or (inferred) (our guess from code/docs, for you to confirm / correct / strike). Draft confidence ~18 documented / 24 inferred.
  • SECURITY.md (was an empty file) — disclosure pointer + link to the threat model.
  • AGENTS.md — a ## Security section so the AGENTS.md → SECURITY.md → THREAT_MODEL.md chain resolves for automated tooling. The existing developer guidance is unchanged.
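
The discoverability chain above can be checked mechanically. The following is a minimal sketch, assuming that a plain mention of the next file name counts as a link; the helper names `check_chain` and `load_files` are hypothetical and not part of this PR or of any scanner.

```python
from pathlib import Path

# The chain named in this PR: each file is expected to mention the next one.
CHAIN = ("AGENTS.md", "SECURITY.md", "THREAT_MODEL.md")


def check_chain(files: dict[str, str], chain: tuple[str, ...] = CHAIN) -> list[str]:
    """Return a list of problems; an empty list means the chain resolves.

    `files` maps a file name to its text content (hypothetical input shape).
    """
    problems = []
    # Every file in the chain must exist.
    for name in chain:
        if name not in files:
            problems.append(f"{name} is missing")
    # Assumption: mentioning the next file's name anywhere counts as a pointer.
    for current, nxt in zip(chain, chain[1:]):
        if current in files and nxt not in files[current]:
            problems.append(f"{current} does not mention {nxt}")
    return problems


def load_files(repo_root: Path, chain: tuple[str, ...] = CHAIN) -> dict[str, str]:
    """Read whichever chain files exist at the repository root."""
    return {
        name: (repo_root / name).read_text(encoding="utf-8")
        for name in chain
        if (repo_root / name).is_file()
    }
```

Running `check_chain(load_files(Path(".")))` from a checkout root would return an empty list when all three files exist and each one points at the next.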

The framing to sanity-check first: Apache Zeppelin runs user notebook code by design, so RBAC (Shiro + notebook ACL + URL ACL + impersonation) is the boundary, not a sandbox — a %sh command from a run-capable user is the product working, not RCE. The model treats interpreter execution as in-scope only when it crosses an authn/authz or tenant boundary.
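
To make the scoping rule concrete, here is a rough sketch of how a triage step might apply it. It assumes an instance with Shiro authentication enabled, and the `ExecutionReport` fields are hypothetical simplifications, not names from Zeppelin or from the threat model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutionReport:
    """A report claiming that interpreter code was executed (hypothetical shape)."""
    actor_authenticated: bool  # the actor passed authentication
    actor_may_run: bool        # the notebook ACL grants the actor run permission
    crosses_tenant: bool       # the effect reaches another tenant's data or identity


def is_in_scope(report: ExecutionReport) -> bool:
    """Interpreter execution is in scope only when a boundary is crossed."""
    crosses_authn = not report.actor_authenticated
    crosses_authz = not report.actor_may_run
    return crosses_authn or crosses_authz or report.crosses_tenant


# A %sh command from an authenticated, run-capable user is the product working.
by_design = ExecutionReport(actor_authenticated=True, actor_may_run=True, crosses_tenant=False)
print(is_in_scope(by_design))  # False
```

The sketch deliberately ignores the insecure-defaults question raised in §14, which decides how reports against an anonymous instance are treated.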

What we'd need from the PMC:

  1. §14 wave 1 (the important one): rule on the insecure defaults — is anonymous-by-default / public-notebooks / impersonation-off the supported production posture (a report against it is VALID), or a dev-convenience operators are expected to change (OUT-OF-MODEL: non-default-build)? This reshapes the whole model.
  2. Walk the §14 questions (waves 1–3) — a one-line confirm / correct / strike per question is enough; each (inferred) tag becomes (maintainer) as you answer.
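
As an illustration of how the provenance tags could be tracked while the questions are answered, here is a small sketch. It assumes the tags appear inline exactly as `(documented)`, `(inferred)`, and `(maintainer)`; the function names are hypothetical.

```python
import re
from collections import Counter

# Assumption: tags are written inline, in parentheses, exactly as named here.
TAG_PATTERN = re.compile(r"\((documented|inferred|maintainer)\)")


def count_tags(text: str) -> Counter:
    """Count how many claims carry each provenance tag."""
    return Counter(TAG_PATTERN.findall(text))


def promote(text: str) -> str:
    """Rewrite (inferred) to (maintainer) in a claim the maintainer confirmed."""
    return text.replace("(inferred)", "(maintainer)")
```

Applying `promote` only to the lines a maintainer has confirmed keeps the remaining `(inferred)` count as a measure of what is still open.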

If you'd rather own the drafting yourselves, close the PR and we'll wait — entirely your call.

…_MODEL.md

Adds a v0 security threat model (THREAT_MODEL.md), fills the previously-empty
SECURITY.md with a disclosure pointer to it, and adds a Security section to
AGENTS.md so the AGENTS.md -> SECURITY.md -> THREAT_MODEL.md discoverability
chain resolves for automated tooling. The threat model is a provenance-tagged
v0 draft for the PMC to review (see the open questions in its section 14). No
existing developer guidance in AGENTS.md is changed.

Generated-by: Claude Code (Claude Opus 4.8)

jongyoul (Member) left a comment

Thanks — the framing is right (RBAC is the boundary, not a sandbox; a %sh from a run-capable user is the product working, not RCE). Please keep it.

I left inline comments answering the §14 questions.

Comment thread THREAT_MODEL.md
1. **Anonymous default.** Proposed: anonymous/no-`shiro.ini` is a *dev-
convenience*; the supported production posture requires Shiro **or** a
trusted isolated network. So reports against an internet-exposed anonymous
instance are `OUT-OF-MODEL: non-default-build`. Correct? (→ §5a, §3, §11a)

Member

Yes, it's correct.

Comment thread THREAT_MODEL.md
2. **`notebook.public=true` default.** Proposed: public-by-default is intended
convenience; operators needing isolation set it false. A "any user can read
an empty-ACL note" report is by-design, not a bug. Correct? (→ §5a, §2)

Member

Yes, it's correct. It's by-design.

Comment thread THREAT_MODEL.md
3. **Impersonation off by default.** Proposed: without impersonation, all
interpreter code legitimately runs as the **server** OS user; this is the
documented default and not a vulnerability; multi-tenant OS isolation
requires enabling impersonation. Correct? (→ §3, §5a, §9, §11a)

Member

Yes, it's correct.

Comment thread THREAT_MODEL.md
4. **Binding mode as boundary.** Proposed: shared/scoped/isolated are
stability/resource controls, **not** security sandboxes; we should state
that explicitly in §9. Agree? Which is the default? (→ §5a, §9)

Member

The default is shared mode, and it's not a security sandbox.

Comment thread THREAT_MODEL.md
**Wave 2 — properties & enforcement:**
5. **URL ACL default.** Are `/interpreter`, `/credential`, `/configurations`
open to any authenticated role unless `[urls]` restricts them, or is there a
built-in admin gate? (→ §5a, §8)

Member

No built-in admin gate. It relies on shiro.ini.

Comment thread THREAT_MODEL.md
ops that check only client-side? (→ §6, §8)
7. **Credential isolation.** Does the credential store guarantee one user
cannot read another user's injected credentials, including via a shared
interpreter process? (→ §8, §9)

Member

Yes, it does.

Comment thread THREAT_MODEL.md

**Wave 3 — surfaces & limits:**
8. **First-class interpreters.** Which interpreters/modules are supported for
security purposes vs. community/unsupported (→ §2/§3 carve-out)?

Member

All interpreters are supported for security purposes.

Comment thread THREAT_MODEL.md
9. **Resource limits.** Any limits on paragraph/result size, websocket rate, or
concurrent interpreter launches? Where's the line between in-model pre-auth
exhaustion and by-design expensive queries? (→ §6, §8)

Member

VALID-HARDENING. Please surface. No rate limit or concurrent-launch cap today. This is an area we'd genuinely like the scan to flag and recommend improvements for.

Comment thread THREAT_MODEL.md
Comment on lines +373 to +374
10. **Web-UI hardening.** Does enabling `http_security_headers` give CSRF + XSS
+ clickjacking coverage, or are those partly the operator's job? (→ §9)

Member

VALID-HARDENING. Please surface. No CSP, and CSRF is Origin-based only. We'd welcome concrete improvements from the scan.

Comment thread THREAT_MODEL.md
11. **Coexistence.** This is a new `THREAT_MODEL.md`; `SECURITY.md` (currently a
stub) should point at it as canonical, and the website security pages stay
the operator how-to. Agree? (→ meta)

Member

Agreed.

potiuk (Member, Author) commented Jun 12, 2026

Friendly nudge — this threat-model / discoverability PR is approved and green, so it's ready to merge whenever the PMC has a moment. Merging completes the pre-flight discoverability step (AGENTS.md → SECURITY.md → THREAT_MODEL.md) so an automated security-scan agent can mechanically find the model. No rush — thanks!

jongyoul merged commit 78255fd into apache:master on Jun 12, 2026 (19 checks passed)
jongyoul (Member) commented

Merged into master (78255fd).

jongyoul (Member) commented

@potiuk Thank you, Jarek. I merged this PR.
