feat(gallery): verify backend OCI images with keyless cosign#9823

Open: richiejp wants to merge 2 commits into master from chore/sign-backend-images

@richiejp
Collaborator

Unlike models, backend contents are not currently verified. If we hashed
them the way we hash models, we would have to keep updating the gallery
YAML. So this uses signing instead.

However, this does not turn on enforcement yet: we first need to populate
the registry with signed images, and then enforce verification in a later
release.

Notes for Reviewers

Close a trust gap where a registry compromise or MITM could silently
replace a backend image: the gallery YAML tells LocalAI which image to
pull, but until now nothing verified the bytes came from our CI.

Consumer (pkg/oci/cosignverify):

  • New package using sigstore-go to verify keyless-cosign signatures.
  • OCI 1.1 referrers API + new bundle format (no legacy :tag.sig).
  • Policy fields: Issuer / IssuerRegex / Identity / IdentityRegex /
    NotBefore. NotBefore is the revocation lever — keyless Fulcio certs
    are ephemeral so revocation is policy-side; advancing not_before in
    the gallery YAML invalidates every signature predating the cutoff.
  • TUF trusted root cached process-wide so N backends from one gallery
    do 1 fetch, not N.
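
To make the policy fields concrete, here is a minimal sketch of how such a check could look. Only the field names come from the description above; the `SignerInfo` type, the `Check` method, the `day` helper, and the example identity are hypothetical, and the real package delegates the cryptographic work to sigstore-go rather than comparing strings by hand.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"time"
)

// Policy uses the field names from the PR description. The layout and the
// logic below are illustrative assumptions, not the real cosignverify code.
type Policy struct {
	Issuer        string
	IssuerRegex   string
	Identity      string
	IdentityRegex string
	NotBefore     time.Time // zero value means "no cutoff"
}

// SignerInfo is a hypothetical summary of an already-verified signature:
// the OIDC issuer, the certificate identity, and when it was signed.
type SignerInfo struct {
	Issuer   string
	Identity string
	SignedAt time.Time
}

// matches compares value against an exact string if one is set, otherwise
// against a regex. A policy with neither is treated as a configuration error.
func matches(value, exact, pattern string) (bool, error) {
	switch {
	case exact != "":
		return value == exact, nil
	case pattern != "":
		re, err := regexp.Compile(pattern)
		if err != nil {
			return false, err
		}
		return re.MatchString(value), nil
	default:
		return false, errors.New("neither exact value nor regex configured")
	}
}

// Check rejects signers outside the policy and signatures older than NotBefore.
func (p Policy) Check(s SignerInfo) error {
	ok, err := matches(s.Issuer, p.Issuer, p.IssuerRegex)
	if err != nil {
		return fmt.Errorf("issuer policy: %w", err)
	}
	if !ok {
		return fmt.Errorf("issuer %q is not allowed", s.Issuer)
	}
	ok, err = matches(s.Identity, p.Identity, p.IdentityRegex)
	if err != nil {
		return fmt.Errorf("identity policy: %w", err)
	}
	if !ok {
		return fmt.Errorf("identity %q is not allowed", s.Identity)
	}
	if !p.NotBefore.IsZero() && s.SignedAt.Before(p.NotBefore) {
		return fmt.Errorf("signature time %s predates not_before %s",
			s.SignedAt.Format(time.RFC3339), p.NotBefore.Format(time.RFC3339))
	}
	return nil
}

// day is a small helper that builds a UTC midnight timestamp.
func day(y int, m time.Month, d int) time.Time {
	return time.Date(y, m, d, 0, 0, 0, 0, time.UTC)
}

func main() {
	p := Policy{
		Issuer:        "https://token.actions.githubusercontent.com",
		IdentityRegex: `^https://github\.com/example-org/example-repo/`, // hypothetical repo
		NotBefore:     day(2026, time.May, 1),
	}
	s := SignerInfo{
		Issuer:   "https://token.actions.githubusercontent.com",
		Identity: "https://github.com/example-org/example-repo/.github/workflows/build.yml@refs/heads/master",
		SignedAt: day(2026, time.May, 10),
	}
	fmt.Println("fresh signature:", p.Check(s))
	s.SignedAt = day(2026, time.April, 1) // predates the cutoff
	fmt.Println("old signature:", p.Check(s))
}
```

Advancing `NotBefore` past a suspected compromise date rejects every earlier signature without touching the signatures themselves, which is why it can serve as the revocation lever.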

Plumbing:

  • pkg/downloader: ImageVerifier interface + WithImageVerifier option
    threaded through DownloadFileWithContext. Verification runs between
    oci.GetImage and oci.ExtractOCIImage, with digest pinning via
    pinnedImageRef to close the TOCTOU window. Skips the verifier's HEAD
    when the ref is already digest-pinned.
  • core/config: Gallery.Verification YAML block.
  • core/gallery: backendDownloadOptions builds the verifier from the
    policy; applied on initial URI, mirrors, and tag fallbacks.
  • core/gallery/upgrade: the upgrade path now routes through the same
    options builder. A regression Ginkgo spec pins this contract —
    without it, UpgradeBackend silently bypassed verification.
  • core/cli: --require-backend-integrity (LOCALAI_REQUIRE_BACKEND_INTEGRITY)
    escalates missing policy / empty SHA256 from warn to hard-fail.
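
The ordering described for pkg/downloader (fetch, pin to digest, verify, then extract) can be sketched as follows. `ImageVerifier`, `WithImageVerifier`, and `pinnedImageRef` are names from the description, but their signatures here are guesses; `download`, `recordingVerifier`, `demo`, and the `fetch`/`extract` stand-ins for `oci.GetImage` and `oci.ExtractOCIImage` are hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"strings"
)

// ImageVerifier is named in the PR; this method set is an assumption.
type ImageVerifier interface {
	Verify(ctx context.Context, pinnedRef string) error
}

type options struct{ verifier ImageVerifier }

// Option is a functional option for the hypothetical download function.
type Option func(*options)

// WithImageVerifier installs a verifier; without it, no verification runs.
func WithImageVerifier(v ImageVerifier) Option {
	return func(o *options) { o.verifier = v }
}

// pinnedImageRef rewrites repo:tag to repo@digest so that the verifier and
// the extractor both act on the exact manifest that was fetched. A ref that
// is already digest-pinned is returned unchanged.
func pinnedImageRef(ref, digest string) string {
	if strings.Contains(ref, "@sha256:") {
		return ref
	}
	name := ref
	// Only treat a colon after the last slash as a tag separator, so that
	// registry ports such as localhost:5000/backend are left alone.
	if i := strings.LastIndex(ref, ":"); i > strings.LastIndex(ref, "/") {
		name = ref[:i]
	}
	return name + "@" + digest
}

// download resolves the ref, verifies the pinned ref, and only then extracts.
func download(ctx context.Context, ref string,
	fetch func(string) (string, error), extract func(string) error, opts ...Option) error {
	var o options
	for _, opt := range opts {
		opt(&o)
	}
	digest, err := fetch(ref)
	if err != nil {
		return err
	}
	pinned := pinnedImageRef(ref, digest)
	if o.verifier != nil {
		if err := o.verifier.Verify(ctx, pinned); err != nil {
			return fmt.Errorf("verify %s: %w", pinned, err)
		}
	}
	return extract(pinned)
}

var errUnsigned = errors.New("no matching signature")

// recordingVerifier remembers the ref it was asked about and returns err.
type recordingVerifier struct {
	seen string
	err  error
}

func (r *recordingVerifier) Verify(_ context.Context, ref string) error {
	r.seen = ref
	return r.err
}

// demo runs download against stubbed fetch and extract steps.
func demo(v ImageVerifier) (extracted string, err error) {
	fetch := func(string) (string, error) { return "sha256:abc123", nil }
	extract := func(ref string) error { extracted = ref; return nil }
	err = download(context.Background(), "quay.io/example/backend:latest",
		fetch, extract, WithImageVerifier(v))
	return extracted, err
}

func main() {
	got, err := demo(&recordingVerifier{})
	fmt.Println(got, err)
	got, err = demo(&recordingVerifier{err: errUnsigned})
	fmt.Println(got, err)
}
```

Because the verifier and the extractor both receive the same digest-pinned reference, a tag that is repointed between the two steps cannot swap in different content.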

Producer (.github/workflows/backend_merge.yml):

  • id-token: write at job scope (PR-fork-safe via existing event gate).
  • sigstore/cosign-installer@v3 pinned to v2.4.1.
  • After each docker buildx imagetools create, resolve the manifest
    list digest and run cosign sign --recursive --new-bundle-format
    --registry-referrers-mode=oci-1-1 against repo@digest. --recursive
    signs the index and every per-arch entry, matching how the consumer
    resolves a tag to a platform-specific manifest before verifying.

Rollout: backend/index.yaml has no verification: block yet, so this
PR is backward-compatible — installs proceed with a warning until the
gallery is populated. Strict mode is opt-in.
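
As a rough illustration of the rollout behaviour, the warn-versus-fail decision could be expressed like this. The function name, error value, and log wording are hypothetical; only the behaviour (warn by default, hard-fail in strict mode) is taken from the description.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// ErrNoIntegrityCheck is a hypothetical sentinel for installs that have
// neither a verification policy nor a checksum to check against.
var ErrNoIntegrityCheck = errors.New("backend cannot be integrity-checked")

// checkIntegrityPolicy decides what happens when a backend has no usable
// integrity information: a warning by default, an error in strict mode.
func checkIntegrityPolicy(backend string, hasCheck, strict bool) error {
	if hasCheck {
		return nil
	}
	if strict {
		return fmt.Errorf("%s: %w", backend, ErrNoIntegrityCheck)
	}
	log.Printf("warning: installing %s without integrity verification", backend)
	return nil
}

func main() {
	fmt.Println(checkIntegrityPolicy("example-backend", false, false))
	fmt.Println(checkIntegrityPolicy("example-backend", false, true))
}
```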

Signed commits

  • Yes, I signed my commits.

Assisted-by: claude-code:claude-opus-4-7 [Bash] [Edit] [Read] [Write] [WebSearch] [WebFetch]
Signed-off-by: Richard Palethorpe <io@richiejp.com>
@richiejp force-pushed the chore/sign-backend-images branch from cfbf190 to ef2d419 on May 14, 2026 20:19
Comment thread core/gallery/backends.go Outdated

// strictBackendIntegrity reports whether the runtime is configured to refuse
// backend installs that cannot be integrity-checked.
func strictBackendIntegrity() bool {
Owner

I'm not sure what I'm missing here: why are we reading the environment variable directly here instead of propagating it as an option?

Comment thread core/gallery/backends.go Outdated
switch {
case uri.LooksLikeOCI():
	if !hasVerification {
		if strictBackendIntegrity() {
Owner

Is this running in a separate process? If not, we can avoid reading the env here.

Collaborator Author

Yup, sorry, that's just slop. I've added a lint check that prevents env vars from being randomly accessed.

…ad of env

The previous implementation re-exported the --require-backend-integrity
CLI flag into LOCALAI_REQUIRE_BACKEND_INTEGRITY via os.Setenv, then
re-read it in core/gallery via os.Getenv. This leaked process state
into the gallery package and made the flag impossible to override
per-call or test without touching the env.

Add RequireBackendIntegrity to ApplicationConfig (with a matching
WithRequireBackendIntegrity AppOption) and thread the bool through
every install/upgrade path: InstallBackend, InstallBackendFromGallery,
UpgradeBackend, InstallModelFromGallery, InstallExternalBackend,
ApplyGalleryFromString/File, startup.InstallModels. Worker subcommands
gain the same env-bound flag on WorkerFlags so distributed-worker
installs honor it consistently with the worker daemon path.
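
A minimal sketch of the option-based plumbing, assuming a conventional functional-options shape. `ApplicationConfig`, `RequireBackendIntegrity`, `AppOption`, and `WithRequireBackendIntegrity` are named in the commit message; the constructor, the option's bool parameter, and `installBackend` are assumptions made for illustration.

```go
package main

import "fmt"

// ApplicationConfig carries the flag as plain state instead of an env var.
type ApplicationConfig struct {
	RequireBackendIntegrity bool
}

// AppOption mutates an ApplicationConfig during construction.
type AppOption func(*ApplicationConfig)

// WithRequireBackendIntegrity sets strict mode. Taking a bool is an
// assumption; the real option may have a different signature.
func WithRequireBackendIntegrity(v bool) AppOption {
	return func(c *ApplicationConfig) { c.RequireBackendIntegrity = v }
}

// NewApplicationConfig applies options over the zero-value defaults.
func NewApplicationConfig(opts ...AppOption) *ApplicationConfig {
	c := &ApplicationConfig{}
	for _, opt := range opts {
		opt(c)
	}
	return c
}

// installBackend is a hypothetical install path. It receives the flag as an
// argument, so callers and tests can vary it per call without touching the
// process environment.
func installBackend(name string, requireIntegrity bool) string {
	if requireIntegrity {
		return name + ": strict integrity checks"
	}
	return name + ": warn-only integrity checks"
}

func main() {
	cfg := NewApplicationConfig(WithRequireBackendIntegrity(true))
	fmt.Println(installBackend("example-backend", cfg.RequireBackendIntegrity))
}
```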

Add a forbidigo lint rule against os.Getenv / os.LookupEnv / os.Environ
to keep the env-leak pattern from creeping back. Existing offenders
(p2p, config loaders, etc.) are baseline-grandfathered by the existing
new-from-merge-base: origin/master setting; targeted path exclusions
cover the legitimate cases — kong CLI entry points, backend
subprocesses, system capability probes, gRPC AUTH_TOKEN inheritance,
test gating env vars.

Assisted-by: claude-code:claude-opus-4-7
Signed-off-by: Richard Palethorpe <io@richiejp.com>