fix(build-push-docker-manifest): idempotent manifest + cosign reruns (RANE-4683)#1577

Draft
HashWrangler wants to merge 8 commits into main from fix/cosign-verify-retry

@HashWrangler HashWrangler commented Jun 17, 2026

Summary

Single holistic hardening pass for build-publish manifest job flakes and non-idempotent reruns (RANE-4683).

| Layer | Problem | Fix |
| --- | --- | --- |
| Manifest create | Rerun calls imagetools create again → new index digest (digest drift) | Skip create when tag already points at the same platform digests; fail if tag exists with different digests |
| Cosign sign | Rerun re-signs unnecessarily | Skip sign when cosign verify already succeeds |
| Cosign verify | First-run flake: no signatures found right after sign | 5×10s retry loop (Sigstore propagation lag) |
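
The create guard in the table boils down to a three-way decision. The sketch below is illustrative only: the action itself lives in action.yml, and Python is used here just to keep the logic testable. The names `Action` and `decide_manifest_action`, and the convention that `None` means "tag does not exist", are assumptions made for this example, not identifiers from the PR.

```python
from enum import Enum


class Action(Enum):
    CREATE = "create"  # tag does not exist yet
    SKIP = "skip"      # tag already references the expected platform digests
    FAIL = "fail"      # tag exists but references something else


def decide_manifest_action(expected, existing):
    """Decide what a (re)run should do for one manifest tag.

    expected: iterable of platform digests produced by this build.
    existing: iterable of platform digests the tag currently references,
              or None when the tag does not exist (hypothetical convention).
    """
    expected_set = set(expected)
    if not expected_set:
        raise ValueError("expected digest list is empty")
    if existing is None:
        return Action.CREATE
    existing_set = set(existing)
    if not existing_set:
        # An existing tag with no extractable digests is treated as a
        # conflict rather than falling through to a fresh create.
        return Action.FAIL
    # Order does not matter: digest lists are compared as sets.
    return Action.SKIP if existing_set == expected_set else Action.FAIL
```

Comparing sets rather than ordered lists means a rerun that receives the same platform digests in a different order still resolves to a skip instead of a spurious conflict.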

Parent epic: RANE-4695 private release improvements.

Why one PR

All three changes live in build-push-docker-manifest and address the same operational failure mode: manifest jobs that fail intermittently and only pass on rerun. Splitting would ship partial behavior (e.g. verify retry without idempotent create still drifts digest on rerun).

Evidence

  • v2.43.0 / SECHD-30708 — non-idempotent reruns
  • chainlink + canary v2.990.5-beta.0 — cosign verify flake; rerun passed
  • 150-run scan — rare manifest failures (~1–3%), often recoverable on rerun

Out of scope (separate PR)

  • chainlink#22877 — Slack only posts when both core + ccip manifests succeed (RANE-4695 UX guard)

Test plan

  • Changesets publish minor build-push-docker-manifest release
  • Confirm reusable-docker-build-publish @build-push-docker-manifest/v1 picks up release
  • Tag build on chainlink/canary: first run green
  • Re-run failed manifest job only: digest unchanged, sign skipped, verify passes
  • Simulate digest mismatch on existing tag → job fails with explicit error

@HashWrangler HashWrangler requested a review from a team as a code owner June 17, 2026 16:50
@github-actions

👋 HashWrangler, thanks for creating this pull request!

To help reviewers, please consider creating future PRs as drafts first. This allows you to self-review and make any final changes before notifying the team.

Once you're ready, you can mark it as "Ready for review" to request feedback. Thanks!

@HashWrangler HashWrangler marked this pull request as draft June 17, 2026 16:50
@HashWrangler HashWrangler changed the title fix(build-push-docker-manifest): retry cosign verify after sign fix(build-push-docker-manifest): idempotent manifest + cosign reruns (RANE-4683) Jun 17, 2026
…(RANE-4683)

- Skip imagetools create when the tag already references the expected
  platform digests; fail if the tag exists with different digests
- Skip cosign sign when verify already succeeds (idempotent rerun)
- Retry cosign verify after sign for Sigstore propagation flakes

Together these changes make build-publish manifest jobs safe to rerun without
digest drift or redundant signing, while still failing loudly on real conflicts.
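
The skip-sign behavior described in this commit message can be pictured as a verify-before-sign guard. This is a minimal sketch under stated assumptions: `sign_if_needed`, `verify`, and `sign` are hypothetical names, and in the real action the two callables would presumably wrap `cosign verify` and `cosign sign`.

```python
def sign_if_needed(verify, sign):
    """Return True if a signature was created, False if signing was skipped.

    verify: callable returning True when the image already verifies.
    sign:   callable that signs the image.
    Both are injected so the ordering can be exercised without cosign.
    """
    if verify():
        # A valid signature already exists (e.g. from a previous run):
        # signing again would be redundant.
        return False
    sign()
    return True
```

Injecting the two operations keeps the ordering explicit: verification always runs first, and `sign` is reached only when it fails.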

Copilot AI left a comment

Pull request overview

This PR hardens the build-push-docker-manifest composite action to make build-publish reruns idempotent and reduce flakiness around Cosign verification/signature propagation.

Changes:

  • Adds an idempotency guard for manifest creation by comparing expected vs existing platform digests and skipping imagetools create when they match.
  • Skips cosign sign when an existing valid signature is already present, and adds retry logic to cosign verify to absorb Sigstore propagation lag.
  • Adds a changeset to release these hardening updates as a minor version bump.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| actions/build-push-docker-manifest/action.yml | Adds digest comparison to avoid manifest digest drift on reruns; makes cosign signing/verification rerun-safe with skip + retry behavior. |
| .changeset/rane-4683-manifest-cosign-hardening.md | Declares a minor release for the manifest + cosign hardening changes. |

Comment thread actions/build-push-docker-manifest/action.yml
Comment thread actions/build-push-docker-manifest/action.yml
Comment thread actions/build-push-docker-manifest/action.yml
Comment thread actions/build-push-docker-manifest/action.yml Outdated
- Ensure jq is installed before digest comparison
- Validate docker-image-name-digests is non-empty
- Only skip cosign sign when OIDC identity constraints verify
- Fail fast on unexpected imagetools inspect errors (not just missing tag)

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comment thread actions/build-push-docker-manifest/action.yml Outdated
Comment thread actions/build-push-docker-manifest/action.yml Outdated
Keep MANIFEST_CREATE_SKIPPED expression on one line and only sleep between
cosign verify retries, not after the final failed attempt.
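
The "sleep only between retries" point can be sketched as a small helper. The defaults of 5 attempts and 10 seconds mirror the 5×10s loop from the PR summary; the function name `retry` and the injectable `sleep` parameter are assumptions made so the example can be tested without waiting.

```python
import time


def retry(check, attempts=5, delay_seconds=10.0, sleep=time.sleep):
    """Call check() up to `attempts` times; return True on first success.

    Sleeps only between attempts, so a final failure returns immediately
    instead of waiting one more delay for nothing.
    """
    for attempt in range(1, attempts + 1):
        if check():
            return True
        if attempt < attempts:
            sleep(delay_seconds)
    return False
```

With these defaults a run that never succeeds sleeps four times, not five, so a genuine failure surfaces 10 seconds sooner.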

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comment thread actions/build-push-docker-manifest/action.yml Outdated
Comment thread actions/build-push-docker-manifest/action.yml Outdated
- Fail fast when jq is missing and apt-get/sudo are unavailable
- Shorten step/output ids so summary env expressions stay on one line
  under Prettier (avoids split ${{ }} expressions)

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comment thread actions/build-push-docker-manifest/action.yml Outdated
Comment thread actions/build-push-docker-manifest/action.yml Outdated
…empotency

Treat digest lists as sets: sort -u in normalization and existing-manifest
inspection, and reuse normalized digests when building imagetools create args.
- Only sleep between manifest digest inspect retries, not after the final fail
- Fail fast when an existing tag has no extractable platform digests instead
  of falling through to imagetools create (digest drift on rerun)

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comment thread actions/build-push-docker-manifest/action.yml
Reject non-sha256 tokens in normalize_digest_csv so bad
docker-image-name-digests input fails fast with a clear error instead
of deferring to imagetools create.
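
A rough Python analogue of the normalization described in the last two commits (set semantics plus rejection of non-sha256 tokens) might look like the following. Only the name `normalize_digest_csv` comes from the commit message; the assumed input format (comma-separated tokens of the form `sha256:` followed by 64 lowercase hex characters) and the error wording are guesses for illustration.

```python
import re

# Assumption: each token is a bare digest such as "sha256:<64 hex chars>".
_SHA256 = re.compile(r"sha256:[0-9a-f]{64}")


def normalize_digest_csv(csv):
    """Return the digests in `csv` as a sorted, de-duplicated list.

    Raises ValueError on empty input or on any token that is not a
    sha256 digest, so bad input fails before any manifest is created.
    """
    tokens = [t.strip() for t in csv.split(",") if t.strip()]
    if not tokens:
        raise ValueError("digest list is empty")
    bad = [t for t in tokens if not _SHA256.fullmatch(t)]
    if bad:
        raise ValueError(f"not a sha256 digest: {bad[0]!r}")
    # sorted(set(...)) plays the role of `sort -u` in a shell pipeline.
    return sorted(set(tokens))
```

Because the output is sorted and de-duplicated, two inputs that differ only in order or repetition normalize to the same list and can be compared directly.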
@HashWrangler HashWrangler requested a review from Copilot June 17, 2026 17:58

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

Comment thread actions/build-push-docker-manifest/action.yml Outdated

@erikburt erikburt left a comment

I wonder if it's better if we don't verify the signature in this action, but rather verify it in the reusable workflow? Thoughts? @chainchad

…ty guard

jq is preinstalled on GitHub-hosted ubuntu-24.04 runners; fail fast with
a clear error when missing instead of apt-get/sudo install logic.
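
The fail-fast check this commit describes amounts to a PATH lookup followed by an explicit error. A minimal sketch, assuming a hypothetical helper named `require_tool` (the actual step is presumably a shell check inside action.yml):

```python
import shutil


def require_tool(name, which=shutil.which):
    """Return the resolved path of `name`, or raise if it is not on PATH.

    `which` is injectable so the check can be tested without the tool.
    """
    path = which(name)
    if path is None:
        raise RuntimeError(f"{name} is required but was not found on PATH")
    return path
```

Calling `require_tool("jq")` before the digest comparison surfaces a missing dependency as one clear error rather than as a confusing failure later in the step.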