Skip to content

ci: pass changed files through env in release-notes reST check#11856

Open
camgrimsec wants to merge 1 commit into
deepset-ai:mainfrom
camgrimsec:fix-workflow-code-injection
Open

ci: pass changed files through env in release-notes reST check#11856
camgrimsec wants to merge 1 commit into
deepset-ai:mainfrom
camgrimsec:fix-workflow-code-injection

Conversation

@camgrimsec

Copy link
Copy Markdown
Contributor

The Check reStructuredText code formatting step in release_notes.yml uses shell: python and builds its file list by interpolating the tj-actions/changed-files output directly into a Python source literal:

files = "${{ steps.changed-files.outputs.all_changed_files }}".split()

At runtime the workflow expression is expanded by the Actions runner before Python starts, so the contents of that output become part of the Python source of the step. Because git allows filenames to contain a double-quote character as well as newlines and backslashes, a pull_request contributor can add a release-note file whose path breaks out of the string literal. GitHub renders the workflow using the head ref's copy of the file, so an attacker does not need write access to main to reach this code path.

The trigger is pull_request (not pull_request_target), so GITHUB_TOKEN is read-only for this repo and repository secrets are not attached to the job. The direct blast radius is limited to code execution in the release-notes job itself, but that job still runs on the deepset-owned runner pool and shares the ephemeral filesystem with the checked-out source tree, so injecting Python here is undesirable regardless.

The fix follows GitHub's documented secure workflows pattern: expose the changed-files output as an environment variable and read it from Python via os.environ, so the untrusted value never becomes part of the source string that the interpreter parses.

Same class of hardening as #11733 (persist-credentials: false) and #11777 (further dependency pinning + CodeQL), extended one step further into the workflow body.

Related Issues

  • fixes #issue-number

Proposed Changes:

How did you test it?

Notes for the reviewer

Checklist

  • I have read the contributors guidelines and the code of conduct.
  • I have updated the related issue with new insights and changes.
  • I have added unit tests and updated the docstrings.
  • I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test: and added ! in case the PR includes breaking changes.
  • I have documented my code.
  • I have added a release note file, following the contributors guidelines.
  • I have run pre-commit hooks and fixed any issue.

The Check reStructuredText code formatting step in release_notes.yml uses
shell: python and builds its file list by interpolating the
tj-actions/changed-files output directly into a Python source literal:

  files = "${{ steps.changed-files.outputs.all_changed_files }}".split()

At runtime the workflow expression is expanded by the Actions runner
before Python starts, so the contents of that output become part of the
Python source of the step. Because git allows filenames to contain a
double-quote character as well as newlines and backslashes, a
pull_request contributor can add a release-note file whose path breaks
out of the string literal. GitHub renders the workflow using the head
ref's copy of the file, so an attacker does not need write access to
main to reach this code path.

The trigger is pull_request (not pull_request_target), so GITHUB_TOKEN
is read-only for this repo and repository secrets are not attached to
the job. The direct blast radius is limited to code execution in the
release-notes job itself, but that job still runs on the deepset-owned
runner pool and shares the ephemeral filesystem with the checked-out
source tree, so injecting Python here is undesirable regardless.

The fix follows GitHub's documented secure workflows pattern: expose the
changed-files output as an environment variable and read it from Python
via os.environ, so the untrusted value never becomes part of the source
string that the interpreter parses.

Same class of hardening as deepset-ai#11733 (persist-credentials: false) and
deepset-ai#11777 (further dependency pinning + CodeQL), extended one step further
into the workflow body.
@camgrimsec camgrimsec requested a review from a team as a code owner July 3, 2026 04:08
@camgrimsec camgrimsec requested review from julian-risch and removed request for a team July 3, 2026 04:08
@vercel

vercel Bot commented Jul 3, 2026

Copy link
Copy Markdown

@camgrimsec is attempting to deploy a commit to the deepset Team on Vercel.

A member of the Team first needs to authorize it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant