feat(CLI): add --cache flag to phrase pull for conditional requests #1066

Open
Thibaut Etienne (tetienne) wants to merge 1 commit into phrase:main from tetienne:feat/pull-cache-etag

@tetienne Thibaut Etienne (tetienne) commented Mar 19, 2026

We rely heavily on phrase pull in our CI pipelines and have been frustrated by the slowness of repeated pulls. Every invocation downloads all locale files regardless of whether translations have changed, which is wasteful and frequently pushes us into rate limiting. This PR adds HTTP conditional request support to avoid re-downloading unchanged locales.

Summary

  • Add --cache flag to phrase pull that stores ETag/Last-Modified response headers locally and sends them as If-None-Match/If-Modified-Since on subsequent pulls
  • When the server returns 304 Not Modified, the locale file download is skipped entirely
  • Cache is stored at os.UserCacheDir()/phrase/download_cache.json (XDG-compliant)
  • Only supported in sync mode; --cache with --async prints a warning and ignores caching

Fixes phrase/phrase-cli#164

Usage

# First run: downloads all locales, stores ETags
phrase pull --cache

# Second run: sends ETags, skips unchanged locales (304)
phrase pull --cache

# Without flag: downloads everything as before (no regression)
phrase pull

Real-world results

Tested against a production project with 379 locale files across 3 Phrase projects:

First run (populates cache, downloads everything):

$ phrase pull --cache
Downloaded en to locales/en/tradfile.json
Downloaded fr to locales/fr/tradfile.json
Downloaded de to locales/de/tradfile.json
...
Downloaded en to locales/en/validation.json
Downloaded fr to i18n_locales/fr/tables.json
...
(379 files downloaded in ~39s)

Second run (sends ETags, all 304 Not Modified):

$ phrase pull --cache
Not modified locales/en/tradfile.json
Not modified locales/fr/tradfile.json
Not modified locales/de/tradfile.json
...
Not modified locales/en/validation.json
Not modified i18n_locales/fr/tables.json
...
(379 files skipped in ~33s)

Cache file (~/Library/Caches/phrase/download_cache.json):

{
  "version": 1,
  "entries": {
    "a1b2c3d4e5f6...": {
      "etag": "W/\"070896511a30e4295c7cd1825d0f49ee\"",
      "last_modified": "Wed, 18 Mar 2026 15:04:42 GMT"
    }
  }
}

| Mode | Time | Files | Bandwidth |
| --- | --- | --- | --- |
| phrase pull | 39s | 379 downloaded | Full |
| phrase pull --cache | 33s | 379 x 304 | Zero |

Wall-clock improvement is only ~15% (39s to 33s), but the main benefits are that no file bodies are transferred on a cache hit, rate limit pressure is reduced, and server load is lower. The remaining bottleneck is the 379 sequential HTTP round-trips; parallelizing downloads (out of scope for this PR) would amplify the time savings further.

GitHub Actions example

The --cache flag pairs well with actions/cache, which persists the ETag cache between CI runs so that scheduled pulls can skip unchanged locales and put less pressure on the API rate limit:

name: Pull translations

on:
  schedule:
    - cron: '0 */6 * * *'  # every 6 hours
  workflow_dispatch:

jobs:
  pull-translations:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install Phrase CLI
        run: |
          curl -fsSL https://raw.githubusercontent.com/phrase/phrase-cli/master/install.sh | bash

      - name: Restore Phrase cache
        uses: actions/cache@v4
        with:
          path: ~/.cache/phrase
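          # actions/cache skips saving on an exact key hit, so use a per-run key
          # and restore the most recent entry through the prefixes below.
          key: phrase-etags-${{ hashFiles('.phrase.yml') }}-${{ github.run_id }}
          restore-keys: |
            phrase-etags-${{ hashFiles('.phrase.yml') }}-
            phrase-etags-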

      - name: Pull translations
        env:
          PHRASEAPP_ACCESS_TOKEN: ${{ secrets.PHRASE_ACCESS_TOKEN }}
        run: phrase pull --cache

      - name: Check for changes
        id: diff
        run: |
          # git status also reports untracked files (new locales), which git diff ignores
          if [ -z "$(git status --porcelain)" ]; then
            echo "changed=false" >> "$GITHUB_OUTPUT"
          else
            echo "changed=true" >> "$GITHUB_OUTPUT"
          fi

      - name: Create PR if translations changed
        if: steps.diff.outputs.changed == 'true'
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
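        run: |
          # A commit identity is required on hosted runners before git commit will succeed
          git config user.name "github-actions[bot]"
          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
          git checkout -b chore/update-translations
          git add -A
          git commit -m "chore: update translations from Phrase"
          git push -u origin chore/update-translations
          gh pr create --title "chore: update translations" --body "Automated translation pull from Phrase"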

Design decisions

  • ETag over updated_since: The SDK's UpdatedSince parameter produces partial files (only keys updated after that date), not "skip if unchanged" semantics. ETags are the correct HTTP mechanism for conditional full-file downloads.
  • Cache key uses reflection: The key is built by walking the LocaleDownloadOpts struct via reflection and serializing every optional.* value that is set. New SDK fields therefore affect the key automatically, without code changes.
  • Dirty flag on cache: Save() short-circuits when no entries were modified, avoiding unnecessary writes on all-304 runs.
  • Sentinel error for 304: The SDK treats >= 300 as an error. We intercept 304 before checking err and return errNotModified so the caller can print "Not modified" instead of "Downloaded".
  • Rate-limit retry strips conditional headers: After waiting for rate limit reset, retry without If-None-Match/If-Modified-Since to ensure a full response.
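
To make the reflection-based cache key more concrete, here is a minimal sketch of the idea. It is not the PR's code: `optString`, `optBool` and `demoOpts` are hypothetical stand-ins for the SDK's optional.* types and LocaleDownloadOpts, and the key layout (sorted `name=value` lines hashed with SHA-256) is an assumption made for the example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"reflect"
	"sort"
	"strings"
)

// optString and optBool are stand-ins for the SDK's optional.* types
// (assumption: they expose IsSet() and Value()).
type optString struct {
	set bool
	val string
}

func (o optString) IsSet() bool   { return o.set }
func (o optString) Value() string { return o.val }

type optBool struct {
	set bool
	val bool
}

func (o optBool) IsSet() bool { return o.set }
func (o optBool) Value() bool { return o.val }

// demoOpts is a hypothetical, reduced stand-in for LocaleDownloadOpts.
type demoOpts struct {
	Branch                   optString
	FileFormat               optString
	IncludeEmptyTranslations optBool
}

// cacheKey hashes the project, the locale and every option that is set.
// opts must be a non-nil struct or pointer to a struct. Fields are found
// through reflection, so a field added to the struct later is picked up
// without changing this function.
func cacheKey(projectID, localeID string, opts any) string {
	parts := []string{"project=" + projectID, "locale=" + localeID}

	v := reflect.Indirect(reflect.ValueOf(opts))
	t := v.Type()
	for i := 0; i < v.NumField(); i++ {
		if !t.Field(i).IsExported() {
			continue
		}
		fv := v.Field(i)
		s, ok := fv.Interface().(interface{ IsSet() bool })
		if !ok || !s.IsSet() {
			continue // not an optional value, or left unset
		}
		m := fv.MethodByName("Value")
		if !m.IsValid() {
			continue
		}
		val := m.Call(nil)[0].Interface()
		parts = append(parts, fmt.Sprintf("%s=%v", t.Field(i).Name, val))
	}

	// Sort so the key does not depend on field declaration order.
	sort.Strings(parts)
	sum := sha256.Sum256([]byte(strings.Join(parts, "\n")))
	return hex.EncodeToString(sum[:])
}

func main() {
	base := demoOpts{FileFormat: optString{set: true, val: "json"}}
	withBranch := base
	withBranch.Branch = optString{set: true, val: "feature-x"}

	fmt.Println(cacheKey("proj", "en", base) == cacheKey("proj", "en", base))       // true
	fmt.Println(cacheKey("proj", "en", base) == cacheKey("proj", "en", withBranch)) // false
}
```

Keying on the set options matters because the same locale downloaded with a different branch or file format is a different resource, and reusing its ETag would produce false 304s.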

Test plan

  • go build ./... compiles, apart from a pre-existing, unrelated error in cmd/
  • go test ./cmd/internal/... passes (12 tests)
  • go vet ./cmd/internal/ clean
  • Manual test with real Phrase project (379 files): first pull downloads, second pull 304s
  • phrase pull without --cache behaves identically to before
  • phrase pull --cache --async warns and ignores cache

Store ETag and Last-Modified response headers locally and send them
as If-None-Match / If-Modified-Since on subsequent pulls. When the
server returns 304 Not Modified, the locale file is skipped.

Cache is stored at os.UserCacheDir()/phrase/download_cache.json.
Only supported in sync mode; --cache with --async prints a warning
and falls back to normal behavior.

Fixes phrase/phrase-cli#164
@tetienne Thibaut Etienne (tetienne) marked this pull request as ready for review March 19, 2026 13:26
@tetienne Thibaut Etienne (tetienne) changed the title feat(cli): add --cache flag to phrase pull for conditional requests feat(CLI): add --cache flag to phrase pull for conditional requests Mar 19, 2026
@forelabs (Member) commented

Thank you, Thibaut Etienne (@tetienne), for the suggested improvement. The team will review this PR within the next few business days.

Linked issue (may be closed when this pull request is merged): Caching support in Phrase CLI