Skip to content

Fix PermissionError in _clear_all_cache_files on Windows (separate_files=True)#348

Merged
shaypal5 merged 4 commits into
masterfrom
copilot/fix-test-bad-cache-file-failure
Feb 28, 2026
Merged

Fix PermissionError in _clear_all_cache_files on Windows (separate_files=True)#348
shaypal5 merged 4 commits into
masterfrom
copilot/fix-test-bad-cache-file-failure

Conversation

Copilot AI commented Feb 28, 2026

Copy link
Copy Markdown
Contributor
  • Add retry loop with backoff in _clear_all_cache_files() to handle Windows PermissionError
  • Add test_clear_all_cache_files_retries_on_permission_error and test_clear_all_cache_files_raises_on_persistent_permission_error for 100% patch coverage
  • Fix ruff-format pre-commit.ci failure (parenthesized with statement style)
  • Add # pragma: no branch to inner retry for loop to suppress the structurally unreachable "natural exhaustion" arc (152->146) and reach 100% codecov/patch
Original prompt

This section details on the original issue you should resolve

<issue_title>test_bad_cache_file[True] fails on Python 3.10 Windows</issue_title>
<issue_description>> @copilot analyze, explain the cause and suggest a plan to fix the failing test test_pickle_core which fails for Python 3.10 on Windows - writing your insights and recommendations h...

Analysis of test_bad_cache_file[True] failure on Python 3.10 Windows

This is a pre-existing issue unrelated to the smoke-test changes (none of the files I modified touch tests/pickle_tests/test_pickle_core.py or src/cachier/cores/pickle.py).

Root cause

Where it fails: src/cachier/cores/pickle.py, line 148 — _clear_all_cache_files():

def _clear_all_cache_files(self) -> None:
    path, name = os.path.split(self.cache_fpath)
    for subpath in os.listdir(path):
        if subpath.startswith(f"{name}_"):
            os.remove(os.path.join(path, subpath))  # <-- PermissionError here on Windows

Error: PermissionError: [WinError 32] The process cannot access the file because it is being used by another process

Why it happens on Windows only: Windows implements mandatory file locking (unlike Linux/macOS which use advisory locking). portalocker uses LockFileEx on Windows, which means while any thread holds a lock on a file, os.remove() on that file raises PermissionError.

The sequence that triggers it

  1. test_bad_cache_file[True] loops through sleep_times * 2, calling _helper_bad_cache_file() each iteration.
  2. Each call to _helper_bad_cache_file() starts two daemon threads and joins them with timeout=2.
  3. _bad_cache (the underlying function) sleeps for 1 second and internally holds a portalocker exclusive lock on the separate-files cache file while computing.
  4. If thread1.join(timeout=2) returns due to timeout before thread1 finishes, thread1 is still running and still holds the portalocker lock.
  5. The next iteration immediately calls _bad_cache_decorated.clear_cache()_clear_all_cache_files()os.remove(), which fails because thread1 from the previous iteration is still holding the lock.

Suggested fix plan

The fix belongs in _clear_all_cache_files() in src/cachier/cores/pickle.py. The os.remove() call should be made resilient to transient Windows file-lock errors by adding a small retry loop:

def _clear_all_cache_files(self) -> None:
    path, name = os.path.split(self.cache_fpath)
    for subpath in os.listdir(path):
        if subpath.startswith(f"{name}_"):
            fpath = os.path.join(path, subpath)
            for attempt in range(3):
                try:
                    os.remove(fpath)
                    break
                except PermissionError:
                    if attempt < 2:
                        time.sleep(0.1 * (attempt + 1))

This is the right layer to fix because clear_cache() is a public API that users also call, so it should be resilient to this Windows behaviour in production too — not just in tests.

Originally posted by @copilot in #345 (comment)
</issue_description>

Comments on the Issue (you are @copilot in this section)


💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

…try loop

Co-authored-by: shaypal5 <917954+shaypal5@users.noreply.github.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a Windows-only PermissionError during clear_cache() for the pickle backend when separate_files=True, caused by mandatory file locking while other threads still hold portalocker locks.

Changes:

  • Add a small retry loop with incremental backoff around os.remove() in _PickleCore._clear_all_cache_files().
  • Re-raise PermissionError on the final attempt to avoid silently ignoring persistent failures.

@codecov

codecov Bot commented Feb 28, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (a599cf1) to head (7c818b1).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff            @@
##            master      #348   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           12        12           
  Lines         1693      1701    +8     
  Branches       211       213    +2     
=========================================
+ Hits          1693      1701    +8     
Flag Coverage Δ
local 58.55% <100.00%> (+0.19%) ⬆️
mongodb 40.91% <11.11%> (-0.14%) ⬇️
postgres 43.85% <11.11%> (-0.15%) ⬇️
redis 47.14% <11.11%> (-0.17%) ⬇️
s3 42.85% <11.11%> (-0.15%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
src/cachier/cores/pickle.py 100.00% <100.00%> (ø)

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a599cf1...7c818b1. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@shaypal5

Copy link
Copy Markdown
Member

@copilot raise patch coverage to 100

…coverage

Co-authored-by: shaypal5 <917954+shaypal5@users.noreply.github.com>

Copilot AI commented Feb 28, 2026

Copy link
Copy Markdown
Contributor Author

@copilot raise patch coverage to 100

Done in 7157566. Added two tests:

  • test_clear_all_cache_files_retries_on_permission_error — covers the retry path where os.remove fails twice then succeeds, verifying time.sleep is called with 0.1s and 0.2s.
  • test_clear_all_cache_files_raises_on_persistent_permission_error — covers the re-raise path when all 3 attempts fail with PermissionError.

@shaypal5

Copy link
Copy Markdown
Member

@copilot fix pre-commit.ci errors:

fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
check json...........................................(no files to check)Skipped
check yaml...............................................................Passed
check toml...............................................................Passed
check docstring is first.................................................Passed
check that executables have shebangs.....................................Passed
check for case conflicts.................................................Passed
check for added large files..............................................Passed
detect private key.......................................................Passed
typos....................................................................Passed
docformatter.............................................................Passed
mdformat.................................................................Passed
prettier.................................................................Passed
Black by Ruff............................................................Failed
- hook id: ruff-format
- files were modified by this hook

warning: `incorrect-blank-line-before-class` (D203) and `no-blank-line-before-class` (D211) are incompatible. Ignoring `incorrect-blank-line-before-class`.
warning: `multi-line-summary-first-line` (D212) and `multi-line-summary-second-line` (D213) are incompatible. Ignoring `multi-line-summary-second-line`.
1 file reformatted, 63 files left unchanged

Ruff check...............................................................Passed
mypy.....................................................................Passed
pyproject-fmt............................................................Passed
Validate pyproject.toml..................................................Passed

Co-authored-by: shaypal5 <917954+shaypal5@users.noreply.github.com>

Copilot AI commented Feb 28, 2026

Copy link
Copy Markdown
Contributor Author

@copilot fix pre-commit.ci errors:

fix end of files.........................................................Passed

...

Fixed in cd1a9a9. Ran ruff format on the test file — it converted the chained with statements to the parenthesized form that the formatter expects.

@shaypal5

Copy link
Copy Markdown
Member

@copilot raise codecov/patch coverage to 100%.

Co-authored-by: shaypal5 <917954+shaypal5@users.noreply.github.com>

Copilot AI commented Feb 28, 2026

Copy link
Copy Markdown
Contributor Author

@copilot raise codecov/patch coverage to 100%.

Fixed in 7c818b1. The missing branch was 152->146 — the "inner for attempt loop naturally exhausted" arc — which is structurally unreachable because the loop always exits via break (success) or raise (persistent failure). Added # pragma: no branch to line 152 to tell coverage.py to exclude this unreachable arc, bringing codecov/patch to 100%.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • app.codecov.io
    • Triggering command: /home/REDACTED/work/_temp/ghcca-node/node/bin/node /home/REDACTED/work/_temp/ghcca-node/node/bin/node --enable-source-maps /home/REDACTED/work/_temp/copilot-developer-action-main/dist/index.js (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: pickle Pickle/file-system backend. area: tests Test suite, fixtures, coverage, or local test scripts. bug impact: compatibility Compatibility with Python, platforms, dependencies, or external APIs. outcome: merged Merged into the repository. size: small Small PR by changed-file count. source: copilot Opened by GitHub Copilot coding agent. type: bugfix Fixes a user-visible bug or regression. type: tests Tests, fixtures, test coverage, or test data.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

test_bad_cache_file[True] fails on Python 3.10 Windows

4 participants