Skip to content

Conversation

@sevbch
Copy link
Collaborator

@sevbch sevbch commented Apr 22, 2025

Context

When scanning a docker image, one could encounter this error when scanning layers files:

tarfile.ReadError: invalid header

What has been done

This happened because when dealing with layers files, code was implicitly assuming they were .tar files by using tarfile.TarFile.

Instead, we use tarfile.open which automatically detects the compression format (based on file extension, or in our case headers) and returns the relevant TarFile instance (matching the appropriate compression/decompression algorithm).

Validation

Added a non-regression test. It contains a layer with a .tar.gz file. Parsing failed before, doesn't fail anymore.

Note

In docker image manifest.json file, sometimes there is a mediaType field indicating the content type of each component. However this field isn't mandatory (it's part of the Docker Registry API V2 specs) so we can't rely on it to choose the relevant decompression method.

Also it's simpler to let tarfile module handle it for us.

PR check list

  • As much as possible, the changes include tests (unit and/or functional)
  • If the changes affect the end user (new feature, behavior change, bug fix) then the PR has a changelog entry (see doc/dev/getting-started.md). If the changes do not affect the end user, then the skip-changelog label has been added to the PR.

@sevbch sevbch marked this pull request as ready for review April 22, 2025 17:00
@sevbch sevbch requested a review from a team as a code owner April 22, 2025 17:00
@sevbch sevbch self-assigned this Apr 22, 2025
@codecov
Copy link

codecov bot commented Apr 22, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.88%. Comparing base (d68cb5c) to head (440d59f).
Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1084   +/-   ##
=======================================
  Coverage   91.88%   91.88%           
=======================================
  Files         144      144           
  Lines        6066     6066           
=======================================
  Hits         5574     5574           
  Misses        492      492           
Flag Coverage Δ
unittests 91.88% <100.00%> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Collaborator

@agateau-gg agateau-gg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpicking on the changelog, but looks good, approved.

@sevbch sevbch force-pushed the severine/handle-tar-gz-in-docker-images branch from 10d62fc to 9b99f9c Compare April 24, 2025 11:41
@sevbch sevbch force-pushed the severine/handle-tar-gz-in-docker-images branch from 9b99f9c to 440d59f Compare April 24, 2025 11:42
@sevbch sevbch enabled auto-merge April 24, 2025 11:42
@sevbch sevbch merged commit c0b26d0 into main Apr 24, 2025
29 checks passed
@sevbch sevbch deleted the severine/handle-tar-gz-in-docker-images branch April 24, 2025 11:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants