feat(df): add df builtin for disk space usage reporting#205
@codex review |
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6b2ec9136f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
@codex review |
Reviewed commit: b5392e09b8
@codex review |
Reviewed commit: 1a7d813ef6
@codex review |
Reviewed commit: fc77c89e8a
@codex review |
Reviewed commit: 9d94c7f3d9
@codex review |
Reviewed commit: 05b3504ff3
@codex review |
Reviewed commit: cc63258ac6
@codex review |
Reviewed commit: 755c2fe837
@codex review |
Reviewed commit: 5b3d483519
@codex review |
Reviewed commit: 5b3d483519
@codex review |
Reviewed commit: b5438ea4db
@codex review |
Reviewed commit: 3e8940e9e9
@codex review |
Reviewed commit: cbc8201121
@codex review |
Reviewed commit: 890e69547a
Adds df as a sandboxed builtin (Linux + macOS; Windows returns "not
supported"). Mount enumeration goes through a new internal package,
builtins/internal/diskstats, which reads /proc/self/mountinfo on Linux
(documented sandbox-bypass mirroring ss/ip route; the path is hardcoded
and never derived from user input) and calls getfsstat(2) on macOS.
Supported flags: -h, -H, -k, -P, -T, -i, -a, -l, -t TYPE (repeatable),
-x TYPE (repeatable), --total, --no-sync, --help. The dangerous --sync
flag (which would invoke sync(2) and mutate kernel buffer state) is
unregistered and rejected by pflag as unknown. -B, --output, and
positional FILE operands are deferred to a future version.
Memory bounds: mount table capped at 100k entries, mountinfo line length
capped at 1 MiB, scan total capped at 1M lines. Integer arithmetic uses
paired right-shifts in percentUsed and saturating addition in totals to
avoid overflow on extreme magnitudes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI fixes:
- Fuzz: dfRunFuzz no longer routes through testutil.RunScriptCtx, which
fatals on shell-parse errors. The fuzzer routinely mutates inputs into
malformed shell syntax (unclosed quotes, etc.); we now treat parse
failures as expected and skip the iteration.
- Windows: pentest tests that exercised df's actual code path
(TestDfPentestVeryLongTypeName, TestDfPentestManyTypeFilters,
TestDfPentestTypeFilterEdgeValues, TestDfPentestNonUTF8FlagValue,
TestDfPentestUnicodeNFD, TestDfPentestQuotedValues) now skip via
requireSupported. df returns "not supported" on Windows, which made the
asserted code==0 invariant unreachable.
Codex feedback:
- P1: Remove "overlay" from the Linux pseudo-FS table. Container hosts
use overlay as the default root filesystem; classifying it as pseudo hid
the real root from the default listing.
- P2: An explicit -t TYPE filter now overrides the default pseudo-FS
suppression. `df -t tmpfs` lists tmpfs mounts without requiring -a,
matching GNU df.
- P2: humanBytes rounds up via math.Ceil instead of fmt.Sprintf's
round-to-nearest, matching GNU df's "never under-report" rule. Example:
1,576,960 bytes is now "1.6M" (was "1.5M").
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Windows test failures: TestDfPentestNonUTF8FlagValue, TestDfPentestUnicodeNFD,
and TestDfPentestQuotedValues were missing requireSupported(t) so they ran on
Windows where df returns "not supported" (code 1), making the asserted code==0
unreachable. Add the skip alongside the seven other Windows-skipped tests.
Fuzz contract: the previous version asserted exit codes in {0, 1, 127} only,
which broke on legitimate runner behaviour like "df 0&" → code 2 (background
job). The fuzz target's real contract is "no panic and no hang inside df";
both are enforced by Go's testing framework and the helper's 5-second timeout
respectively. Drop the exit-code assertion entirely. Also stop fataling on
non-ExitStatus runner errors (glob expansion failures, "internal error" on
adversarial inputs) — those are runner behaviour, not df defects. Verified by
running the fuzzer locally for 20s (670k execs) without failure.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
13 scenario tests under tests/scenarios/cmd/df/ exercised df's actual
code path and asserted exit code 0 with output rows. On Windows, df
returns "not supported" with exit 1. The scenario framework has no
platform-skip mechanism (per AGENTS.md note about ip route), so
happy-path scenarios cannot be added for platform-restricted commands.
The deleted scenarios were redundant with builtins/df/df_unix_test.go,
which uses //go:build unix to exercise the same flag wiring with
structural assertions on the live mount table, a richer check than the
stdout_contains substring match the YAML scenarios provided.
Retained 6 scenarios that are platform-agnostic (they short-circuit
before diskstats.List): --help, extra operand, unknown flag, --sync
rejection, -B/--block-size rejection, --output rejection.
Deleted: basic/default_succeeds.yaml, flags/all.yaml, exclude_type.yaml,
human_readable.yaml, inodes.yaml, k_is_default.yaml, local.yaml,
no_sync.yaml, posix_format.yaml, print_type.yaml, si.yaml, total.yaml,
type_filter.yaml.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four P2 items, all verified against gdf 9.10 locally:
1. tmpfs / devtmpfs are no longer in pseudoTypes. They report real
storage (/dev/shm, /run) and GNU lists nonzero tmpfs mounts in the
default output. Default df under-reported them previously.
2. df -t TYPE / -x TYPE now exit 1 with "no file systems processed" on
stderr when filtering leaves zero rows. GNU df returns 1 in this
case and scripts test the exit status to detect missing filesystem
types — silent exit-0 with empty body was a regression.
3. df -ih / df -iH now scales inode counts via humanBytes (e.g. 4.0G,
381K). Previously inode mode bypassed human formatting and always
printed raw integers, which broke a common usage even though both
-i and -h are documented as supported.
4. df -P with -h or -H now uses the "Size" column header instead of
"1024-blocks". GNU's -P -h emits "Size" — keeping the fixed-block
label under human-suffixed values would mislead parsers about units.
Updated tests:
- TestDfTypeFilter_NoMatches: now asserts exit 1 + stderr message.
- TestFormatCount: covers the -ih and -iH cases.
- TestBuildHeader: pins -P -h and -P -H to "Size".
- TestDfPentest{VeryLongTypeName, ManyTypeFilters, TypeFilterEdgeValues,
NonUTF8FlagValue, UnicodeNFD, QuotedValues}: relaxed to accept exit 0
or 1 (no-match values are common in pentest inputs).
- TestDfPentestTypeIncludeAndExcludeSameType: now asserts the new
exit-1 + "no file systems processed" contract.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two follow-on test fixes after the round-2 GNU-compat changes (commit
7b62dfc):
1. TestParseMountInfo_HappyPath: asserted devtmpfs.Pseudo == true, but
devtmpfs was just removed from pseudoTypes. Flip the assertion to match
the new (correct) classification, and add an analogous check for the
/run tmpfs entry to lock in tmpfs.Pseudo == false.
2. TestDfPentestAllFlagsAtOnce: asserted exit 0, but on Linux the
-t apfs filter matches no rows so df now correctly emits "no file
systems processed" and exits 1 (the new GNU-compat path). Relax to
accept either 0 or 1; the pentest's contract is "stacking every flag
does not crash", not "succeeds with this specific argv on every host".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ception
Three Codex round-3 items:
1. P1: pre-stat filter (real bug). statfs(2) on a stale NFS / CIFS
mount can hang indefinitely and is not interrupted by ctx cancellation.
Previously diskstats.List statfs'd every mount in /proc/self/mountinfo
before df.go could apply -l / -x nfs filters, so `df -l` could hang on
a dead remote even though the user had explicitly asked to skip remote
mounts. Fix: diskstats.List now takes a FilterFunc parameter. df.go's
makePreStatFilter encodes -t/-x/-a/-l in a closure that runs between
mountinfo parsing and the statfs syscall on Linux. Darwin already uses
MNT_NOWAIT so the filter is cosmetic there.
2. P1: AllowedPaths and statfs (documentation). statfs returns metadata
only (block/inode counts, fs type, block size); no file content is
read. The mount-point paths are kernel-controlled, never user-derived.
This is the same exception class as ss reading all sockets and ip route
reading the full routing table; gating it on AllowedPaths would produce
a misleading partial listing. Documented under "Security Design
Decisions" in AGENTS.md so the trade-off is explicit.
3. P2: dedup bind-mounts. GNU df hides mounts that share a Source with
an already-emitted mount unless -a is given. On container hosts with
overlay bind-mounts of /etc/hosts, /etc/hostname, /etc/resolv.conf,
default df was printing three duplicate rows and --total was
double-counting them. Fix: filterMounts now keeps a seen-Source set;
only the first mount for each Source is emitted unless -a is set. Empty
Source values (rare; some pseudo filesystems) are not collapsed onto
each other.
Tests:
- New TestPreStatFilter_* coverage replaces TestFilterMounts_* for the
type/pseudo/local logic that moved into makePreStatFilter.
- New TestFilterMounts_DedupBySourceWithoutAll, _AllPreservesDuplicates,
_EmptySourceNotDeduped lock in the dedup behaviour.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
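A minimal sketch of the pre-stat filter idea from item 1, for readers who want to see the ordering concretely. The `Mount` fields, `listMounts`, and the injected `stat` callback are hypothetical stand-ins and not the real diskstats API; the only point illustrated is that the filter runs before the potentially blocking statfs call.

```go
package main

import "fmt"

// Mount is a hypothetical, trimmed-down mount-table entry.
type Mount struct {
	MountPoint string
	FSType     string
	Remote     bool
}

// FilterFunc reports whether a mount should be kept and statfs'd.
type FilterFunc func(Mount) bool

// listMounts runs keep on every parsed entry BEFORE calling stat, so a
// rejected mount (for example a stale NFS share under -l) is never
// statfs'd. stat is injected here so the sketch runs without syscalls.
func listMounts(parsed []Mount, keep FilterFunc, stat func(Mount) error) []Mount {
	var out []Mount
	for _, m := range parsed {
		if keep != nil && !keep(m) {
			continue // filtered out before any syscall could block
		}
		if err := stat(m); err != nil {
			continue // assumption: mounts that fail to stat are skipped
		}
		out = append(out, m)
	}
	return out
}

func main() {
	parsed := []Mount{
		{MountPoint: "/", FSType: "ext4"},
		{MountPoint: "/mnt/share", FSType: "nfs", Remote: true},
	}
	localOnly := func(m Mount) bool { return !m.Remote }
	stat := func(m Mount) error {
		fmt.Println("statfs", m.MountPoint)
		return nil
	}
	for _, m := range listMounts(parsed, localOnly, stat) {
		fmt.Println("kept", m.MountPoint)
	}
}
```

With the `localOnly` filter, the stat callback is invoked only for `/`; the remote entry is dropped without being touched.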
… promotion
Three Codex round-4 items, all real:
1. P1 — FUSE remote subtypes (hang protection for sshfs etc.). Linux
mountinfo reports FUSE mounts as fuse.<subtype>, e.g. fuse.sshfs
or fuse.smbnetfs. The bare prefix list ("sshfs", "smb", …) did not
match these strings, so a stale fuse.sshfs mount was still statfs'd
and could hang despite -l / -x. Added explicit fuse.<remote-backend>
entries: fuse.sshfs, fuse.smb, fuse.cifs, fuse.davfs, fuse.glusterfs,
fuse.cephfs, fuse.nfs, fuse.s3, fuse.rclone. TestIsRemoteType extended
with positive cases for each subtype and negative cases for the
local FUSE backends (gvfsd-fuse, portal, archivemount).
2. P1 — Document df bypass in README.md. The "Security Model" note
already covered ss and ip route but not df; operators reading the
README would miss that AllowedPaths cannot hide mount metadata
from df. Updated the bullet to list df alongside ss/ip route and
to spell out that Statfs(2) returns metadata only.
3. P2 — humanBytes promotes after rounding. humanBytes(1048575, 1024)
was returning "1024K" instead of "1.0M" because the suffix was
chosen before the ceiling. Refactored: one rounding pass with
granularity decided by pre-rounded magnitude (one decimal if
scaled < 10, integer otherwise), then a single promotion step when
the rounded value reaches base. df -h now matches gdf -h
byte-for-byte on my host (927G/512G/415G — previously 926G off-by-one).
Restored the 1<<20-1 → 1.0M test case I had dropped, plus 1<<30-1 →
1.0G, 1<<40-1 → 1.0T, 10485759 → 10M.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
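The rounding-then-promotion order described in item 3 can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the PR's humanBytes: it uses float64 for brevity (so it is only exact for values below 2^53), takes the base as a uint64, and uses a single "KMGTPE" suffix table.

```go
package main

import (
	"fmt"
	"math"
)

// humanBytes is a sketch of ceiling rounding with late promotion.
// Assumptions: float64 arithmetic (exact only below 2^53) and a shared
// "KMGTPE" suffix table; the real helper may differ on both points.
func humanBytes(n, base uint64) string {
	const suffixes = "KMGTPE"
	if n < base {
		return fmt.Sprintf("%d", n)
	}
	b := float64(base)
	scaled := float64(n)
	idx := -1
	for scaled >= b && idx < len(suffixes)-1 {
		scaled /= b
		idx++
	}
	// One rounding pass; granularity is chosen from the pre-rounded value.
	var rounded float64
	if scaled < 10 {
		rounded = math.Ceil(scaled*10) / 10 // one decimal, rounded up
	} else {
		rounded = math.Ceil(scaled) // integer, rounded up
	}
	// Single promotion step: 1024K becomes 1.0M instead of staying 1024K.
	if rounded >= b && idx < len(suffixes)-1 {
		rounded /= b
		idx++
	}
	if rounded < 10 {
		return fmt.Sprintf("%.1f%c", rounded, suffixes[idx])
	}
	return fmt.Sprintf("%.0f%c", rounded, suffixes[idx])
}

func main() {
	for _, n := range []uint64{1023, 1576960, 1<<20 - 1, 10485759, 1<<30 - 1} {
		fmt.Println(n, humanBytes(n, 1024))
	}
}
```

Choosing the suffix only after rounding is what avoids the "1024K" output: the value is rounded up first, and the unit is bumped once if the rounded number has reached the base.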
…split -t
Three Codex round-5 items, all real:
1. P2: pseudo + -l. Mount.Local was defined as "not pseudo and not
remote", so pseudo mounts (proc, sysfs, cgroup2) had Local=false and
were silently dropped by `df -al`. GNU df treats pseudo and remote as
independent classifications: -l drops only remote filesystems, and
pseudo passes when -a is set. Redefined Local := !isRemoteType(...) on
both Linux and Darwin so pseudo mounts are local (they live in kernel
memory, not on a remote server). New
TestPreStatFilter_AllPlusLocalKeepsPseudo locks this in. Verified
locally: `df -al` now lists more entries than `df`, pseudo mounts
re-enabled by -a survive -l.
2. P2: dedup by device + shortest mount point. The previous
Source-string dedup was brittle: two unrelated overlay mounts sharing a
literal source name were collapsed (the kataShared bug Codex flagged),
and the chosen representative depended on input order rather than
mount-point length. Added a DevID field to Mount (parsed from
/proc/self/mountinfo field index 2 on Linux, formatted from
Statfs_t.Fsid on Darwin) and rewrote filterMounts as a two-pass dedup:
first pass picks the index of the shortest-mountpoint entry per DevID,
second pass emits in original order. Distinct DevIDs are preserved even
when Source matches. New tests:
TestFilterMounts_DedupByDevicePicksShortestMountpoint (kataShared
scenario verbatim) and TestFilterMounts_DistinctDeviceSameSourceNotDeduped.
3. P3: drop comma-split in -t / -x. Verified empirically:
`gdf -t apfs,ext4` exits 1 with "no file systems processed"; GNU treats
the entire string as a single literal type. The previous code split on
commas and matched either side, which broke scripts that used the
no-match exit code as a presence test. Updated stringSet to store the
verbatim argv values; removed strings.SplitSeq from the symbol
allowlist.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
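The two-pass dedup in item 2 can be sketched like this. The `Mount` struct and `dedupByDevice` name are hypothetical, and treating an empty DevID as "never collapse" is an assumption carried over from the earlier empty-Source rule rather than something the commit states.

```go
package main

import "fmt"

// Mount is a hypothetical subset of the fields the dedup needs.
type Mount struct {
	DevID      string // e.g. "8:1" as read from mountinfo on Linux
	Source     string
	MountPoint string
}

// dedupByDevice keeps one entry per DevID: the one with the shortest
// mount point. Output preserves the original table order.
func dedupByDevice(mounts []Mount) []Mount {
	// Pass 1: remember the index of the shortest mount point per device.
	best := make(map[string]int, len(mounts))
	for i, m := range mounts {
		if m.DevID == "" {
			continue // assumption: entries without a device ID are never collapsed
		}
		j, seen := best[m.DevID]
		if !seen || len(m.MountPoint) < len(mounts[j].MountPoint) {
			best[m.DevID] = i
		}
	}
	// Pass 2: emit winners in their original positions.
	out := make([]Mount, 0, len(mounts))
	for i, m := range mounts {
		if m.DevID == "" || best[m.DevID] == i {
			out = append(out, m)
		}
	}
	return out
}

func main() {
	mounts := []Mount{
		{DevID: "8:1", Source: "/dev/sda1", MountPoint: "/etc/hosts"},
		{DevID: "8:1", Source: "/dev/sda1", MountPoint: "/"},
		{DevID: "8:1", Source: "/dev/sda1", MountPoint: "/etc/resolv.conf"},
		{DevID: "0:50", Source: "tmpfs", MountPoint: "/run"},
	}
	for _, m := range dedupByDevice(mounts) {
		fmt.Println(m.DevID, m.MountPoint)
	}
}
```

Keying on the device ID rather than the Source string is what keeps two unrelated mounts with the same literal source name apart, and selecting by mount-point length makes the surviving row independent of input order.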
Two Codex round-6 items, both real:
1. P1: saturating block multiply. unix.Statfs_t.Blocks * bsize could
wrap a uint64 if a buggy/malicious FUSE filesystem reports counts above
MaxUint64/bsize, producing tiny/zero Total/Used/Free for that mount and
corrupting the --total accumulation. Added a mulSat(a, b) helper to
diskstats_unix.go that clamps to MaxUint64 on overflow. Linux and
Darwin backends now use mulSat for Total, Free, and Used. Locked in by
TestMulSat covering boundary, just-over-boundary, and the realistic
FUSE-rogue (maxU * 4096) case.
2. P2: last-flag-wins for -h / -H. The previous code unconditionally
preferred -h, so `df -hH` (intended SI override) and shell aliases that
append -H to a default got the wrong size column. pflag.Visit walks set
flags in lexicographical order, not argv order, so it cannot be used to
honor input order. Refactored to a custom unitFlag (a pflag.Value)
where -h and -H share a single *unitMode target and each Set call
overwrites it, so the LAST one wins by construction. registerUnitFlag
wraps fs.VarPF + NoOptDefVal="true" so the flags accept no argument,
matching pflag's bool convention. Confirmed locally: `df -hH` prints SI
(995G), `df -Hh` prints IEC (927G). New TestUnitFlag_LastFlagWins
covers 10 argv interleavings including combined short flags (-hH, -Hh).
Side effect of #2: dropped the unused `human`/`si` *bool fields from
the flags struct and the resolveUnitMode helper; mode is now read
directly from a *unitMode field.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
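A saturating multiply of the kind item 1 describes can be sketched with the standard library's 128-bit product. This is one possible way to write it; the commit does not say how mulSat detects overflow, so the use of `bits.Mul64` here is an assumption.

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

// mulSat returns a*b, clamped to math.MaxUint64 when the product would
// not fit in 64 bits. Overflow detection via bits.Mul64 is this
// sketch's choice, not necessarily what the PR does.
func mulSat(a, b uint64) uint64 {
	hi, lo := bits.Mul64(a, b)
	if hi != 0 {
		return math.MaxUint64
	}
	return lo
}

func main() {
	const bsize = 4096
	boundary := uint64(math.MaxUint64) / bsize
	fmt.Println(mulSat(boundary, bsize) == boundary*bsize)          // fits: exact product
	fmt.Println(mulSat(boundary+1, bsize) == math.MaxUint64)        // just over: clamped
	fmt.Println(mulSat(math.MaxUint64, bsize) == math.MaxUint64)    // rogue block count: clamped
}
```

Clamping keeps a hostile block count from wrapping into a small number, so the per-mount row and any running total stay at the ceiling instead of silently shrinking.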
Two Codex round-7 items, both verified empirically against gdf 9.10:
1. P2: -k participates in the unit-mode last-flag-wins group. Before,
-k was a no-op bool flag that ran alongside -h / -H without ever
updating the shared mode, so `df -h -k` left mode = unitsHuman1024 even
though gdf -h -k prints "1K-blocks". Promoted -k into the same
registerUnitFlag group with unitsK value: all three flags now share the
*unitMode target and the LAST argv entry wins. Locked in:
TestUnitFlag_LastFlagWins gains six new cases (-h -k, -H -k, -k -h,
-k -H, -hk, -kh). Smoke verified: df -h -k now prints "1K-blocks",
df -k -h prints "Size".
2. P2: reject overlapping -t / -x. gdf -t apfs -x apfs exits 1 with
"file system type 'apfs' both selected and excluded". The previous code
silently let exclusion win. Added an overlappingType helper and an
early-out check in the handler that emits the GNU-format error before
any mount listing runs. Updated
TestDfPentestTypeIncludeAndExcludeSameType to assert the new error;
kept TestPreStatFilter_TypeExcludeWinsOverIncludeOnPseudo to lock in
the filter's lower-level exclude-precedence behaviour in isolation.
Added TestOverlappingType for the helper itself.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
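The overlap check in item 2 amounts to a set intersection that stops at the first hit. The sketch below assumes the include and exclude lists arrive as plain string slices; the real helper's signature and set type are not shown in the commit, so treat the shape as hypothetical.

```go
package main

import "fmt"

// overlappingType returns the first type named in both lists, if any.
// The slice-based signature is an assumption made for this sketch.
func overlappingType(include, exclude []string) (string, bool) {
	excluded := make(map[string]struct{}, len(exclude))
	for _, t := range exclude {
		excluded[t] = struct{}{}
	}
	for _, t := range include {
		if _, ok := excluded[t]; ok {
			return t, true
		}
	}
	return "", false
}

func main() {
	if t, ok := overlappingType([]string{"apfs"}, []string{"apfs"}); ok {
		// Message text follows the GNU wording quoted in the commit.
		fmt.Printf("df: file system type '%s' both selected and excluded\n", t)
	}
}
```

Running the check before any mount enumeration means the error path never touches the mount table at all.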
Two Codex round-8 items, both verified empirically against gdf 9.10:
1. P2: GNU compresses "Available" to "Avail" in human modes (`gdf -h /`
→ "Filesystem Size Used Avail Use% Mounted on"). My buildHeader
unconditionally emitted "Available", so any bash-comparison scenario
for `df -h` / `df -H` would diverge. buildHeader now switches to
"Avail" when the unit mode is unitsHuman1024 or unitsHuman1000;
fixed-block modes (default, -k, -P) keep the full "Available" string.
2. P2: `gdf -iP` keeps "IUse%" as the percentage header, not
"Capacity". Only the *block* POSIX format substitutes "Capacity" for
"Use%". My code unconditionally replaced "IUse%" with "Capacity" when
posix was set in inode mode, which diverged from GNU. Removed the
conditional; the inode header is now always "IUse%" regardless of -P.
Tests: TestBuildHeader expanded to pin both behaviours (Avail in -h /
-H, IUse% preserved with -iP, IUse% not replaced by Capacity). Also
strengthened TestGNUCompatHeaderHuman to explicitly check that "Avail"
is present and "Available" is NOT.
Smoke verified locally:
df -h → Filesystem ... Size Used Avail Use% Mounted on
df -iP → Filesystem ... Inodes IUsed IFree IUse% Mounted on
df -P → Filesystem ... 1024-blocks Used Available Capacity Mounted on
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two Codex review-4207024894 items I missed earlier; verified empirically
against gdf 9.10:
1. P2 — Limit POSIX layout to fixed-block output. GNU df only uses the
strict POSIX single-space row layout when -P is the *sole*
format-affecting flag. Combinations like -PT, -Pi, -hP, -HP all
revert to the default aligned column layout, and human modes (-h /
-H) keep "Use%" instead of "Capacity" even with -P.
Two narrow fixes:
* buildHeader: capacity = "Capacity" only when posix && !human.
* writeOutput: pass posixLayout = posix && !withType && !inodeMode
&& !human to printRows. Single-space rows now apply only to
`df -P` (with optional -k/-a/-l/-t/-x).
Smoke verified: df -hP / df -HP → Use% + aligned; df -PT / df -Pi
→ aligned; df -P alone → single-space + Capacity.
2. P3 — Reject the nonstandard --kibibytes flag. GNU only documents
the short -k; there is no --kibibytes long form. Previously rshell
accepted both, which let scripts depend on rshell-only behavior.
Re-registered -k via fs.VarPF with an empty long name so only the
short form is recognised. df --kibibytes now exits 1 with
"unknown flag: --kibibytes", matching gdf.
Added "df --kibibytes" to the rejected-flag pentest list.
Tests: TestBuildHeader strengthened to assert -hP/-HP keep Use% and
do NOT contain Capacity, and that -PT does keep Capacity. Pentest's
TestDfPentestRejectedFlags now includes --kibibytes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The lock file is local Claude session state and shouldn't be tracked.
Removing the accidental check-in from the previous commit and updating
.gitignore to prevent recurrence.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two Codex items, both verified empirically:
1. P2: `df -t TYPE` no longer overrides pseudo suppression. gdf 9.4
confirms: `df -t devfs` (devfs is pseudo on macOS) exits 1 with "no
file systems processed". GNU treats -t and the pseudo filter as
independent; only -a exposes pseudo filesystems. My round-2 fix (which
made includeSet bypass the pseudo check) was over-corrected; -t tmpfs
continues to work because tmpfs isn't in our pseudoTypes table
(RAM-backed but real storage), not because -t skips the pseudo filter.
Removed the `else if !all && m.Pseudo` branch from makePreStatFilter so
the pseudo check fires regardless of includeSet. Replaced
TestPreStatFilter_TypeIncludeOverridesPseudoSuppression with
TestPreStatFilter_TypeIncludeRespectsPseudoSuppression covering both
the no-`-a` (drops) and `-a -t proc` (lists) cases.
2. P2: Fuzz runner missed AllowAllCommands. interp.New defaults to "no
commands allowed" (per interp/api.go), so every fuzz input starting
with "df" was rejected as "command not allowed" before df ever ran. The
new CI fuzz job for builtins/df was effectively a no-op. Added
`interpoption.AllowAllCommands().(interp.RunnerOption)` to dfRunFuzz's
interp.New options (matches testutil.RunScriptCtx). Verified:
`go test -fuzz=FuzzDfFlagCombinator -fuzztime=5s` now executes ~430k
iterations vs ~0 effective coverage before. Documented the requirement
inline so future readers don't strip it back out.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex P2: pflag.PrintDefaults rendered the shorthand-only -k flag as
"-k, -- use 1024-byte blocks (POSIX default)", treating the empty long
name as a literal "--" string. That advertised "--" as if it were a
usable long option, which is wrong.
Fix: mark the -k Flag.Hidden = true so PrintDefaults skips it, and
append a manual " -k use 1024-byte blocks (POSIX default)" line in
printHelp. Flag parsing is unchanged: -k still overrides earlier -h/-H
per the unitFlag last-wins logic, and --kibibytes is still rejected as
unknown.
TestDfHelp now asserts the new format: stdout must contain `-k `, must
NOT contain `-k, --`, and must NOT contain `--kibibytes`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex P2: gdf -P uses GNU's aligned column layout, not the strict
POSIX-spec single-space format. The GNU manual documents -P as
"one-line filesystem rows + POSIX header labels" — spacing stays the
same as the default format, only the column *labels* change
("1024-blocks", "Capacity"). Verified empirically: gdf -P / | od -c
shows ~5 spaces between Filesystem and 1024-blocks.
Removed the single-space branch in printRows entirely. -P now goes
through the same aligned-column path as every other format. The
posixLayout flag and its surrounding logic in writeOutput were dropped;
printRows lost a parameter.
Tests:
- TestDfPosix and TestGNUCompatHeaderPosix now assert header words
appear in order rather than byte-equality, since column widths now
adapt to the longest filesystem name.
- TestGNUCompatPosixSingleSpace renamed to TestGNUCompatPosixNoTabs
and rewritten — the no-tabs invariant is the only spacing claim
that's actually true of GNU's -P output.
- strings.Join removed from df's symbol allowlist (no longer used
after the single-space path went away).
Smoke verified: df -P now matches gdf -P byte-for-byte:
Filesystem 1024-blocks Used Available Capacity Mounted on
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex P2: when --total accumulates very large filesystems, totalU and
totalA can each saturate to MaxUint64. The previous percentUsed used
saturatingAdd to compute denom, which clamped to MaxUint64 and lost
the relative magnitudes — equal totals like (MaxU, MaxU) reported
"100%" instead of the true "50%".
Two-step scaling:
1. If used + available would wrap (used > ^uint64(0) - available),
halve both before summing. The percentage is invariant under
scaling both sides equally, so at most 1 bit of precision is
lost — far below the 1% rounding tolerance.
2. The existing inner loop still shifts used and denom together to
keep used*100 from overflowing.
TestPercentUsed gains two cases pinning the new behaviour:
{MaxU, MaxU} → "50%"
{MaxU, MaxU/2} → "67%"
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
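The two-step scaling can be sketched as below. This is a reconstruction from the description, not the PR's percentUsed: the round-up of the final percentage is inferred from the {MaxU, MaxU/2} → "67%" case, and returning "-" for a zero denominator is an assumption.

```go
package main

import (
	"fmt"
	"math"
)

// percentUsed computes ceil(100*used/(used+avail)) without overflowing.
// Assumptions: round-up percentages and "-" for an empty filesystem.
func percentUsed(used, avail uint64) string {
	// Step 1: if used+avail would wrap, halve both. The ratio is
	// unchanged apart from at most one bit of precision.
	if used > math.MaxUint64-avail {
		used >>= 1
		avail >>= 1
	}
	denom := used + avail
	if denom == 0 {
		return "-"
	}
	// Step 2: shift used and denom together until used*100 fits.
	for used > math.MaxUint64/100 {
		used >>= 1
		denom >>= 1
	}
	pct := used * 100 / denom
	if used*100%denom != 0 {
		pct++ // round up so usage is never under-reported
	}
	return fmt.Sprintf("%d%%", pct)
}

func main() {
	maxU := uint64(math.MaxUint64)
	fmt.Println(percentUsed(maxU, maxU))
	fmt.Println(percentUsed(maxU, maxU/2))
	fmt.Println(percentUsed(50, 50))
}
```

Because numerator and denominator are always scaled by the same factor, the quotient stays within rounding tolerance even when both totals have saturated.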
Codex P2: GNU df treats -h / -H / -k as no-arg flags. `gdf
--human-readable=false` errors with "option '--human-readable'
doesn't allow an argument". Previously rshell silently accepted any
value because unitFlag.Set ignored its argument.
unitFlag.Set now returns an error unless the argument equals the
NoOptDefVal sentinel ("true"). After the fix:
df --human-readable=false → exit 1 ("does not allow an argument")
df -h=false → exit 1
df --human-readable / -h → unchanged (bare flag, success)
df -hH (combined short) → unchanged (success, SI)
Documented limitation: pflag's Set receives the literal string "true"
for both the bare-flag sentinel and explicit `=true`, so the two
cases cannot be distinguished from inside Set. As a result,
`df --human-readable=true` is still silently accepted (GNU rejects
it). Working around this would require argv rescanning. The pentest
suite now covers every `=false` case; a comment notes the `=true`
limitation so future readers don't try to add coverage for it.
Added errors.New to the df symbol allowlist (used by the new
sentinel error).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex P2: GNU's lib/human.c distinguishes SI 'k' (1000) from kibi 'K'
(1024). With -H / --si, GNU df emits "25k" / "1.5k" for sub-mega
values, while my code shared the suffix table and printed "25K" /
"1.5K" in both modes. Only the kilo position differs across modes;
M/G/T/P/E stay uppercase in both.
Fix: humanBytes selects its suffix table by base. base=1000 uses
"kMGTPE"; base=1024 keeps "KMGTPE".
Tests:
- TestHumanBytes_1000 updated: 1000 → "1.0k", 1500 → "1.5k", new
25_000 → "25k" (Codex's specific scenario), plus 1M/1G/1T cases to
confirm those stay uppercase.
- TestFormatCount's SI assertion flipped to "1.0k".
- IEC tests unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex P2: GNU df rejects every `--name=value` form for boolean flags
(e.g. `gdf --portability=false` errors with "option '--portability'
doesn't allow an argument"). pflag.BoolP silently accepts the value,
which diverged from GNU.
Generalised the explicit-value rejection from unit flags to plain
booleans. New noArgBool (a pflag.Value that wraps *bool) returns an
error from Set unless its argument equals the NoOptDefVal sentinel
"true". registerNoArgBool installs it and returns *bool so callers keep
the same access pattern as fs.Bool.
Replaced every fs.Bool / fs.BoolP in makeFlags:
--portability / -P
--print-type / -T
--inodes / -i
--all / -a
--local / -l
--total
--no-sync
--help
After the fix:
df --portability=false / -P=false → exit 1
df --all=false → exit 1
df --total=false → exit 1
... etc
df -P / -a / -aTl --total → unchanged (success)
Pentest list extended with =false for every boolean flag (long and
short). Same documented `=true` limitation as the unit flags applies
(pflag's Set can't distinguish bare-flag from explicit `=true` because
both pass the literal string "true").
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… flags
Previously, flags marked NoArg via NoOptDefVal="true" silently accepted df --all=true, df --portability=true, df --human-readable=true and similar forms because pflag could not distinguish the sentinel default from a user-supplied =true.
Switch the sentinel to a NUL byte (\x00). argv elements are NUL-terminated C strings, so no argument passed through execve(2) can contain an embedded NUL, which makes the sentinel unforgeable from a real shell invocation. The Set methods on unitFlag and noArgBool now compare against this sentinel and return "flag does not allow an argument" for any other value, including =true and =false.
Pentest list extended with =true and =false variants for every no-argument flag to lock in the rejection.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
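The sentinel check can be sketched without importing pflag. The type below has the same three methods as the pflag.Value interface (Set, String, Type); `noArgSentinel` and `errNoArg` are hypothetical names, and the real registration via NoOptDefVal is omitted.

```go
package main

import (
	"errors"
	"fmt"
)

// noArgSentinel stands in for the NoOptDefVal that the parser passes to Set
// when the flag appears bare. A NUL byte cannot occur inside an argv string.
const noArgSentinel = "\x00"

// errNoArg uses the message quoted in the commit; the variable name is assumed.
var errNoArg = errors.New("flag does not allow an argument")

// noArgBool wraps *bool and rejects any explicit =value.
type noArgBool struct{ p *bool }

func (b noArgBool) Set(s string) error {
	if s != noArgSentinel {
		return errNoArg // covers =true, =false, and anything else
	}
	*b.p = true
	return nil
}

func (b noArgBool) String() string {
	if b.p == nil {
		return "false"
	}
	return fmt.Sprint(*b.p)
}

func (b noArgBool) Type() string { return "bool" }

func main() {
	var all bool
	v := noArgBool{&all}
	fmt.Println(v.Set("true"))             // what --all=true would pass
	fmt.Println(v.Set(noArgSentinel), all) // what a bare --all would pass
}
```

Because the comparison is against a value no user can type, the bare-flag and explicit-value cases no longer collide.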
Per Linux statfs(2), f_blocks / f_bfree / f_bavail are counted in f_frsize units (the fragment size), while f_bsize is only the optimal transfer block size. They are usually equal, but can differ on FUSE-backed mounts; multiplying block counts by f_bsize there would scale the reported Size / Used / Available columns and the --total row incorrectly by that ratio.
Match GNU coreutils df: prefer st.Frsize when non-zero, fall back to st.Bsize otherwise. Darwin's f_bsize is the fundamental block size (no f_frsize on macOS), so the macOS path is unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
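The fallback rule can be sketched on plain integers so it compiles on any platform. `blockUnit` is a hypothetical helper name, and treating negative sizes as absent is an added assumption.

```go
package main

import "fmt"

// blockUnit picks the multiplier for f_blocks / f_bfree / f_bavail:
// the fragment size when the kernel reports one, else the block size.
func blockUnit(frsize, bsize int64) uint64 {
	if frsize > 0 {
		return uint64(frsize)
	}
	if bsize > 0 {
		return uint64(bsize)
	}
	return 0
}

func main() {
	// Hypothetical FUSE mount: 512-byte fragments, 128 KiB preferred I/O size.
	blocks := uint64(1000)
	fmt.Println(blocks * blockUnit(512, 131072)) // 512000, not 131072000
}
```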
GNU df does not sort its output by mountpoint; it walks the mount table as the kernel returns it (mountinfo order on Linux, getfsstat order on macOS). The previous alphabetical sort meant /dev appeared before /proc on Linux and /dev before /System/Volumes/* on macOS, breaking row-order expectations for scripts that diff against /usr/bin/df. Verified with /opt/homebrew/bin/gdf -a: kernel order is preserved.
Drop the sort.Slice call and remove sort.Slice from both the per-command and global allowlists (no other builtin uses it).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The no-argument flags (-h/-H/--all/etc.) use a NUL byte as their NoOptDefVal sentinel so pflag can distinguish bare invocation from explicit --flag=value (the latter is rejected to match GNU df). But pflag.PrintDefaults rendered that sentinel verbatim into the help text as `--all[= ]\x00include pseudo…`, producing binary garbage in the df --help / help df output stream.
Walk the flag list before PrintDefaults and zero out NoOptDefVal on any flag whose default is the sentinel. Parse has already run by then, so this only changes rendered output, not parsing semantics.
Add a regression assertion that the help stream contains no NUL bytes and no "[= ]" residue.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
GNU coreutils' me_remote (lib/mountlist.c) treats a mount as remote
if EITHER the source carries a network signature OR the type is in
the explicit remote-type list. Previously diskstats only checked the
type, so a remote mount surfaced under a generic type ("auto", "gpfs",
"acfs", or NFS-flavored types not in our prefix list with a
"server:/export" source) was classified as Local.
Under `df -l` the mount would then survive the pre-stat filter and
listImpl would still call statfs(2) on it — defeating the documented
hang protection on stale network mounts.
Replace isRemoteType(fsType) with isRemoteSource(source, fsType) that:
- flags any source starting with "//" → SMB / CIFS UNC
- flags any source containing ":" → NFS / sshfs host:/export
- falls back to the existing remote-type prefix list
Add table-driven tests for both classifications, allow strings.Contains
in the diskstats internal allowlist (with documenting comment), and
update the existing TestIsRemoteType to its new TestIsRemoteSource
counterpart.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
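The three checks above can be sketched as follows. The function name and argument order follow the commit message; the contents of `remoteTypePrefixes` are illustrative only and are not the PR's actual table.

```go
package main

import (
	"fmt"
	"strings"
)

// remoteTypePrefixes is an illustrative stand-in for the existing
// remote-type prefix list; the real entries are not shown in the commit.
var remoteTypePrefixes = []string{"nfs", "cifs", "smb"}

// isRemoteSource reports whether a mount looks remote by source OR by type.
func isRemoteSource(source, fsType string) bool {
	if strings.HasPrefix(source, "//") {
		return true // SMB / CIFS UNC path
	}
	if strings.Contains(source, ":") {
		return true // host:/export style (NFS, sshfs)
	}
	for _, p := range remoteTypePrefixes {
		if strings.HasPrefix(fsType, p) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isRemoteSource("server:/export", "auto")) // remote by source
	fmt.Println(isRemoteSource("/dev/sda1", "ext4"))      // local
}
```

Checking the source first means a generic type such as "auto" no longer hides a network mount from the `df -l` pre-stat filter.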
GNU coreutils df keeps a per-column minimum width (src/df.c field_data: SOURCE=14, FSTYPE=4, SIZE/USED/AVAIL=5, USE%=4) so the layout stays recognisable even when every value is short. On hosts where all filesystem names fit in fewer than 14 chars (typical containers with sources like /dev/vda, tmpfs, shm) we were collapsing the source column to `Filesystem 1K-blocks ...` instead of GNU's padded `Filesystem     1K-blocks ...`, breaking byte-for-byte parity.
Seed the widths array with these GNU minimums before scanning rows. The only minimums that exceed their header label are SOURCE (14 vs "Filesystem"=10) and USED (5 vs "Used"=4); the other minimums are covered by the header string itself but kept here for fidelity.
Verified locally that `df` now matches `/opt/homebrew/bin/gdf` byte-for-byte at the column-spacing layer.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
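A sketch of seeding column widths with fixed minimums before scanning the header and rows. `columnWidths` is a hypothetical helper; it measures cells in bytes rather than display columns, and the zero minimum for the last column is an assumption.

```go
package main

import "fmt"

// columnWidths starts from per-column minimums and widens each column
// to fit the header and every row cell.
func columnWidths(mins []int, header []string, rows [][]string) []int {
	widths := make([]int, len(header))
	copy(widths, mins) // seed with the minimums first
	grow := func(cells []string) {
		for i, c := range cells {
			if i < len(widths) && len(c) > widths[i] {
				widths[i] = len(c)
			}
		}
	}
	grow(header)
	for _, r := range rows {
		grow(r)
	}
	return widths
}

func main() {
	// source, size, used, avail, use%, target (the target minimum is assumed).
	mins := []int{14, 5, 5, 5, 4, 0}
	header := []string{"Filesystem", "1K-blocks", "Used", "Available", "Use%", "Mounted on"}
	rows := [][]string{{"tmpfs", "65536", "0", "65536", "0%", "/dev/shm"}}
	fmt.Println(columnWidths(mins, header, rows))
}
```

With only short sources such as tmpfs present, the first column still comes out 14 wide instead of shrinking to the 10-character header.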
GNU coreutils does not classify squashfs as a dummy/pseudo filesystem (see lib/mountlist.c ME_DUMMY_0). On a typical Ubuntu desktop the snap, AppImage, and live-image loop mounts use squashfs and report real on-disk usage; hiding them by default makes both `df` and `df -t squashfs` silently empty, breaking scripts and monitoring that expect those mounts to surface.
Drop "squashfs" from pseudoTypes and document why in the comment.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
db791ad to
18140fa
Summary
Adds `df` as a sandboxed builtin so AI-agent scripts can inspect mounted filesystem usage without invoking the host `df` binary. v1 supports Linux + macOS; Windows returns "not supported" (mirroring `uname`).
- `builtins/internal/diskstats`, that reads `/proc/self/mountinfo` on Linux and calls `getfsstat(2)` on macOS. The `/proc` read is exempt from `AllowedPaths` for the same reason `ss` and `ip route` are: the path is hardcoded, never derived from user input.
- `--sync` flag (which would invoke `sync(2)` and mutate kernel buffer state) is unregistered and rejected by pflag as unknown. GTFOBins has no `df` entry.
Flag set
Implemented: `-h`, `-H`, `-k`, `-P`, `-T`, `-i`, `-a`, `-l`, `-t TYPE` (repeatable), `-x TYPE` (repeatable), `--total`, `--no-sync`, `--help`.
Deferred to v2: `[FILE]...` operands, `-B` / `--block-size`, `--output[=FIELDS]`.
Rejected (unknown to pflag): `--sync`, `-v`, `--version`.
Safety bounds (per `docs/RULES.md`)
- `ErrMaxMounts` returned when truncated; both Linux and Darwin honour this for parity).
- `/proc/self/mountinfo` line cap of 1 MiB (`errLineTooLong`).
- `percentUsed` uses paired right-shifts to avoid `used*100` overflow at extreme magnitudes.
- `saturatingAdd` is used for grand-total accumulation so a rogue mount cannot wrap the running totals.
- `formatCount` uses floor + remainder bump to avoid wraparound when grand totals saturate to `MaxUint64`.
- `ctx.Err()` is checked at the top of every per-mount loop on both backends.
Test plan
- `go test ./...` (full suite passes locally on darwin/arm64).
- `make fmt` clean.
- `go vet ./...` cross-platform (`GOOS=linux`, `GOOS=windows`) clean.
- `MaxMounts` truncation, line-too-long, context cancellation.
- `%`, `--total` row equals saturated sum of per-mount columns, `-t` filter keeps only matching rows, `-x` filter removes them.
- `-h`, `-i`, `-T`, `--total`).
- `-t` flags, weird type values (empty/whitespace/comma/UTF-8/non-UTF-8/Unicode NFD), shell-metacharacter values, attempted config overrides (`--proc-path`, `--mountinfo`, `--root`, `--prefix`).
- `FuzzParseMountInfo`, `FuzzUnescapeMountField`, `FuzzDfFlagCombinator` with seed corpora drawn from implementation edge cases, CVE-class inputs (NUL bytes, CRLF, invalid UTF-8, ELF/PE/ZIP magic prefixes), and replays of every existing test input. Added to `.github/workflows/fuzz.yml` so they run in CI.
- `help` lists `df` with the right description; `help df` shows full usage.
🤖 Generated with Claude Code
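The overflow guards listed under Safety bounds can be sketched as follows. The function names come from the description, but the signatures, the round-up behaviour, and the clamp to 100 are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// saturatingAdd returns a+b, pinned at MaxUint64 instead of wrapping.
func saturatingAdd(a, b uint64) uint64 {
	if a > math.MaxUint64-b {
		return math.MaxUint64
	}
	return a + b
}

// percentUsed computes used/total as a percentage, rounded up.
// Both operands are shifted right together until used*100 cannot overflow;
// the shared shift keeps the ratio approximately intact.
func percentUsed(used, total uint64) uint64 {
	for used > math.MaxUint64/100 {
		used >>= 1
		total >>= 1
	}
	if total == 0 {
		return 0
	}
	pct := used * 100 / total
	if used*100%total != 0 {
		pct++ // round up, as assumed above
	}
	if pct > 100 {
		pct = 100
	}
	return pct
}

func main() {
	max := ^uint64(0)
	fmt.Println(saturatingAdd(max, 1) == max, percentUsed(1, 3), percentUsed(max, max))
}
```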