feat: add --ignore-link flag to skip URLs in link validation#69
feat: add --ignore-link flag to skip URLs in link validation#69blva wants to merge 4 commits intoagent-ecosystem:mainfrom
Conversation
Adds an IgnorePatterns option to links.Options that skips any URL containing a matching substring (case-insensitive). Useful for private repositories or localhost URLs that are valid locally but unreachable from CI environments. skill-validator check --ignore-link=github.com/myorg,localhost . skill-validator validate links --ignore-link=github.com/myorg . Closes agent-ecosystem#68
- Update all CheckLinks and RunLinkChecks call sites to pass the new Options parameter - Add TestIsIgnored with 8 table-driven cases covering empty patterns, exact match, case-insensitive matching, partial substrings, multi-pattern scenarios - Add TestCheckLinks_IgnorePatterns covering ignored URLs skipped entirely, non-ignored URLs still checked, mixed scenarios, and case-insensitive matching - Add TestRunLinkChecks_WithIgnorePatterns and TestRunAllChecks_WithIgnoreLinks in the orchestrate package to verify patterns propagate end-to-end
Replace the ad-hoc writeSkillFile helper with the existing invalid-skill fixture (which has httpstat.us/404 as its only HTTP link). Tests now follow the same pattern as the rest of the orchestrate package. Add --ignore-link integration tests to TestSliceFlags covering the repeated flag form, the comma-separated form, and the flag on the check subcommand, all consistent with how --only/--skip/--allow-dirs are tested.
dacharyc
left a comment
There was a problem hiding this comment.
Thanks for this, the implementation is clean and the test coverage is solid! A few things I'd like to address before merge:
Bug: empty pattern matches everything
In Go, strings.Contains(s, "") returns true for any s. This means an accidental empty pattern silently disables all link validation:
# All of these skip every link with no warning:
skill-validator validate links --ignore-link= .
skill-validator validate links --ignore-link=,github.com/myorg .
skill-validator validate links --ignore-link="github.com/myorg," .isIgnored should skip empty strings in the patterns slice. Would also be good to add a test case for this.
Visibility into skipped links
Right now ignored links are silently dropped. If someone misconfigures a pattern that's too broad (e.g. --ignore-link=github.com instead of --ignore-link=github.com/myorg), they'd have no way to know they're accidentally skipping legitimate links.
We do something similar for 403 responses at links/check.go:122, where we emit an info-level result rather than silently swallowing the ambiguity. Could we do the same here? Something like an rctx.Infof("%s (skipped — matches ignore pattern)", url) per link, or a summary count. Either way, the user should see what was skipped.
Substring matching breadth
Minor, but worth noting: --ignore-link=example.com would also match notexample.com or example.com.evil.org. For the motivating use cases (org paths, localhost) this is unlikely to matter, but could we note this in the flag help text or README? Something like "the pattern is matched as a substring, not a domain boundary" so users aren't surprised.
Closes #68
Changes
Adds an
--ignore-linkflag tocheckandvalidate linksthat skips any URL containing a matching substring (case-insensitive). The flag is repeatable and also accepts comma-separated patterns, consistent with--only,--skip, and--allow-dirs.Implementation
links.Options— new struct withIgnorePatterns []stringlinks.CheckLinks— updated signature to acceptOptions; filters ignored URLs before the concurrent check looplinks.isIgnored— case-insensitive substring helperorchestrate.Options— addsLinksOpts links.Options, threaded throughRunAllChecksandRunLinkCheckscmd/check.goandcmd/validate_links.go— expose--ignore-linkflag, wire into optionsREADME.md— documents the flag in both thevalidate linksandcheckcommand sections, and in the "Link validation" what-it-checks sectionTests
links.TestIsIgnored— 8 table-driven cases: empty patterns, exact match, case-insensitive, partial substring, no-match, first/second of multiple patterns, none matchinglinks.TestCheckLinks_IgnorePatterns— 4 cases usinghttptest.NewServer: ignored URL skipped, non-ignored URL still checked, mixed body, case-insensitive matchingorchestrate.TestRunLinkChecks_WithIgnorePatterns— uses theinvalid-skillfixture (hashttps://httpstat.us/404); asserts 0 errors when ignored, errors when notorchestrate.TestRunAllChecks_WithIgnoreLinks— same fixture; assertsLinksOptspropagates throughRunAllCheckscmd.TestSliceFlags— 3 integration tests exercising the repeated flag form, comma-separated form, and the flag oncheck --only=links; all run against the compiled binary