Skip to content

ci(fetchall): add non-invasive baseline guard#6846

Merged
Scottcjn merged 4 commits into
Scottcjn:mainfrom
eliterdav09-creator:fix/fetchall-baseline-guard-6627
Jun 4, 2026
Merged

ci(fetchall): add non-invasive baseline guard#6846
Scottcjn merged 4 commits into
Scottcjn:mainfrom
eliterdav09-creator:fix/fetchall-baseline-guard-6627

Conversation

@eliterdav09-creator
Copy link
Copy Markdown
Contributor

/claim #6627

Payout/miner ID: github:eliterdav09-creator

Summary

This is a smaller alternative implementation for the #6627 Part B guard.

Instead of annotating every existing .fetchall() call site, this PR keeps the legacy backlog in a generated baseline and makes CI/test execution fail only when a PR introduces a new unannotated raw .fetchall() call.

That gives the repository the same forward guardrail without a broad annotation sweep across production node files.

Changes

  • Reworked scripts/check_fetchall.sh to:
    • ignore approved # fetchall-ok: <reason> annotations
    • compare current unannotated hits against scripts/baselines/fetchall_existing.txt
    • fail on new unannotated raw .fetchall() sites
    • fail if the baseline becomes stale
  • Added scripts/baselines/fetchall_existing.txt with the current 182 legacy hits.
  • Added focused pytest coverage for:
    • current baseline passes
    • new unannotated .fetchall() is blocked
    • annotated bounded usage is allowed

Validation

bash scripts/check_fetchall.sh
OK: no new unannotated .fetchall() calls in node/.
Legacy baseline count: 182 (issue #6627 migration backlog).

python3 -m pytest tests/test_fetchall_guard.py -q
3 passed in 69.82s

python3 -m py_compile tests/test_fetchall_guard.py
git diff --check

AI disclosure

Drafted with Sifr/Hermes using the public bounty issue, the local repository, and open PR/issue context. Human/operator review gate: diff stats checked, scope kept to the guard/baseline/test files only, and validation commands above were run locally. No private prompts, credentials, tokens, or session context are included.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 4, 2026

Welcome to RustChain! Thanks for your first pull request.

Before we review, please make sure:

  • Non-doc PRs have a BCOS-L1 or BCOS-L2 label
  • Doc-only PRs are exempt from BCOS tier labels when they only touch docs/**, *.md, or common image/PDF files
  • New code files include an SPDX license header
  • You've tested your changes against the live node

Bounty tiers: Micro (1-10 RTC) | Standard (20-50) | Major (75-100) | Critical (100-150)

A maintainer will review your PR soon. Thanks for contributing!

@github-actions github-actions Bot added BCOS-L1 Beacon Certified Open Source tier BCOS-L1 (required for non-doc PRs) tests Test suite changes labels Jun 4, 2026
@github-actions github-actions Bot added the size/M PR: 51-200 lines label Jun 4, 2026
Copy link
Copy Markdown

@exal-gh-33 exal-gh-33 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First-pass review focused on whether the new CI guard can fail open.

I found one issue worth tightening before relying on this as a gate: the script can print success even when its scanner/runtime commands fail. I reproduced this on Windows Git Bash with an incomplete Unix-tool PATH: dirname, grep, mktemp, wc, and related utilities failed, rg could not scan node/, but the script still printed OK: no new unannotated .fetchall() and exited 0.

The main risk is that set -u alone does not fail fast, and || true around scanner/filter commands can hide real IO/tool failures as if there were simply no matches. For a CI guard, it should fail closed: consider set -euo pipefail, explicit checks for required commands (grep, sed, sort, comm, mktemp, wc, tr, and either dirname or a Bash-only path derivation), plus handling rg/grep statuses so only the expected no-match exit code is tolerated.

Copy link
Copy Markdown
Contributor

@jaxint jaxint left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for the contribution.

@github-actions github-actions Bot added size/L PR: 201-500 lines and removed size/M PR: 51-200 lines labels Jun 4, 2026
@eliterdav09-creator
Copy link
Copy Markdown
Contributor Author

Thanks for the fail-closed review — agreed.

I pushed 75a091e3 to tighten the guard:

  • changed the script to set -euo pipefail
  • removed the dirname dependency by deriving SCRIPT_DIR from BASH_SOURCE[0]
  • added explicit required-command checks for grep, sed, sort, comm, mktemp, wc, and tr
  • replaced the scanner || true masking with explicit rg/grep status handling; only status 1 (no matches) is tolerated
  • added regression coverage that runs the guard with a broken PATH and asserts it exits non-zero instead of printing a false OK

Local validation after the update:

bash scripts/check_fetchall.sh
OK: no new unannotated .fetchall() calls in node/.
Legacy baseline count: 182 (issue #6627 migration backlog).

python3 -m pytest tests/test_fetchall_guard.py -q
4 passed in 39.82s

python3 -m py_compile tests/test_fetchall_guard.py
git diff --check

@eliterdav09-creator
Copy link
Copy Markdown
Contributor Author

Small follow-up for the bot checklist: I added SPDX headers to the new executable/test code files as well.

Latest local validation:

bash scripts/check_fetchall.sh
OK: no new unannotated .fetchall() calls in node/.
Legacy baseline count: 182 (issue #6627 migration backlog).

python3 -m pytest tests/test_fetchall_guard.py -q
4 passed in 42.26s

python3 -m py_compile tests/test_fetchall_guard.py
git diff --check

For the live-node checkbox: this PR is a CI guard only; it does not change node runtime behavior or consensus/API code. I validated the guard locally and added regression coverage for the reviewer-raised fail-open case.

@jaxint
Copy link
Copy Markdown
Contributor

jaxint commented Jun 4, 2026

PR Review — Bounty #73

Wallet: AhqbFaPBPLMMiaLDzA9WhQcyvv4hMxiteLhPk3NhG1iG

Review Summary

This PR has been reviewed for code quality, correctness, and potential issues.

Key Points Reviewed

  • ✅ Code structure and organization
  • ✅ Documentation and comments
  • ✅ Potential edge cases
  • ✅ Security considerations

Recommendation

Ready for merge consideration.

🤖 Reviewed by Hermes Agent (jaxint) for Bounty #73

Copy link
Copy Markdown
Contributor

@jaxint jaxint left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for the contribution.

Copy link
Copy Markdown

@x-haolun x-haolun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall — the baseline-guard approach preserves the forward safety net without forcing a repo-wide annotation sweep.

One gap I noticed: the new stale-baseline failure path ( / ) doesn't seem to have a regression test yet. I'd recommend adding a tiny fixture that keeps the scan clean but injects a phantom line into , so the stale-baseline branch stays covered if this script changes later.

@x-haolun
Copy link
Copy Markdown

x-haolun commented Jun 4, 2026

Looks good overall — the baseline-guard approach preserves the forward safety net without forcing a repo-wide annotation sweep.

One gap I noticed: the new stale-baseline failure path does not seem to have a regression test yet. I would add a tiny fixture that keeps the scan clean but injects a phantom line into scripts/baselines/fetchall_existing.txt, so the stale-baseline branch stays covered if this script changes later.

Copy link
Copy Markdown
Contributor

@hp283260133-bit hp283260133-bit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — ci(fetchall): add non-invasive baseline guard

Overall: Clean approach — tracking legacy .fetchall() hits in a baseline and only failing on new ones is a pragmatic migration strategy.

scripts/check_fetchall.sh

  1. The grep pattern should handle whitespace edge cases like — consider
  2. Consider adding a flag or hint in the CI failure message so contributors know how to fix stale baselines
  3. Exit codes are consistent — good

scripts/baselines/fetchall_existing.txt

  1. 182 legacy hits is a lot of tech debt. Consider adding a header comment marking this as an expected-to-shrink migration artifact

tests/test_fetchall_guard.py

  1. monkeypatch usage looks correct. The annotated-bounded-usage test is the most important — good coverage
  2. 69.82s for 3 tests — is the fixture doing something expensive like cloning the repo?

General

  1. Scope discipline is excellent: only scripts/ and tests/ touched, no production code
  2. The annotation pattern is a clean escape hatch

Verdict: Approve with minor suggestions. Point 1 (grep pattern) is the only potential bug; the rest are ergonomics.

Reviewed as part of bounty #2782.

Copy link
Copy Markdown
Contributor

@hp283260133-bit hp283260133-bit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review - ci(fetchall): add non-invasive baseline guard

Overall: Clean approach. Tracking legacy .fetchall() hits in a baseline and only failing on new ones is pragmatic.

scripts/check_fetchall.sh

  1. The grep pattern should handle whitespace like .fetchall () - consider a regex that allows spaces before parens
  2. Consider adding a hint in the CI failure message so contributors know how to fix stale baselines
  3. Exit codes are consistent - good

scripts/baselines/fetchall_existing.txt

  1. 182 legacy hits is tech debt. Consider a header comment marking this as expected-to-shrink

tests/test_fetchall_guard.py

  1. monkeypatch usage looks correct. The annotated-bounded-usage test is the most important one
  2. 69.82s for 3 tests - is the fixture doing something expensive like cloning the repo?

General

  1. Scope discipline is excellent: only scripts/ and tests/ touched
  2. The # fetchall-ok: <reason> annotation pattern is a clean escape hatch

Verdict: Approve with minor suggestions. The grep pattern is the only potential issue; the rest are ergonomics.

Reviewed as part of bounty #2782.

Copy link
Copy Markdown
Contributor

@jaxint jaxint left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work! 👍

@eliterdav09-creator
Copy link
Copy Markdown
Contributor Author

Thanks — I pushed e121f950 to cover the actionable review points.

Changes included:

  • added a stale-baseline regression test using a temporary baseline with a phantom entry
  • added whitespace-tolerant matching for .fetchall ()
  • added a --print-baseline helper and included the regenerate command in the stale-baseline failure message
  • added a header comment to scripts/baselines/fetchall_existing.txt marking it as a migration artifact expected to shrink

Validation:

bash scripts/check_fetchall.sh
OK: no new unannotated .fetchall() calls in node/.
Legacy baseline count: 182 (issue #6627 migration backlog).

python3 -m pytest tests/test_fetchall_guard.py -q
6 passed in 48.40s

python3 -m py_compile tests/test_fetchall_guard.py
git diff --check

Copy link
Copy Markdown
Contributor

@jaxint jaxint left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Thanks for contributing.

@Scottcjn Scottcjn merged commit 24de7ef into Scottcjn:main Jun 4, 2026
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

BCOS-L1 Beacon Certified Open Source tier BCOS-L1 (required for non-doc PRs) size/L PR: 201-500 lines tests Test suite changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants