feat(integrations): Add VAREK Guardrails + W&B pipeline verification …#620
Open
kwdoug63 wants to merge 2 commits into
…example

Adds a script-based integration example demonstrating how to use VAREK Guardrails (open-source seccomp-bpf + cgroup runtime sandbox) with W&B Runs, Artifacts, Tables, and Weave for verifiable execution of untrusted ML pipeline code.

Three demonstration scripts under examples/varek-guardrails/:
- verification_artifact.py: 5-test verification battery, logs result as a W&B Artifact for downstream consumption.
- telemetry_stream.py: bridges varek_guardrails subscribe_telemetry audit-hook events into W&B run logs, with PID-guarded callback for fork-safety.
- benchmark_suite.py: 8-payload regression suite (benign, malicious, resource, edge, known_allowance) with W&B Table for sortable comparison of expectation vs. observed outcome.

Tested end-to-end against varek_guardrails 1.1.1, wandb 0.26.1, weave 0.50+ on Ubuntu 24.04 (cgroups v2). Live runs linked in PR description.
Removes em-dashes, smart quotes from .py and .sh files to clear GitHub's bidirectional-Unicode warning banner on the PR. README content unchanged.
What this example demonstrates
This example shows how to verify that AI/ML pipeline code is being executed under deterministic kernel-level isolation (seccomp-bpf + cgroups + wallclock), with all results logged to W&B for audit and reproducibility.
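Of the three isolation layers named above, the wallclock limit is the simplest to illustrate in isolation. The following is a minimal standard-library sketch of a wallclock killer around `subprocess.Popen`; it is not VAREK Guardrails' implementation, and the helper name `run_with_wallclock` is hypothetical.

```python
import subprocess
import sys


def run_with_wallclock(argv, limit_s):
    """Run argv and kill it if it exceeds limit_s seconds of wallclock time.

    Returns (returncode, timed_out). Hypothetical helper for illustration;
    it applies no seccomp filter and no cgroup limits.
    """
    proc = subprocess.Popen(argv)
    try:
        # wait() raises TimeoutExpired if the child is still running.
        return proc.wait(timeout=limit_s), False
    except subprocess.TimeoutExpired:
        proc.kill()   # SIGKILL on POSIX
        proc.wait()   # reap the child so it does not linger as a zombie
        return proc.returncode, True


if __name__ == "__main__":
    sleeper = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_wallclock(sleeper, 0.5))
```

A wallclock limit only bounds elapsed time; memory, CPU, and PID limits still need the cgroup layer, and syscall restrictions still need the seccomp filter.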
VAREK Guardrails is an open-source runtime sandbox that wraps subprocess.Popen with a seccomp-bpf filter, a cgroups v2 subtree (memory/CPU/PIDs), and a wallclock killer. This integration demonstrates three patterns that map directly to W&B-supported workflows.
What's included
examples/varek-guardrails/
├── README.md
├── requirements.txt
├── verification_artifact.py # 5-test verification battery → W&B Artifact
├── telemetry_stream.py # subscribe_telemetry → W&B run logs
├── benchmark_suite.py # 8-payload regression suite → W&B Table
└── scripts/
└── with-cgroup.sh # idempotent cgroup subtree wrapper
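As a rough illustration of the comparison that benchmark_suite.py logs, the sketch below builds expected-versus-observed rows and hands them to `wandb.Table`. The `PayloadResult` type, the column names, the `"allowed"`/`"blocked"` labels, and the project name are assumptions made for this sketch, not the script's actual schema.

```python
from dataclasses import dataclass

# Column names are illustrative, not taken from benchmark_suite.py.
COLUMNS = ["payload", "category", "expected", "observed", "match"]


@dataclass
class PayloadResult:
    name: str
    category: str   # e.g. benign, malicious, resource, edge, known_allowance
    expected: str   # assumed labels: "allowed" or "blocked"
    observed: str


def build_rows(results):
    """Turn results into table rows with a boolean expected == observed column."""
    return [
        [r.name, r.category, r.expected, r.observed, r.expected == r.observed]
        for r in results
    ]


def regressions(rows):
    """Names of payloads whose observed outcome differs from the expectation."""
    return [row[0] for row in rows if not row[-1]]


def log_table(rows, project="varek-guardrails-demo"):
    """Log the rows as a W&B Table (project name is a placeholder)."""
    import wandb  # imported lazily so the pure helpers work without wandb

    run = wandb.init(project=project, job_type="benchmark")
    run.log({"benchmark": wandb.Table(columns=COLUMNS, data=rows)})
    run.finish()
```

Keeping `match` as its own column means the table can be sorted or filtered on mismatches in the W&B UI, and `regressions(rows)` gives a script-side check that can fail a CI job.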
Why W&B users benefit
For teams running untrusted-code execution (LLM agents, eval harnesses, code-as-data pipelines), this provides a verifiable audit trail: every claim about isolation is backed by a logged run, and the runs are reproducible by any reviewer with the same versions installed. The Artifact pattern lets downstream stages depend on the verification result; the Table pattern catches policy regressions automatically.
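The Artifact pattern described above can be sketched as follows: one stage writes a verification result to JSON and logs it as a W&B Artifact, and a later stage refuses to proceed unless the result passed. The JSON layout, the artifact name `sandbox-verification`, the check names, and the project name are assumptions for illustration and may differ from what verification_artifact.py produces.

```python
import json
from pathlib import Path


def write_verification_result(checks, path):
    """Write {check_name: bool} results to JSON; 'passed' requires every check."""
    result = {"checks": checks, "passed": bool(checks) and all(checks.values())}
    Path(path).write_text(json.dumps(result, indent=2))
    return result


def log_result_artifact(path, project="varek-guardrails-demo"):
    """Log the JSON file as a W&B Artifact (names are placeholders)."""
    import wandb  # imported lazily so the pure helpers work without wandb

    run = wandb.init(project=project, job_type="verification")
    artifact = wandb.Artifact(name="sandbox-verification", type="verification")
    artifact.add_file(str(path))
    run.log_artifact(artifact)
    run.finish()


def require_verified(path):
    """Downstream gate: raise if the recorded verification did not pass."""
    result = json.loads(Path(path).read_text())
    if not result["passed"]:
        failed = sorted(name for name, ok in result["checks"].items() if not ok)
        raise RuntimeError(f"sandbox verification failed: {failed}")
```

A downstream stage would typically fetch the file with `run.use_artifact("sandbox-verification:latest").download()` and then call `require_verified` on the downloaded JSON before running any untrusted code.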
Tested against
- varek-guardrails 1.1.1
- wandb 0.26.1
- weave 0.50+
Live verification runs
End-to-end runs from the development environment (sober-agents entity):
- verification_artifact.py
- telemetry_stream.py
- benchmark_suite.py
Notes for reviewers
- The cgroup wrapper (scripts/with-cgroup.sh) is required because systemd's Delegate= directive does not apply to .slice units, so cgroup subtree controls must be set explicitly before invocation.
- The benchmark suite includes a known_allowance category for os.system passthrough, preserved deliberately as a regression test against future policy tightening in varek_guardrails.
- The telemetry callback uses a queue.Queue buffer to remain safe in post-fork-pre-exec contexts (audit hooks fire in forked children; calling wandb.log directly there breaks the seccomp arming).
- The directory location (examples/varek-guardrails/) is a suggestion; happy to move under a different parent (e.g. examples/security/ or examples/integrations/) based on maintainer preference.

Happy to iterate on naming, scope, or any of the patterns.
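For reviewers who want the fork-safety note in concrete form, here is a minimal sketch of a PID-guarded callback backed by a `queue.Queue`. The class name `TelemetryBuffer` and the event shape are hypothetical, and how the callback is registered with `subscribe_telemetry` is not shown because its exact signature is not stated in this description.

```python
import os
import queue


class TelemetryBuffer:
    """Buffer telemetry events in the parent; do nothing in forked children."""

    def __init__(self, maxsize=10000):
        self._owner_pid = os.getpid()           # PID of the process that owns the W&B run
        self._events = queue.Queue(maxsize=maxsize)

    def callback(self, event):
        # PID guard: a forked child inherits this object, but must not
        # touch the queue or W&B between fork and exec.
        if os.getpid() != self._owner_pid:
            return
        try:
            self._events.put_nowait(event)      # never block inside a hook
        except queue.Full:
            pass                                # drop the event rather than stall

    def drain(self):
        """Return all buffered events; call this from the parent only."""
        drained = []
        while True:
            try:
                drained.append(self._events.get_nowait())
            except queue.Empty:
                return drained
```

The parent process would periodically call `drain()` and pass each event to `wandb.log`, so the only code that ever runs in a forked child is the PID comparison and an early return.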