Add UID-based nftables firewall for NATS and monit connections #399
rkoster wants to merge 7 commits into cloudfoundry:main
Implement a firewall mechanism that restricts NATS (mbus) connections to the bosh-agent process only, using UID-based filtering with nftables.

Key changes:
- Add platform/firewall package with Manager and NatsFirewallHook interfaces
- Implement NftablesFirewall that creates UID-based egress rules
- Add GetNatsFirewallHook() to Platform interface
- Integrate BeforeConnect hook in nats_handler.go for connection/reconnection
- Support DNS re-resolution on reconnect for HA failover scenarios
- Add stub implementations for Windows and dummy platforms

The firewall rules allow only the agent's UID to connect to NATS/director ports while blocking other processes, improving security posture.
Add comprehensive unit tests for the new firewall functionality:

platform/firewall tests (23 tests):
- SetupMonitFirewall: table/chain/rule creation, error handling
- SetupNATSFirewall: IPv4/IPv6, DNS resolution, https/empty URL handling
- BeforeConnect: delegation to SetupNATSFirewall
- Cleanup: table deletion and error handling

mbus/nats_handler tests (4 new tests):
- Firewall hook is called on Start
- BeforeConnect receives correct mbus URL
- Handler still starts when hook returns nil
- Warning logged but no failure when BeforeConnect errors

Also:
- Add DNSResolver interface for testable DNS resolution
- Inject resolver dependency via NewNftablesFirewallWithDeps
- Configure test logging to use GinkgoWriter for visibility
- Fix ST1023 linter error: omit type from var declaration
- Add linux_header.txt for counterfeiter to add build tags to Linux-only fakes
- Regenerate fake_nftables_conn.go and fake_dnsresolver.go with //go:build linux tag
- This fixes macOS/Windows build failures due to google/nftables being Linux-only
I'm a little worried by the general approach of teaching the agent about OS-version-specific things. Specifically, I worry that it will (further?) violate layering by pushing version-specific customisation from the stemcell into the agent. It's probably OK here, since this somewhat sits between agent setup and stemcell config, but I wanted to at least mention it even if nothing comes of it.
- Fix nil pointer dereference in DisconnectErrHandler when err is nil
- Remove iptables-based SetupNatsFirewall code (replaced by nftables)
- Remove unused Cleanup() method from firewall interface
- Move firewall initialization from lazy getter to explicit SetupFirewall()
- Add comment explaining IPv6 loopback is intentionally not protected (monit only binds to 127.0.0.1:2822)
There is currently nothing OS-specific about this feature, because it works on both noble and jammy. So this is an effort to simplify and centralise all the different NATS and monit firewall code paths into the agent, where they can be tested more easily than in the stemcell builder.
The nftables library batches operations until Flush() is called, so AddTable/AddChain/AddRule never return errors. Remove the misleading error return types from these internal helper methods.
Implement separate chains for agent-managed and job-managed monit access rules:
- monit_access_jobs: Regular chain for job rules (never flushed by agent)
- monit_access: Base chain that jumps to jobs chain first, then applies agent rules

This allows BOSH jobs to add their own monit access rules via pre-start scripts that persist across agent restarts, while ensuring agent rules are always up-to-date by flushing and recreating them on each setup call.
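The flush-and-recreate behaviour of the two chains can be modelled in memory as below. This is a toy sketch, not the PR's implementation: ruleset and setupMonitChains are hypothetical, the rule strings are only nft-style illustrations, and the agent UID and the accept/drop pair are assumptions. The chain names and the 127.0.0.1:2822 monit address come from the commit messages.

```go
package main

import "fmt"

// ruleset is a toy in-memory stand-in for the bosh_agent table,
// mapping chain name to an ordered list of rules.
type ruleset map[string][]string

const (
	agentChain = "monit_access"      // base chain, rebuilt on every setup
	jobsChain  = "monit_access_jobs" // regular chain, owned by jobs
)

// setupMonitChains (hypothetical) creates the jobs chain only if it is
// missing, then replaces the agent chain wholesale so stale rules vanish.
func setupMonitChains(rs ruleset, agentUID int) {
	if _, ok := rs[jobsChain]; !ok {
		rs[jobsChain] = []string{}
	}
	rs[agentChain] = []string{
		"jump " + jobsChain, // job rules are evaluated first
		fmt.Sprintf("ip daddr 127.0.0.1 tcp dport 2822 meta skuid %d accept", agentUID),
		"ip daddr 127.0.0.1 tcp dport 2822 drop",
	}
}

func main() {
	rs := ruleset{}
	setupMonitChains(rs, 0) // assumes the agent runs as UID 0

	// A job's pre-start script adds its own rule (UID 1000 is made up).
	rs[jobsChain] = append(rs[jobsChain], "ip daddr 127.0.0.1 tcp dport 2822 meta skuid 1000 accept")

	// Agent restart: agent rules are rebuilt, the job rule survives.
	setupMonitChains(rs, 0)
	for name, rules := range rs {
		fmt.Println(name, rules)
	}
}
```

Keeping job rules in a separate regular chain means the agent can rebuild its own chain without needing to know which rules jobs have added.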
Move linux_header.txt to firewallfakes/linux_build_constraint.txt to make it clear that this file contains a Go build constraint for counterfeiter-generated fakes, not a C header.
Remove the cgroup v1 net_cls-based monit API access control mechanism including the monit wrapper script, helper functions, and iptables rules. The monit binary now runs directly without a wrapper. Access control will be managed by the bosh-agent's internal firewall implementation. Related to cloudfoundry/bosh-agent#399
Stop sourcing monit-access-helper.sh and calling permit_monit_access when starting the bosh-agent. The agent will manage its own firewall access internally instead of using the cgroup-based helper. This completes the removal of the permit_monit_access functionality now that pxc-release (the only consumer) no longer uses it. Related to cloudfoundry/bosh-agent#399 Related to cloudfoundry/pxc-release#97
Remove the static nftables-based monit API access control mechanism. The monit service now runs without firewall restrictions. Access control will be managed by the bosh-agent's internal firewall implementation. Related to cloudfoundry/bosh-agent#399
6e235a0 to a0f7661
This PR should not be merged before cloudfoundry/pxc-release#97 has been released.
Summary
This PR implements a UID-based nftables firewall to protect NATS (mbus) and monit connections, replacing the previous cgroup-based iptables approach.
Motivation
The existing cgroup-based firewall approach has limitations in nested container environments (like Garden containers running on BOSH VMs). In these environments:
The UID-based approach solves this by using nftables with meta skuid matching, which works reliably regardless of container nesting.
Changes
New platform/firewall Package
- firewall.go: Defines Manager and NatsFirewallHook interfaces
- nftables_firewall.go: Linux implementation using github.com/google/nftables (netlink-based, no CLI required)
- nftables_firewall_other.go: Stub for non-Linux platforms
Firewall Rules
Creates an nftables table bosh_agent with two output chains:
- monit_access
- nats_access
Platform Integration
- Add GetNatsFirewallHook() to Platform interface
- LinuxPlatform initializes firewall and implements the hook
NATS Handler Integration
- nats_handler.go calls the BeforeConnect hook before the NATS connection and on each reconnection
Testing
Technical Details
- Uses the github.com/google/nftables library, which communicates via netlink (no nft CLI needed)
Testing Performed
- ginkgo -r platform/firewall mbus
- curl 127.0.0.1:2822 hangs (blocked)
- curl 127.0.0.1:2822 returns 401 (allowed through firewall)