Fix op-geth bootnode peering #396

Open
canercidam wants to merge 3 commits into improve-discovery-with-bootnode from caner/op-geth-bootnode-peering

Conversation

@canercidam

No description provided.

@claude

claude bot commented Mar 6, 2026

Claude finished @canercidam's task in 1m 27s


PR Review: Fix op-geth bootnode peering

  • Gather context and read changed files
  • Analyze changes and provide review

Summary

This PR re-enables bootnode peering for op-geth by uncommenting the trustedPeers logic and working around geth's restriction on hostnames in bootnode URLs (by resolving host.docker.internal from /etc/hosts at runtime via awk).

Issues Found

  1. Bug: Missing --nodiscover fallback — The --nodiscover flag was removed unconditionally, but there's no else branch to add it back when ctx.Bootnode == nil. Every other component in this file (op-reth, reth, op-rbuilder, flashblocks-rpc) adds --disable-discovery when no bootnode is configured. Without this, op-geth will attempt peer discovery in all cases, which is likely unintended. See inline comment with suggested fix.

  2. Fragility of host resolution approach — The runtime awk parsing of /etc/hosts for host.docker.internal is not guaranteed to work on all Docker environments (particularly Linux Docker Engine without extra_hosts). If the entry is absent, it silently produces a malformed enode URL. See inline comment for details.

  3. Hardcoded port 30303 — Already noted with a TODO, but worth flagging as a review concern since the bootnode port is dynamically assigned elsewhere.

Comment on lines +382 to +386
var trustedPeers string
if ctx.Bootnode != nil {
// TODO: Figure out the port dynamically.
trustedPeers = fmt.Sprintf("--bootnodes enode://%s@$(awk '/host.docker.internal/ {print $1; exit}' /etc/hosts):30303 ", ctx.Bootnode.ID)
}

Bug: Missing --nodiscover when no bootnode is configured.

When ctx.Bootnode == nil, trustedPeers is empty and --nodiscover was removed unconditionally. This means op-geth will attempt peer discovery even without a bootnode, which is inconsistent with every other component in this file (op-reth, reth, op-rbuilder, flashblocks-rpc all add --disable-discovery in the else branch).

Suggested change
var trustedPeers string
if ctx.Bootnode != nil {
// TODO: Figure out the port dynamically.
trustedPeers = fmt.Sprintf("--bootnodes enode://%s@$(awk '/host.docker.internal/ {print $1; exit}' /etc/hosts):30303 ", ctx.Bootnode.ID)
} else {
trustedPeers = "--nodiscover "
}

var trustedPeers string
if ctx.Bootnode != nil {
// TODO: Figure out the port dynamically.
trustedPeers = fmt.Sprintf("--bootnodes enode://%s@$(awk '/host.docker.internal/ {print $1; exit}' /etc/hosts):30303 ", ctx.Bootnode.ID)

Fragile host resolution and hardcoded port.

A few concerns with this approach:

  1. Hardcoded port 30303: The TODO acknowledges this but it's worth noting that if the bootnode's port assignment changes, this will silently break peering.

  2. host.docker.internal via /etc/hosts parsing: This awk command assumes host.docker.internal is in /etc/hosts (Docker Desktop adds it, but on Linux Docker Engine it may not be present without extra_hosts in docker-compose). If the entry is missing, awk silently returns empty, producing a malformed enode URL (enode://<id>@:30303) which could cause op-geth to fail or behave unexpectedly at startup.

  3. Why not use the hostname directly? Other components use ctx.Bootnode.Connect() which resolves via Docker DNS. The original TODO says geth doesn't allow hostnames in bootnode config — could this be worked around with --netrestrict or by using Docker's extra_hosts to ensure a stable IP mapping, rather than runtime shell parsing?
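The two failure shapes described above (an empty host when the lookup returns nothing, and a hostname where an IP is required) can be illustrated with a small check. This is a minimal sketch, not geth's own parser: `checkEnodeHost` is a hypothetical helper, and the rule that the host must be an IP literal is taken from the review's statement that geth rejects hostnames. In this PR the IP is resolved by the shell at container start, so an equivalent guard would have to run there; the Go version only makes the condition concrete.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// checkEnodeHost is a hypothetical pre-flight check (an assumption for
// illustration, not part of geth or this repository). It rejects enode URLs
// whose host part is empty or is not an IP literal.
func checkEnodeHost(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("parse %q: %w", raw, err)
	}
	if u.Scheme != "enode" {
		return fmt.Errorf("unexpected scheme %q", u.Scheme)
	}
	host := u.Hostname()
	if host == "" {
		// This is the enode://<id>@:30303 case produced by an empty lookup.
		return fmt.Errorf("empty host in %q", raw)
	}
	if net.ParseIP(host) == nil {
		// A hostname such as host.docker.internal ends up here.
		return fmt.Errorf("host %q is not an IP literal", host)
	}
	return nil
}
```

Calling `checkEnodeHost("enode://abc123@:30303")` returns an error for the empty host, while an address such as `enode://abc123@172.17.0.2:30303` passes. The node ID and IP here are placeholders.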

@claude

claude bot commented Mar 6, 2026

Claude finished @canercidam's task in 2m 26s


PR Review: Fix op-geth bootnode peering

  • Gather context and review latest changes
  • Analyze code and provide feedback

Summary

The latest push (4f160da) addresses the missing --nodiscover fallback from the previous review — that issue is now fixed.

Remaining Issue

Hardcoded port 30303 is a bug, not just a TODO. The bootnode's host port is dynamically allocated by reservePort() and can differ from the default 30303 due to portOffset or port conflicts. Since the enode URL targets host.docker.internal (the Docker host IP), it needs the host port — the one Docker maps back to the bootnode container. Using the container-internal default will silently fail to connect when the host port differs.

See inline comment for details and suggestions.

Other components solve this by using ctx.Bootnode.Connect() → the {{Service ...}} template → resolveAddr() in local_runner.go, which picks the correct port automatically. Since geth rejects hostnames in enode URLs, this PR bypasses the template system, but the port still needs to come from the template resolution. Possible approaches:

  • Use the existing {{Bootnode}} template function with shell-based DNS-to-IP resolution (op-geth already uses /bin/sh entrypoint)
  • Add a template helper that emits the bootnode IP and resolved host port directly

var trustedPeers string
if ctx.Bootnode != nil {
// TODO: Figure out the port dynamically.
trustedPeers = fmt.Sprintf("--bootnodes enode://%s@$(awk '/host.docker.internal/ {print $1; exit}' /etc/hosts):30303 ", ctx.Bootnode.ID)

Bug: Hardcoded port 30303 won't match the actual host port.

The bootnode's host port is dynamically allocated via reservePort and may differ from the container-internal default 30303 (e.g., due to portOffset or port conflicts). Since this enode URL targets host.docker.internal (the Docker host IP), it needs the host port, not the container port — the host port is what's mapped back to the bootnode container.

Other components avoid this by using ctx.Bootnode.Connect() which resolves through the template system ({{Service ...}}) and correctly picks the right port. Since geth can't accept hostnames in enode URLs, you're bypassing the template system, but the port still needs to be dynamic.

Consider using the existing {{Bootnode}} template function (local_runner.go:506) combined with shell-based hostname-to-IP resolution, since op-geth already uses /bin/sh as entrypoint. Or add a new template helper that emits just the bootnode IP and host port.

@claude

claude bot commented Mar 6, 2026

Claude finished @dvush's task in 1m 52s


PR Review: Fix op-geth bootnode peering

  • Gather context and review latest changes
  • Analyze code and provide feedback

Summary

The latest push (03b4590) switches from /etc/hosts parsing to getent hosts bootnode for resolving the bootnode IP, fixes the --nodiscover fallback, and adds --discovery.v4 when a bootnode is configured. This is a solid improvement over the previous commented-out code.

Analysis

What works well: Using getent hosts bootnode to resolve Docker DNS to an IP is a reasonable workaround for geth's restriction on hostnames in enode URLs. The else branch correctly restores --nodiscover when no bootnode is configured.

Remaining concern

Hardcoded port 30303 — correct today, fragile long-term. Since both op-geth and bootnode run inside Docker, the container port (not the host port) is used, and the bootnode's default container port is indeed 30303 (components.go:1085). So this works. However, it creates a hidden coupling — if the bootnode's default port ever changes, this will silently break. See inline comment for a possible approach using the {{Bootnode}} template with shell-based hostname resolution to keep the port dynamic.

Note: Previous review comments about the host port diverging were overstated for this specific case — both services are in Docker, so the container port (always 30303) is used, not the dynamically allocated host port.

var trustedPeers string
if ctx.Bootnode != nil {
// TODO: Figure out the port dynamically.
trustedPeers = fmt.Sprintf("--bootnodes enode://%s@$(getent hosts bootnode | awk '{print $1}'):30303 --discovery.v4 ", ctx.Bootnode.ID)

Nit: Hardcoded 30303 is correct today but fragile.

Since both op-geth and bootnode are Docker services, the container port is used (not the host port), and the bootnode's default is 30303 (components.go:1085). So this works. However, if someone changes the bootnode's default port, this will silently break.

Consider extracting the bootnode default port as a constant shared between the bootnode component and this line, or using the {{Bootnode}} template and performing hostname-to-IP resolution in the shell (since /bin/sh is already the entrypoint):

// Example: resolve the hostname from the {{Bootnode}} template output at runtime
trustedPeers = fmt.Sprintf("--bootnodes $(echo '%s' | sed 's/@[^:]*/@'$(getent hosts bootnode | awk '{print $1}')'/' ) --discovery.v4 ", ctx.Bootnode.Connect())

This way the port comes from the template system automatically.
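The other option mentioned above, a constant shared between the bootnode component and this line, might look roughly like the following. This is a sketch under assumptions: `bootnodeP2PPort` and `opGethPeerFlags` are hypothetical names, and an empty `bootnodeID` stands in for `ctx.Bootnode == nil`. Only the value 30303 and the flag strings come from the PR.

```go
package main

import "fmt"

// bootnodeP2PPort is a hypothetical shared constant. The review only states
// that the bootnode's default container port is 30303; the name is assumed.
const bootnodeP2PPort = 30303

// opGethPeerFlags builds the peering flags for op-geth. An empty bootnodeID
// stands in for ctx.Bootnode == nil in the real component code.
func opGethPeerFlags(bootnodeID string) string {
	if bootnodeID == "" {
		// No bootnode configured: keep discovery off, as before the PR.
		return "--nodiscover "
	}
	// The IP is still resolved by the shell at container start; only the
	// port now comes from the shared constant instead of a literal.
	return fmt.Sprintf(
		"--bootnodes enode://%s@$(getent hosts bootnode | awk '{print $1}'):%d --discovery.v4 ",
		bootnodeID, bootnodeP2PPort,
	)
}
```

If the bootnode component used the same constant for its listen port, changing the default in one place would update both sides and remove the hidden coupling.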
