Local network setup for indexing payments #69
Draft
RembrandtK wants to merge 30 commits into main from
Conversation
Adds the deployment path for the GIP-0088 contract bundle (REO + IssuanceAllocator + RecurringAgreementManager):
- New graph-contracts-issuance container for the Phase 4/5 deployment, wired after graph-contracts-horizon and running the issuance package deploy sequence (REO, IA, RAM, activation).
- Rename the existing graph-contracts container to graph-contracts-horizon to distinguish it from the new issuance container. Dev-override files split correspondingly into graph-contracts-only.yaml and graph-contracts-issuance.yaml.
- Rename the Kafka topic from indexer_daily_metrics to eligibility_oracle_state to match the REO aggregator output name.
- Contract naming: the issuance deploy produces RewardsEligibilityOracleA (and B/Mock variants); consumers updated to read the A variant from issuance.json.
- Horizon compatibility: use getStake instead of hasStake in indexer-agent run.sh.
- Add an optional KAFKA_TOPIC_ENVIRONMENT env var that all producers and consumers append to their topic names (e.g. gateway_queries_local). Leave it empty for default topic names. All consumers must agree on the value; it is centralised in shared/lib.sh via a kafka_topic() helper.
- Run redpanda as root so rpk topic bootstrap operations can write to the data directory without permission errors.
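A minimal sketch of what the kafka_topic() helper in shared/lib.sh might look like — the helper name and env var come from the text above, but this body is an assumption, not the shipped implementation:

```shell
# Hypothetical sketch of shared/lib.sh's kafka_topic() helper: append
# KAFKA_TOPIC_ENVIRONMENT (when set) to a base topic name so all
# producers and consumers derive the same per-environment topic.
kafka_topic() {
  local base="$1"
  if [ -n "${KAFKA_TOPIC_ENVIRONMENT:-}" ]; then
    printf '%s_%s\n' "$base" "$KAFKA_TOPIC_ENVIRONMENT"
  else
    printf '%s\n' "$base"
  fi
}

KAFKA_TOPIC_ENVIRONMENT=local kafka_topic gateway_queries  # prints gateway_queries_local
KAFKA_TOPIC_ENVIRONMENT=      kafka_topic gateway_queries  # prints gateway_queries
```

Centralising the suffix in one helper is what guarantees the "all consumers must agree" property: every script derives the topic name the same way instead of concatenating ad hoc.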
- Enable dipper's Redpanda signal consumer: kafka.brokers, topic, and consumer_group in the generated config.json.
- Fix dipper config to resolve the recurring_collector address from the horizon address book (moved to a different JSON file layout).
- Enable the indexer-service DIPs gRPC server (listen on DIPS_PORT, expose to local-network consumers).
- Map chain ID 1337 to hardhat in dipper's additional_networks so the local hardhat chain is recognised.
- Remove docs/indexing-payments/RecurringCollectorDeployment.md — superseded by the graph-contracts-issuance container deployment flow.
GHCR packages for dipper-service and subgraph-dips-indexer-selection are not published, so point both versions at :local tags built from sibling repos. Also enable the indexing-payments profile by default on this branch.
Switches four runtime services from clone-and-build wrappers
(FROM debian:bookworm-slim + ARG *_COMMIT + cargo build) to thin
image-consumption wrappers (FROM ghcr.io/...:${VERSION}). Each wrapper
now just adds the tools run.sh needs (jq, curl, rpk) and overrides
ENTRYPOINT with the local-network run.sh.
Conversions:
- eligibility-oracle-node → ghcr.io/edgeandnode/eligibility-oracle-node:main.
Updates run.sh for the upstream config schema change
([[blockchain.contracts]]/[[blockchain.chains]] arrays, drop
staleness_threshold_secs) and the contract rename
(RewardsEligibilityOracle → RewardsEligibilityOracleA) across scripts
and docs.
- gateway → ghcr.io/edgeandnode/graph-gateway:sha-50c7081 (pinned to
upstream main HEAD; CI publishes sha-<short> tags only).
- tap-escrow-manager → ghcr.io/edgeandnode/tap-escrow-manager:sha-df659cf.
Symlinks /opt/tap-escrow-manager to /usr/local/bin so run.sh can
invoke the binary by name.
- graph-node bumped v0.37.0 → v0.42.1.
- indexer-tap-agent bumped v1.12.2 → v2.0.0.
Env var renames: *_COMMIT → *_VERSION for each converted dep.
Profile rename: rewards-eligibility → eligibility-oracle (service name
eligibility-oracle-node retained to keep the contract-vs-node
distinction visible). Env var ELIGIBILITY_ORACLE_VERSION renamed to
ELIGIBILITY_ORACLE_NODE_VERSION for the same reason.
Extend CONTRACTS_COMMIT from short sha to full 40-char sha.
Dev-override restructure: drop bundled graph-contracts.yaml (which
mixed contracts + subgraph concerns), rename
graph-contracts-only.yaml → graph-contracts-horizon.yaml, add new
network-subgraph.yaml for the subgraph-deploy override alone, rename
GRAPH_CONTRACTS_SOURCE_ROOT → NETWORK_SUBGRAPH_SOURCE_ROOT to match
what it actually points at. Add note to compose/dev/README.md that
image-tag consumption is preferred over these overrides, which are not
all recently tested.
COMPOSE_PROFILES default includes all four profiles; comment updated
to flag that indexing-payments requires GHCR auth.
Note: graphprotocol/rewards-eligibility-oracle is a *different*
Python-based project; the local-network dep is the Rust one at
edgeandnode/eligibility-oracle-node.
Sequential cast send calls with --confirmations=0 returned before the tx was visible in chain state, so the next tx was built with a stale nonce and got 'nonce too low' from the chain. Default cast send behaviour waits for the tx receipt, which serializes the approve/mint pairs correctly. Cascading effect: when start-indexing died partway through the approve+mint curation loop, allocations were never created, which starved dipper's topology fetch (empty gateway API responses interpreted as 'failed to fetch subgraphs info' and retried indefinitely).
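The fix can be sketched as follows. Everything here — function name, contract addresses, function signatures, and env vars — is illustrative, not the actual start-indexing script; the point is only that `cast send` without `--confirmations=0` waits for each receipt, serializing the nonce sequence:

```shell
# Illustrative sketch (placeholder names, not the real start-indexing loop):
# serialize each approve+mint pair by letting `cast send` block on the receipt.
approve_and_mint() {
  local deployment_id="$1" amount="$2"
  # Default behaviour: cast send returns only after the tx receipt is
  # available, so the follow-up tx is built against the updated nonce.
  # Passing --confirmations=0 here is what produced 'nonce too low'.
  cast send "$GRT_TOKEN" 'approve(address,uint256)' "$CURATION" "$amount" \
    --private-key "$DEPLOYER_KEY" --rpc-url "$RPC_URL"
  cast send "$CURATION" 'mint(bytes32,uint256,uint256)' "$deployment_id" "$amount" 0 \
    --private-key "$DEPLOYER_KEY" --rpc-url "$RPC_URL"
}
```

The alternative — keeping `--confirmations=0` and managing nonces manually with `--nonce` — is faster but reintroduces exactly the stale-nonce bookkeeping this change avoids.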
…s CI
Match graphprotocol/contracts' own CI setup action (.github/actions/setup/action.yml):
- node 22 (was 23)
- apt: libudev-dev libusb-1.0-0-dev (native deps of the hardhat-secure-accounts/ledger toolchain)
- corepack enable only; pnpm version resolved per-directory from each project's packageManager field (pnpm 10.x for Horizon, pnpm 9.0.6 for the DataEdge snapshot — corepack downloads on demand)
- pnpm install --frozen-lockfile (was --ignore-scripts; that flag was a workaround for the missing libudev, not an intentional choice)
- yarn@1.22.22 prepared just-in-time for the TAP step, not globally
Verified by building the image and running horizon Phase 1-3 plus the issuance container end-to-end against a fresh chain.
…4 services
Previously a single `graph-contracts-horizon` container ran three
independent deploys sequentially (horizon/subgraph-service, legacy TAP,
DataEdge) and a separate `graph-contracts-issuance` container duplicated
the full contracts clone+build. The split:
- `graph-contracts` — Phase 1: horizon + subgraph-service
- `graph-contracts-issuance` — GIP-0088 (REO + IA + RAM + activation)
- `graph-contracts-tap` — legacy TAP contracts (separate repo)
- `graph-contracts-data-edge` — DataEdge (older pinned contracts snapshot)
All four services share a single multi-stage Dockerfile at
containers/core/graph-contracts. `base` and `contracts-src` stages are
shared: `contracts` and `issuance` both `FROM contracts-src`, so the
graphprotocol/contracts workspace is cloned, installed, and built exactly
once instead of twice. `tap` and `data-edge` share only `base` since they
use different repos/commits. Each compose service picks its stage via
`build.target`.
Runtime dependency graph:
chain
├─► graph-contracts ─┬─► graph-contracts-issuance
│                    └─► graph-contracts-tap
└─► graph-contracts-data-edge
`graph-contracts` and `graph-contracts-data-edge` run in parallel; after
`graph-contracts` completes, `graph-contracts-issuance` and
`graph-contracts-tap` run in parallel. Previously all four deploys were
serialized inside one container.
Downstream `depends_on` updated per service:
- block-oracle → graph-contracts + graph-contracts-data-edge
- indexer-agent → graph-contracts + graph-contracts-tap
- subgraph-deploy → graph-contracts + graph-contracts-tap + graph-contracts-data-edge
- tap-aggregator → graph-contracts + graph-contracts-tap
- ready → all four contract services
Services whose contract dependency flows transitively through
subgraph-deploy or indexer-agent (gateway, indexer-service, tap-agent,
tap-escrow-manager, etc.) needed no changes.
Also renames the dev overlay `compose/dev/graph-contracts-horizon.yaml`
to `graph-contracts.yaml` and updates references in `.env`,
`compose/dev/README.md`, and `graph-contracts-issuance.yaml`.
Verified end-to-end: all four contract services deploy cleanly against a
fresh chain in the expected parallel order, and subgraph-deploy +
indexer-agent + tap-aggregator all successfully read the produced
address books (horizon.json, subgraph-service.json, tap-contracts.json,
block-oracle.json, issuance.json) and start normally.
…-contracts
DataEdge was previously cloned from an older contracts commit (bdc66135e7700e9a4dcd6a4beac585337fdb9c21) because that was the last commit where packages/data-edge built under pnpm 9 + hardhat v2 + ethers v5 with the @tenderly/hardhat-tenderly plugin. Everything else in the repo moved to pnpm 10 + ethers v6 and newer hardhat plugins, but packages/data-edge has since been migrated upstream — it now builds cleanly as part of the current CONTRACTS_COMMIT workspace, with no Tenderly plugin (eliminating a noisy 500 error we were getting on every deploy). The contract source (DataEdge.sol / EventfulDataEdge.sol) is essentially identical across the two commits — only NatSpec comments differ — so switching to the current commit deploys the same bytecode.
Consequences:
- `data-edge` stage dropped from the Dockerfile. No separate clone, no pnpm 9 corepack dance, no second contracts install.
- `graph-contracts-data-edge` compose service removed.
- `data-edge.run.sh` deleted; its logic moves into `contracts.run.sh` as a second phase that runs from /opt/contracts/packages/data-edge (already built by the shared `contracts-src` stage).
- `block-oracle.json` is now written by `graph-contracts` itself.
- Downstream `depends_on: graph-contracts-data-edge` references (block-oracle, subgraph-deploy, ready) replaced with the existing `graph-contracts` dependency — no new edges, just fewer.
Verified end-to-end: graph-contracts deploys Phase 1 + Phase 2 in sequence, block-oracle.json is written with the DataEdge address, and subgraph-deploy successfully consumes it to deploy the block-oracle subgraph.
Net: 4 contract services → 3, one duplicate contracts clone eliminated, Tenderly error noise gone.
… conflicts
Rootless Docker's RootlessKit port manager races on common ports (8081, 8082, 9092, 9644) during concurrent container startup. Move Redpanda's host-published ports to the 18xxx/19xxx range and drop the internal Kafka listener (9092) host mapping entirely — host access uses the EXTERNAL listener on 29092. Decouple REDPANDA_KAFKA_PORT from the run.sh scripts: all container-to-container Kafka connections now hardcode the internal port 9092 instead of referencing an env var that was conflating host and internal ports.
Tests assumed only one active allocation per deployment, causing "Already allocating to the subgraph deployment" errors when duplicates existed. Now close all active allocations for the target deployment before recreating. Also batch block mining via anvil_mine(count, 12) instead of per-block evm_increaseTime + evm_mine (2N → 1 RPC call per chunk), and reduce unnecessary epoch advances (pre-existing allocations don't need 2 epoch advances to close, and creating allocations needs no advance at all).
…e, OracleA)
Upstream contracts renamed getRewardsEligibilityOracle to getProviderEligibilityOracle and the deployment key from RewardsEligibilityOracle to RewardsEligibilityOracleA.
…rage
Tests need to pause/unpause the REO contract. Grant PAUSE_ROLE to ACCOUNT0 during contract setup (via ACCOUNT1, which holds GOVERNOR_ROLE).
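The grant can be sketched with cast. The role-string constant, addresses, and keys below are assumptions drawn from the description above, not the verbatim setup code:

```shell
# Hypothetical sketch of the setup step: ACCOUNT1 (holder of GOVERNOR_ROLE)
# grants PAUSE_ROLE on the REO to ACCOUNT0. All addresses/keys are placeholders.
grant_pause_role() {
  local reo_addr="$1"
  # OpenZeppelin AccessControl role ids are keccak256 of the role string;
  # "PAUSE_ROLE" here is assumed to match the contract's constant.
  local pause_role
  pause_role=$(cast keccak "PAUSE_ROLE")
  cast send "$reo_addr" 'grantRole(bytes32,address)' "$pause_role" "$ACCOUNT0_ADDRESS" \
    --private-key "$ACCOUNT1_KEY" --rpc-url "$RPC_URL"
}
```

Granting during setup (rather than per-test) means every test in the suite can pause/unpause without re-running the governor transaction.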
Replace the default serial group (all 34 tests sequential) with named groups so non-conflicting tests run in parallel:
- serial(alloc): allocation/denial/rewards tests (16 tests)
- serial(reo): REO governance config tests (11 tests)
- serial(staking): stake/provision tests (3 tests)
- no serial: pure reads + reverts (14 tests)
The three serial groups run independently, so fast reo/staking tests no longer wait behind slow epoch-advancing allocation tests. Also make contract_not_paused self-healing: if a prior test left the REO paused (e.g. pause_blocks_writes interrupted by --fail-fast), it unpauses to recover rather than failing.
Anvil v1.0.0 (April 2025) prunes historical state aggressively despite --preserve-historical-states / --slots-in-an-epoch / --transaction-block-keeper flags — empirically only ~15 blocks retained vs ~10 without (per the AnvilHistoricalStateRetention task). Graph-node hits BlockOutOfRangeError on per-block eth_calls during test runs, kills its block stream with a spurious 'possible reorg detected' loop, and never recovers. Foundry shipped a state-retention fix between 1.0.0 and 1.5.0. Verified 2026-04-29 against ghcr.io/foundry-rs/foundry:stable (anvil 1.5.1): eth_getBalance and eth_getCode succeed at all probed blocks 1..3000 after mining 3000 blocks, vs old anvil where only the head block is queryable. Bumps the four foundry pins consistently (chain runtime, indexer-agent / start-indexing / graph-contracts cast tooling) and drops the now-vestigial anvil flags from chain/run.sh — they were no-ops on v1.0.0 and aren't needed on :stable.
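The retention check described above can be reproduced with a small probe loop. This is a sketch, not the AnvilHistoricalStateRetention task itself; the probe address, stride, and use of `cast balance` are arbitrary choices:

```shell
# Sketch of a historical-state probe: after mining up to $head_block,
# query balances at old heights and report the first pruned one.
# On anvil 1.0.0 this fails almost immediately; on :stable it should
# succeed at every probed height.
probe_historical_state() {
  local rpc_url="$1" head_block="$2"
  local addr="${3:-0x0000000000000000000000000000000000000000}"  # placeholder
  local b
  for b in $(seq 1 100 "$head_block"); do
    if ! cast balance "$addr" --block "$b" --rpc-url "$rpc_url" >/dev/null 2>&1; then
      echo "state pruned at block $b"
      return 1
    fi
  done
  echo "state available at all probed blocks 1..$head_block"
}
```

Running this against both anvil versions after mining a few thousand blocks is enough to see the BlockOutOfRangeError boundary move.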
…re-before-assert
Adds TestNetwork::ensure_active_allocation() that returns an active allocation, creating one from a closed deployment if a prior test panicked before restoring. Tests that previously started with get_allocations + filter-for-active now fail gracefully when state is dirty instead of cascading failures through the suite. REO governance tests that toggle validation / eligibility-period / oracle-timeout now restore state before asserting, so a failing assertion no longer leaks state into the rest of the run.
Replaces the per-block evm_increaseTime + evm_mine pair with a single anvil_mine call that advances 12s per block internally. Halves the RPC round-trips and drops the per-chunk subgraph-catchup wait (not needed once the chain retains historical state).
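A sketch of the batched call — the parameter encoding is an assumption (anvil_mine takes a block count and a per-block timestamp interval in seconds), and the wrapper name is invented for illustration:

```shell
# Sketch: replace N pairs of evm_increaseTime + evm_mine with a single
# anvil_mine call that mines `count` blocks, advancing timestamps 12s each.
# One RPC round-trip per chunk instead of two per block.
mine_blocks() {
  local count="$1" rpc_url="$2"
  cast rpc anvil_mine "$count" 12 --rpc-url "$rpc_url"
}
```

The old pattern issued `evm_increaseTime 12` then `evm_mine` for every block, which is the 2N-calls cost the text refers to.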
…istener
Deploys the graphprotocol/indexing-payments-subgraph alongside the other protocol subgraphs, via multi-stage COPY from a per-branch image built with `just build-image` in that repo's worktree (INDEXING_PAYMENTS_SUBGRAPH_VERSION). Connects dipper's chain_listener to the deployed subgraph so agreements transition from Created to AcceptedOnChain when indexers accept on-chain, instead of expiring.
Adds an indexer-agent -> subgraph-deploy compose dependency so the agent observes the indexing-payments deployment at startup and marks it as offchain via INDEXER_AGENT_OFFCHAIN_SUBGRAPHS. Without this the reconciler pauses the subgraph (no allocation, no rule) and chain_listener stalls.
Also exports INDEXING_PAYMENTS_SUBGRAPH_ENDPOINT to indexer-agent so its unconditional indexingPaymentsSubgraph SubgraphClient construction has an endpoint to read. Without it, the client throws 'Cannot read properties of undefined (reading status)' on startup before the management API comes up.
Extend the helper to query the network subgraph for a signalled deployment when the management API has no allocations at all (closed or active). Replace inline active-allocation lookups in close_allocation_collects_rewards and the poi_normal_claim restore step with ensure_active_allocation calls. Preserve the close-all-active-allocs loop in close_allocation_collects_rewards (matching close_and_recreate_allocation): indexer-agent may auto-create extra allocations on the same deployment, so closing only the one returned by ensure_active_allocation would leave a stale active alloc that breaks the subsequent create_allocation with "Already allocating".
The audit-fix-2 REO has no whenNotPaused guards, so setEligibilityValidation and renewIndexerEligibility succeed while paused. Update pause_blocks_writes to verify both writes complete (not revert) during pause and after unpause.
…subgraph}, dipper
PRs landed 2026-04-30 (indexer#1209, indexer-rs#1028, indexing-payments-subgraph#8) added workflow_dispatch to the publish workflows, enabling :sha-<short> tags for the DIPs integration branches. Switch INDEXER_AGENT_VERSION, INDEXER_SERVICE_RS_VERSION, INDEXER_TAP_AGENT_VERSION, INDEXING_PAYMENTS_SUBGRAPH_VERSION, and DIPPER_VERSION from `local` to those published shas, removing the need for parallel `just build-image` workflows in source-clone worktrees. scripts/deps.sh (the source-clone status/pull/build helper) is no longer needed and has been moved out of the repo to ../deps.sh.
IISA changes are now merged to main and published as v2.3.0. Drop the local-build requirement and consume the released image instead.
graphprotocol/contracts pinned engines.node ^24 in d29ea286e (.nvmrc + package.json engines field). pnpm install --frozen-lockfile against any post-d29ea286e CONTRACTS_COMMIT now refuses node 22 with ERR_PNPM_UNSUPPORTED_ENGINE. Bump the base stage to node:24-bookworm-slim to keep contracts-src builds working.
Picks up the SS-side localNetwork governor fix (7453b59b8) that aligns DisputeManager / SubgraphService ProxyAdmin ownership with ACCOUNT1, the account issuance.run.sh signs upgrade txs with. Without this, the GIP-0088 upgrade phase reverted with OwnableUnauthorizedAccount mid-batch. Also includes the migrate-config governor bumps (2c07eed7f horizon, 3117e9433 SS) which are not load-bearing for local-network but keep the sibling configs consistent with the m.getAccount(1) convention. Drop the over-specific reo-deployment-3 comment in favour of a generic note. Stack verified: docker compose down -v && up -d completes cleanly, graph-contracts-issuance runs through all four GIP-0088 phases (deploy, configure, transfer, upgrade) with 44 contracts synced.
The indexer-agent's auto-reconciler maintains an allocation per
discovered subgraph deployment. Convenient for human use of
local-network, but the integration tests close+recreate allocations
explicitly and race the reconciler — the agent recreates an allocation
between a test's close and create, and the test fails with `Already
allocating to the subgraph deployment`.
Activate this override for test runs to keep the agent in manual mode:
COMPOSE_FILE=docker-compose.yaml:compose/dev/manual-allocation.yaml \
  docker compose up -d
Verified locally: removes 2 of 4 cluster A failures from the test
suite; baseline 38/6 → 39/5 (only `close_and_recreate_allocation` and
`poi_allocation_too_young` still trip on auto-allocator state).
Root justfile wraps the high-traffic ops (up/down/logs, restart, reset, connect, mine, advance-epoch, test). tests/justfile default switched from running tests to listing recipes.
…iles
Versions are always supplied via compose build args from .env; adding a 'latest' default would mask misconfiguration.
…ring
Bumps gateway to main's pin, which includes #1179 removing horizon transition code and tap v1 compat. Drops the now-unused legacy_dispute_manager / legacy_verifier from gateway config and the matching legacy address stubs from contracts.run.sh. Drops receipts_verifier_address (V1, deprecated/ignored by indexer-rs since #929) from indexer-service and tap-agent configs. [horizon] enabled = true blocks remain — still required at the pinned indexer-rs sha-853f303 (validation drops in upstream #1014, not yet in the DIPs branch).
Now partly merged to main; the gip-88 branch is being worked on as a replacement.