audit: implement LEP-6 heal-op lifecycle and recheck handlers #120
j-rafique wants to merge 3 commits into LEP-6-shadow-scoring
- Implement storage-truth heal-op lifecycle and recheck handlers
- Scope LEP-6 heal op lifecycle to PR4
All 3 previously flagged issues have been addressed in aead156. No new issues found.
51ea7b0 to c86ef0e
Production-gate review by Zee: 6 findings
Methodology: full file-by-file read of every non-generated changed file in this PR's diff (pr-120 vs its base branch), cross-checked against:
Status legend: each finding's status is computed at the PR #122 stack-tip (consensus-gap-fixes commit
Severity breakdown: CRITICAL=1, HIGH=3, MEDIUM=2
120-F1:
Pull request overview
Implements the LEP-6 PR4 “heal-op lifecycle” and extends storage-truth state/scoring in x/audit/v1, including verifier submissions, epoch-end heal-op expiration/scheduling, and score state enrichment (trust band, contradiction/failure tracking).
Changes:
- Added heal-op lifecycle handlers (healer claim, verifier verification, gated recheck evidence) plus new KV state for per-(heal_op, verifier) verification tracking.
- Implemented epoch-end processing for heal ops: expire overdue ops, clear stale ticket pointers, and schedule new ops deterministically by priority/caps.
- Introduced storage-truth scoring pipeline on report ingestion, including new reporter/ticket metadata fields and score update events.
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| x/audit/v1/types/keys.go | Adds KV prefixes/keys for heal-op verifier submissions. |
| x/audit/v1/types/events.go | Introduces event type/attribute constants for scoring + heal-op lifecycle. |
| x/audit/v1/types/errors.go | Adds error codes for heal-op lifecycle + recheck evidence validation. |
| x/audit/v1/types/audit.pb.go | Regenerated protobuf Go types (trust band enum, new state fields). |
| x/audit/v1/module/autocli.go | Updates CLI help text to reflect implemented heal-op txs. |
| x/audit/v1/keeper/storage_truth_state.go | Adds keeper getters/setters for heal-op verifier submissions. |
| x/audit/v1/keeper/storage_truth_state_test.go | Expands state round-trip tests and adds heal-op verification round-trip test. |
| x/audit/v1/keeper/storage_truth_scoring.go | Adds scoring implementation applied during report ingestion. |
| x/audit/v1/keeper/storage_truth_scoring_internal_test.go | Unit tests for scoring helper functions. |
| x/audit/v1/keeper/msg_submit_epoch_report.go | Wires scoring into SubmitEpochReport. |
| x/audit/v1/keeper/msg_submit_epoch_report_storage_truth_scores_test.go | Comprehensive tests for scoring behavior, decay, events, contradictions. |
| x/audit/v1/keeper/storage_truth_heal_ops.go | Implements epoch-end heal-op expiry + scheduling logic. |
| x/audit/v1/keeper/storage_truth_heal_ops_test.go | Tests for scheduling priority, expiry, and rescheduling after expiry. |
| x/audit/v1/keeper/msg_storage_truth.go | Implements ClaimHealComplete, SubmitHealVerification, and gated SubmitStorageRecheckEvidence. |
| x/audit/v1/keeper/msg_storage_truth_test.go | Tests tx validation/authorization and heal-op lifecycle flows. |
| x/audit/v1/keeper/msg_storage_truth_placeholders.go | Removes prior placeholder Msg server implementations. |
| x/audit/v1/keeper/msg_storage_truth_placeholders_test.go | Removes placeholder-only tests. |
| x/audit/v1/keeper/query_storage_truth_test.go | Updates queries for new state fields and adds an ingestion->query reflection test. |
| x/audit/v1/keeper/abci.go | Runs heal-op epoch-end processing in EndBlocker. |
| proto/lumera/audit/v1/audit.proto | Adds ReporterTrustBand and extends persisted scoring state messages. |
| app/proto_bridge.go | Registers new enum for proto bridge. |
    if result != nil {
        if isStorageTruthFailureClass(result.ResultClass) && epochID != state.LastFailureEpoch {
            nextState.LastFailureEpoch = epochID
            nextState.RecentFailureEpochCount = updateRecentFailureEpochCount(state, epochID, k.GetParams(ctx).WithDefaults())
        } else if !found {
            nextState.RecentFailureEpochCount = 0
        }
applyTicketDeteriorationDelta doesn’t increment LastFailureEpoch/RecentFailureEpochCount for the first failure when epochID == 0 because the guard epochID != state.LastFailureEpoch is false for a zero-value (not-found) state. This causes epoch-0 failures to be recorded with RecentFailureEpochCount == 0, breaking repeated-failure escalation and the epoch-0 carryover behavior in later epochs. Consider treating the not-found case specially (initialize LastFailureEpoch=epochID and RecentFailureEpochCount=1 for failure classes) or changing the condition to allow updates when !found.
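One way to read the suggested fix is sketched below, outside the keeper and with hypothetical names. `ticketState` and `nextFailureTracking` are stand-ins invented for illustration, and the plain increment is an assumption: the real `updateRecentFailureEpochCount` helper may apply a window or decay that is not visible in this excerpt.

```go
package main

import "fmt"

// ticketState is a hypothetical, trimmed-down stand-in for the persisted
// ticket deterioration state; only the two fields discussed above are modeled.
type ticketState struct {
	LastFailureEpoch        uint64
	RecentFailureEpochCount uint32
}

// nextFailureTracking returns the failure-tracking fields after one result.
// found reports whether any prior state existed in the store.
// Assumption: the count simply increments per new failure epoch.
func nextFailureTracking(prev ticketState, found bool, epochID uint64, isFailure bool) ticketState {
	next := prev
	switch {
	case isFailure && !found:
		// First failure ever seen for this ticket. A zero-value state has
		// LastFailureEpoch == 0, so comparing epochs alone would skip epoch 0.
		next.LastFailureEpoch = epochID
		next.RecentFailureEpochCount = 1
	case isFailure && epochID != prev.LastFailureEpoch:
		// Failure in a new epoch for a ticket with existing state.
		next.LastFailureEpoch = epochID
		next.RecentFailureEpochCount = prev.RecentFailureEpochCount + 1
	case !found:
		// Non-failure result on a brand-new ticket: start the count at zero.
		next.RecentFailureEpochCount = 0
	}
	return next
}

func main() {
	first := nextFailureTracking(ticketState{}, false, 0, true)
	fmt.Printf("epoch 0 first failure: %+v\n", first)

	later := nextFailureTracking(first, true, 1, true)
	fmt.Printf("epoch 1 repeat failure: %+v\n", later)
}
```

Handling the not-found case in its own branch keeps the existing epoch comparison intact for tickets that already have state, so a second failure in the same epoch still does not double count.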
    nextState := types.ReporterReliabilityState{
        ReporterSupernodeAccount: reporterAccount,
        ReliabilityScore:         next,
        LastUpdatedEpoch:         epochID,
        TrustBand:                reporterTrustBandForScore(next, k.GetParams(ctx).WithDefaults()),
        ContradictionCount:       state.ContradictionCount + contradictionIncrements,
    }
applyReporterReliabilityDelta and applyTicketDeteriorationDelta re-fetch params (k.GetParams(ctx).WithDefaults()) while applyStorageTruthScores already loaded params once. This adds repeated store reads in the per-result hot path and can also create subtle inconsistencies if params are ever mutated within a block. Prefer passing the already-loaded params (or the specific thresholds/decays needed) into these helpers and using it for reporterTrustBandForScore / updateRecentFailureEpochCount.
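The refactor proposed here can be illustrated with a small standalone sketch. Everything in it is hypothetical: `params`, its threshold fields, `paramStore`, and the band labels are invented for the example and do not reflect the module's actual types. The point is only that params are read once and threaded through the per-result helpers.

```go
package main

import "fmt"

// params is a hypothetical subset of the module params; the threshold
// field names are assumptions made for this sketch.
type params struct {
	LowTrustBelow  int64
	HighTrustAbove int64
}

// paramStore stands in for the keeper's param storage and counts reads.
type paramStore struct {
	value params
	reads int
}

func (s *paramStore) Get() params {
	s.reads++
	return s.value
}

// trustBandForScore maps a score to an illustrative band label.
func trustBandForScore(score int64, p params) string {
	switch {
	case score < p.LowTrustBelow:
		return "low"
	case score > p.HighTrustAbove:
		return "high"
	default:
		return "normal"
	}
}

// applyReporterDelta receives params from its caller instead of re-reading them.
func applyReporterDelta(score, delta int64, p params) (int64, string) {
	next := score + delta
	return next, trustBandForScore(next, p)
}

// applyScores loads params once and threads them through every result.
func applyScores(store *paramStore, start int64, deltas []int64) (int64, []string) {
	p := store.Get() // single store read for the whole batch
	score := start
	bands := make([]string, 0, len(deltas))
	for _, d := range deltas {
		var band string
		score, band = applyReporterDelta(score, d, p)
		bands = append(bands, band)
	}
	return score, bands
}

func main() {
	store := &paramStore{value: params{LowTrustBelow: -10, HighTrustAbove: 10}}
	score, bands := applyScores(store, 0, []int64{-15, 10, 20})
	fmt.Println(score, bands, "param reads:", store.reads)
}
```

Because every helper sees the same `params` value, all results in one report are scored against one consistent snapshot.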
    func (k Keeper) ProcessStorageTruthHealOpsAtEpochEnd(ctx sdk.Context, epochID uint64, params types.Params) error {
        healOps, err := k.GetAllHealOps(ctx)
        if err != nil {
            return err
        }
ProcessStorageTruthHealOpsAtEpochEnd calls GetAllHealOps, which iterates over every heal op ever created. Since heal ops aren’t pruned in PruneOldEpochs, this makes epoch-end processing O(total heal ops) and can grow without bound, increasing EndBlocker time over the life of the chain. Consider iterating only non-final statuses via the existing HealOpByStatus index (SCHEDULED / IN_PROGRESS / HEALER_REPORTED), and/or adding a pruning strategy for finalized/expired heal ops.
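A minimal in-memory sketch of the status-index approach follows. It is not the keeper code: `store`, `healOp`, and `expireOverdue` are hypothetical, the `expired` status name is an assumption, and whether all three non-final statuses should expire is also assumed. The deadline test mirrors the `deadline_epoch_id <= current_epoch` condition from the PR summary.

```go
package main

import (
	"fmt"
	"sort"
)

type status int

const (
	scheduled status = iota
	inProgress
	healerReported
	verified
	failed
	expired // assumed name for the terminal "deadline passed" status
)

type healOp struct {
	ID              uint64
	Status          status
	DeadlineEpochID uint64
}

// store is an in-memory stand-in for the HealOp state plus its status index.
type store struct {
	ops      map[uint64]*healOp
	byStatus map[status]map[uint64]struct{}
}

func newStore() *store {
	return &store{ops: map[uint64]*healOp{}, byStatus: map[status]map[uint64]struct{}{}}
}

// index returns the ID set for one status, creating it on first use.
func (s *store) index(st status) map[uint64]struct{} {
	if s.byStatus[st] == nil {
		s.byStatus[st] = map[uint64]struct{}{}
	}
	return s.byStatus[st]
}

func (s *store) add(op healOp) {
	s.ops[op.ID] = &op
	s.index(op.Status)[op.ID] = struct{}{}
}

// setStatus moves an op between status buckets so the index stays in sync.
func (s *store) setStatus(id uint64, st status) {
	op := s.ops[id]
	delete(s.index(op.Status), id)
	op.Status = st
	s.index(st)[id] = struct{}{}
}

// expireOverdue visits only non-final statuses, so finalized ops never add
// to epoch-end cost. It returns the expired IDs in ascending order.
func (s *store) expireOverdue(currentEpoch uint64) []uint64 {
	var out []uint64
	for _, st := range []status{scheduled, inProgress, healerReported} {
		for id := range s.index(st) {
			if s.ops[id].DeadlineEpochID <= currentEpoch {
				out = append(out, id)
			}
		}
	}
	// Go map iteration order is unspecified; sort before mutating state
	// so the result is deterministic.
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	for _, id := range out {
		s.setStatus(id, expired)
	}
	return out
}

func main() {
	s := newStore()
	s.add(healOp{ID: 1, Status: scheduled, DeadlineEpochID: 5})
	s.add(healOp{ID: 2, Status: inProgress, DeadlineEpochID: 9})
	s.add(healOp{ID: 3, Status: verified, DeadlineEpochID: 1})
	s.add(healOp{ID: 4, Status: healerReported, DeadlineEpochID: 7})
	fmt.Println("expired at epoch 7:", s.expireOverdue(7))
}
```

With this shape, the work per epoch scales with the number of open heal ops rather than with every heal op ever created.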
    sdkCtx.EventManager().EmitEvent(
        sdk.NewEvent(
            types.EventTypeHealOpHealerReported,
            sdk.NewAttribute(sdk.AttributeKeyModule, types.ModuleName),
            sdk.NewAttribute(types.AttributeKeyHealOpID, strconv.FormatUint(healOp.HealOpId, 10)),
            sdk.NewAttribute(types.AttributeKeyTicketID, healOp.TicketId),
            sdk.NewAttribute(types.AttributeKeyHealerSupernodeAccount, req.Creator),
            sdk.NewAttribute(types.AttributeKeyTranscriptHash, req.HealManifestHash),
        ),
EventTypeHealOpHealerReported emits AttributeKeyTranscriptHash but the value is HealManifestHash from MsgClaimHealComplete. This makes the event payload ambiguous for indexers/consumers (a “transcript” hash vs a “heal manifest” hash are distinct concepts in the API). Consider adding a dedicated attribute key (e.g., heal_manifest_hash) and emitting that instead of reusing transcript_hash.
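As a rough illustration of the dedicated-key idea, the sketch below builds the attribute list with plain structs instead of the SDK event types. The constant `attributeKeyHealManifestHash` is hypothetical, and the other key strings (`heal_op_id`, `ticket_id`, `healer_supernode_account`) are assumed values, since the excerpt does not show what the module's constants resolve to.

```go
package main

import (
	"fmt"
	"strconv"
)

const (
	// Assumed string values; the module's real constants are not shown above.
	attributeKeyHealOpID         = "heal_op_id"
	attributeKeyTicketID         = "ticket_id"
	attributeKeyHealer           = "healer_supernode_account"
	attributeKeyTranscriptHash   = "transcript_hash"
	attributeKeyHealManifestHash = "heal_manifest_hash" // hypothetical new key
)

// attribute is a local stand-in for an event attribute key/value pair.
type attribute struct {
	Key   string
	Value string
}

// healerReportedAttributes emits the manifest hash under its own key,
// leaving transcript_hash free for events that carry a transcript hash.
func healerReportedAttributes(healOpID uint64, ticketID, healer, manifestHash string) []attribute {
	return []attribute{
		{Key: attributeKeyHealOpID, Value: strconv.FormatUint(healOpID, 10)},
		{Key: attributeKeyTicketID, Value: ticketID},
		{Key: attributeKeyHealer, Value: healer},
		{Key: attributeKeyHealManifestHash, Value: manifestHash},
	}
}

func main() {
	for _, a := range healerReportedAttributes(7, "ticket-1", "healer-addr", "abc123") {
		fmt.Printf("%s=%s\n", a.Key, a.Value)
	}
}
```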
    @@ -0,0 +1,33 @@
    package types

    // Event types and attributes for storage-truth score updates.
File header comment says these are “Event types and attributes for storage-truth score updates”, but the constants also cover heal-op lifecycle and recheck evidence events. Updating the comment to reflect the broader scope will prevent confusion when adding/consuming events later.
    - // Event types and attributes for storage-truth score updates.
    + // Event types and attributes for storage-truth score updates, heal-op lifecycle, and recheck evidence events.
Summary
This PR implements LEP-6 PR4 (heal-op lifecycle) in lumera: on-chain heal operation transitions, verifier-driven finalization, and epoch-end expiration/scheduling for self-heal ops.
The rollout remains non-breaking and keeps deferred LEP-6 enforcement behavior out of scope.
What’s Implemented
1) Heal-op tx lifecycle
Added/implemented keeper logic for:
- MsgClaimHealComplete
- MsgSubmitHealVerification
- MsgSubmitStorageRecheckEvidence (validated + wired, intentionally gated as not active in this milestone)
Behavior:
- SCHEDULED/IN_PROGRESS -> HEALER_REPORTED
- HEALER_REPORTED -> VERIFIED (all required positives) or FAILED (any negative)
- active_heal_op_id cleared; verified path updates probation/last-heal fields)
2) Epoch-end heal-op processing
Implemented epoch-end lifecycle execution:
- deadline_epoch_id <= current_epoch)
3) State model integration
Integrated with existing LEP-6 storage-truth state surfaces:
- HealOp + status/ticket indexes
- (heal_op_id, verifier)
Out of Scope (Deferred)
Key Files
- x/audit/v1/keeper/msg_storage_truth.go
- x/audit/v1/keeper/storage_truth_heal_ops.go
- x/audit/v1/keeper/abci.go
- x/audit/v1/keeper/msg_storage_truth_test.go
- x/audit/v1/keeper/storage_truth_heal_ops_test.go
Testing
Added/updated tests for:
Validation run:
go test ./x/audit/v1/... -count=1
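
For readers who want the status transitions from the "Heal-op tx lifecycle" section in executable form, here is a small standalone sketch. The status names come from the description above; the function names, the error value, and the rule that an op stays in HEALER_REPORTED until enough positive votes arrive are assumptions made for illustration, not the keeper's actual logic.

```go
package main

import (
	"errors"
	"fmt"
)

type status string

const (
	scheduled      status = "SCHEDULED"
	inProgress     status = "IN_PROGRESS"
	healerReported status = "HEALER_REPORTED"
	verified       status = "VERIFIED"
	failed         status = "FAILED"
)

var errInvalidTransition = errors.New("invalid heal-op status transition")

// claimHealComplete models SCHEDULED/IN_PROGRESS -> HEALER_REPORTED.
func claimHealComplete(cur status) (status, error) {
	if cur == scheduled || cur == inProgress {
		return healerReported, nil
	}
	return cur, errInvalidTransition
}

// resolveVerification models HEALER_REPORTED -> VERIFIED or FAILED.
// votes holds the verifier results received so far; required is the
// number of positive results needed to finalize as VERIFIED.
func resolveVerification(cur status, votes []bool, required int) (status, error) {
	if cur != healerReported {
		return cur, errInvalidTransition
	}
	positives := 0
	for _, ok := range votes {
		if !ok {
			return failed, nil // any negative result fails the op
		}
		positives++
	}
	if positives >= required {
		return verified, nil
	}
	return healerReported, nil // still waiting for more verifiers
}

func main() {
	st, _ := claimHealComplete(scheduled)
	fmt.Println("after claim:", st)

	st, _ = resolveVerification(st, []bool{true, true}, 2)
	fmt.Println("after verification:", st)
}
```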