WIP - WAL by denik · Pull Request #5149 · databricks/cli

denik · 2026-04-30T14:29:10Z

Changes

Why

Tests

Signed-off-by: Varun Deep Saini <varun.23bcs10048@ms.sst.scaler.com>

Signed-off-by: Varun Deep Saini <deepsainivarun@gmail.com>

Signed-off-by: Varun Deep Saini <varun.23bcs10048@ms.sst.scaler.com>

fix Open() calls; replace Finalize() with Close(); close state file in plan

Move state open/close management to process.go so the lifecycle is transparent. process.go opens state for read (with WAL recovery) after PullResourcesState and defers close. Deploy/destroy upgrade to write mode via the new UpgradeToWrite() method which initializes the WAL without re-reading state JSON. Internal functions (CalculatePlan, ExportState, InitForApply, ValidatePlanAgainstState) no longer manage their own open/close — they expect state to already be open. Self-managed callers (bind, migrate, yaml_sync, diff) handle their own state lifecycle. Plan command uses ProcessBundleRetWithPlan to compute the plan while state is still open for read inside processBundleRetInternal. Co-authored-by: Isaac

Co-authored-by: Denis Bilenko <denis.bilenko@databricks.com>

If only the WAL header was written (no resource changes), replayWAL now discards the WAL without saving the state file. This avoids the spurious "Updating deployment state..." message on no-op deploys in the direct engine. Co-authored-by: Denis Bilenko <denis.bilenko@databricks.com>

The splits were introduced because the direct engine printed "Updating deployment state..." on deploys with no resource changes, while terraform did not. The preceding commit fixes the root cause (WAL without entries no longer writes the state file), so both engines now produce identical output for these tests. Co-authored-by: Denis Bilenko <denis.bilenko@databricks.com>

Co-authored-by: Denis Bilenko <denis.bilenko@databricks.com>

Populate stateIDs from State on Reload so it always mirrors the effective view: initialized from disk, updated by SaveState/DeleteState. GetResourceID now consults stateIDs unconditionally instead of checking walFile and falling back to State. Co-authored-by: Denis Bilenko <denis.bilenko@databricks.com>

…roach State opened for read in ProcessBundleRet stays open after return. Deploy and Destroy call UpgradeToWrite + Close internally, so no defensive defer is needed. plan.go reverts to the two-step pattern from main: ProcessBundleRet then phases.RunPlan. ProcessBundleRetWithPlan and opts.ComputePlan are removed. Co-authored-by: Denis Bilenko <denis.bilenko@databricks.com>

…ternal Close(ctx) -> Finalize(ctx) to match main's naming. plan was a named return in processBundleRetInternal only to support the now-removed ProcessBundleRetWithPlan; demote it to a local variable. Co-authored-by: Denis Bilenko <denis.bilenko@databricks.com>

- Collapse processBundleRetInternal back into ProcessBundleRet (named returns) - ProcessBundle calls ProcessBundleRet like on main - Restore needDirectState guard so state is only opened when needed - Move var plan back after the state block Co-authored-by: Denis Bilenko <denis.bilenko@databricks.com>

- migrate.go: use len(state) instead of len(StateDB.Data.State) since Finalize() resets Data after saving; fixes "Migrated 0 resources" regression - dashboard.go, diff.go: remove unnecessary defer StateDB.Finalize for read-only opens - no WAL file is open so no cleanup is needed - bind.go, state_test.go, upload_state_for_yaml_sync.go, migrate.go: fix errcheck lint issues on Finalize calls that cannot return error in read mode Co-authored-by: Denis Bilenko <denis.bilenko@databricks.com>

…ation Key fixes to bundle/direct/dstate/state.go: - Change walSuffix from ".WAL" to ".wal" for case-sensitive filesystem compatibility (Linux). All acceptance tests create .wal files. - Remove CLIVersion and StateVersion checks from validateWALHeader. These fields are informational metadata and should not gate WAL recovery. Pre-created test WAL files without these fields now validate correctly. - Fix validateWALHeader error messages: - Lineage mismatch: "WAL lineage (%s) does not match state lineage (%s)" - Stale serial (< expected): return errStaleWAL sentinel - Future serial (> expected): "WAL serial (%d) is ahead of expected (%d), state may be corrupted" - Add errStaleWAL sentinel so replayWAL can silently delete stale WALs instead of returning an error that fails the deploy. - Fix mergeWalIntoState for partial recovery: - Skip corrupted entries with log.Warnf instead of failing immediately - Save corrupted lines to PATH.wal.corrupted for debugging - Update db.stateIDs alongside db.Data.State when applying WAL entries - Add MkdirAll before WAL file creation in Open and UpgradeToWrite. Fixes bind failing on first use when the state directory doesn't exist yet. - Wrap replayWAL errors as "WAL recovery failed: %w". - Wrap Open's replayWAL call as "reading state from %s: %w". Update acceptance test expected outputs accordingly: - WAL tests: new error messages, partial recovery behavior, stale WAL handling - Migrate tests: migration count now correct (was 0 due to earlier fix) - Bind tests: now succeed when state directory didn't exist before - State/deploy failure tests: "Updating deployment state..." removed when no state was written (no-change or failed deploys) Co-authored-by: Isaac

Co-authored-by: Denis Bilenko <denis.bilenko@databricks.com>

denik temporarily deployed to test-trigger-is April 30, 2026 14:29 — with GitHub Actions Inactive

denik temporarily deployed to test-trigger-is May 1, 2026 11:49 — with GitHub Actions Inactive

denik temporarily deployed to test-trigger-is May 1, 2026 14:04 — with GitHub Actions Inactive

Varun Deep Saini and others added 24 commits May 1, 2026 16:10

Add WAL for direct deployment state recovery

e5b7040

Signed-off-by: Varun Deep Saini <varun.23bcs10048@ms.sst.scaler.com>

Updated tests and enhanced kill caller with an offset

54cfd9d

Signed-off-by: Varun Deep Saini <varun.23bcs10048@ms.sst.scaler.com>

Updated existing tests

641a677

Signed-off-by: Varun Deep Saini <varun.23bcs10048@ms.sst.scaler.com>

test fixes

551193d

Signed-off-by: Varun Deep Saini <varun.23bcs10048@ms.sst.scaler.com>

Fixes

40d2abf

Signed-off-by: Varun Deep Saini <varun.23bcs10048@ms.sst.scaler.com>

fixed tests

1a42275

Signed-off-by: Varun Deep Saini <varun.23bcs10048@ms.sst.scaler.com>

updated tests

3af0484

Signed-off-by: Varun Deep Saini <deepsainivarun@gmail.com>

dedup

75fe112

Signed-off-by: Varun Deep Saini <deepsainivarun@gmail.com>

Update WAL corrupted entry outputs

706cc67

WIP

6dfb556

Updated tests and enhanced kill caller with an offset

8e6a210

Signed-off-by: Varun Deep Saini <varun.23bcs10048@ms.sst.scaler.com>

Updated existing tests

073eb6e

Signed-off-by: Varun Deep Saini <varun.23bcs10048@ms.sst.scaler.com>

Merge simplified WAL handling into state.go

670f61e

fix Open() calls; replace Finalize() with Close(); close state file in plan

fixes

6a868a1

fixes

d3c5b1b

rm unnecessary assert

78b4d8a

lint

b4b2e7c

fixes

c0c78c5

Co-authored-by: Denis Bilenko <denis.bilenko@databricks.com>

lint

042c3c2

Co-authored-by: Denis Bilenko <denis.bilenko@databricks.com>

restore test

f554779

fmt

087422c

Co-authored-by: Denis Bilenko <denis.bilenko@databricks.com>

denik added 9 commits May 1, 2026 16:10

restore non-material changes: assertions and comment

68a2d87

Co-authored-by: Denis Bilenko <denis.bilenko@databricks.com>

deduplicate UpgradeToWrite+defer Finalize in Deploy

c4a8350

Co-authored-by: Denis Bilenko <denis.bilenko@databricks.com>

update out.test.toml

864a2b9

denik force-pushed the denik/wal-review1 branch from 4e30122 to 864a2b9 Compare May 1, 2026 14:12

denik temporarily deployed to test-trigger-is May 1, 2026 14:13 — with GitHub Actions Inactive

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WIP - WAL#5149

WIP - WAL#5149
denik wants to merge 33 commits intomainfrom
denik/wal-review1

denik commented Apr 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

denik commented Apr 30, 2026

Changes

Why

Tests

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants