Skip to content

sql driver tests#2824

Open
zachmu wants to merge 18 commits into
mainfrom
zachmu/sql-driver-tests
Open

sql driver tests#2824
zachmu wants to merge 18 commits into
mainfrom
zachmu/sql-driver-tests

Conversation

@zachmu

@zachmu zachmu commented Jun 5, 2026

Copy link
Copy Markdown
Member

Ported tests from dolt via agent, manually pruned down the ones that aren't applicable or we think are adequately tested by Dolt itself.

Most skips are due to cluster replication not yet working.

@github-actions

github-actions Bot commented Jun 5, 2026

Copy link
Copy Markdown
Contributor
Main PR
covering_index_scan_postgres 1951.58/s 1983.83/s +1.6%
groupby_scan_postgres 133.65/s 132.90/s -0.6%
index_join_postgres 682.43/s 680.48/s -0.3%
index_join_scan_postgres 865.89/s 860.55/s -0.7%
index_scan_postgres 25.90/s 24.59/s -5.1%
oltp_delete_insert_postgres 862.30/s 854.40/s -1.0%
oltp_insert 733.16/s 755.61/s +3.0%
oltp_point_select 3068.96/s 3081.47/s +0.4%
oltp_read_only 3084.48/s 3085.84/s 0.0%
oltp_read_write 2365.33/s 2333.77/s -1.4%
oltp_update_index 778.54/s 766.27/s -1.6%
oltp_update_non_index 823.77/s 822.15/s -0.2%
oltp_write_only 1807.18/s 1794.66/s -0.7%
select_random_points 1970.77/s 1930.26/s -2.1%
select_random_ranges 1145.52/s 1147.97/s +0.2%
table_scan_postgres 24.75/s 23.35/s -5.7%
types_delete_insert_postgres 812.05/s 820.91/s +1.0%
types_table_scan_postgres 8.67/s 8.22/s -5.2%

@github-actions

github-actions Bot commented Jun 6, 2026

Copy link
Copy Markdown
Contributor
Main PR
Total 42090 42090
Successful 18270 18269
Failures 23820 23821
Partial Successes1 5333 5333
Main PR
Successful 43.4070% 43.4046%
Failures 56.5930% 56.5954%

${\color{red}Regressions (1)}$

subselect

QUERY:          select count(*) from tenk1 t
where (exists(select 1 from tenk1 k where k.unique1 = t.unique2) or ten < 0);
RECEIVED ERROR: timeout during Receive

Footnotes

  1. These are tests that we're marking as Successful, however they do not match the expected output in some way. This is due to small differences, such as different wording on the error messages, or the column names being incorrect while the data itself is correct.

@itoqa

itoqa Bot commented Jun 6, 2026

Copy link
Copy Markdown

Ito Test Report ❌

15 test cases ran. 3 failed, 12 passed.

The unified run executed 15 test cases with 12 passing and 3 failing (plus one non-executed verification section with no runnable assertions), with credential/TLS behavior, concurrency regressions, strict unknown-cluster-field parsing, default config loading, skip gating, and error_match validation all behaving as expected. The most important findings were three medium, PR-introduced defects: cluster settings are accepted in config parsing but silently inactive at runtime with no unsupported-feature signal, retry_attempts do not cover restart_server (splitting a single logical assertion across attempts), and malformed query steps with neither query nor exec are incorrectly treated as successful no-ops.

❌ Failed (3)
Category Summary Screenshot
Cluster 🟠 Cluster settings are parsed but runtime cluster/remotesapi behavior is not activated and no explicit unsupported signal is surfaced. CLUSTER-4
Skip 🟠 Retry attempts exclude restart handling, splitting one logical run across server generations. SKIP-1
Skip 🟠 Empty query/exec YAML steps are silently treated as success instead of invalid input. SKIP-4
🟠 Cluster config parses but runtime is silently inactive
  • What failed: The config parses successfully, but runtime cluster behavior is not enabled and no direct unsupported-feature error is emitted for the parsed cluster block, which can mislead users into thinking cluster mode is active.
  • Impact: Users can configure cluster fields that appear valid but do not take effect at runtime. This creates false confidence and increases operational risk when teams expect remotes/replication behavior to be active.
  • Steps to reproduce:
    1. Start doltgres with a config containing cluster.standby_remotes, cluster.bootstrap_role, cluster.bootstrap_epoch, and cluster.remotesapi.
    2. Confirm startup and strict parsing succeed for that configuration.
    3. Attempt to use runtime behavior implied by those cluster settings and check whether an explicit unsupported signal is surfaced.
  • Stub / mock context: This run used a temporary auth bypass by forcing EnableAuthentication off in server/authentication_scram.go, and added a local pg_wal_location marker file to stabilize startup; the cluster parse-versus-runtime mismatch was confirmed by reading production code paths.
  • Code analysis: I examined servercfg/cfgdetails/config.go and found cluster YAML fields are intentionally accepted in config parsing, but the runtime accessor always returns nil for cluster config, so parsed cluster settings cannot be applied by server runtime logic.
  • Why this is likely a bug: Production code accepts cluster configuration shape but definitively drops runtime activation (ClusterConfig() is always nil), creating a parse-success/runtime-no-op mismatch visible to users.

Relevant code:

servercfg/cfgdetails/config.go (lines 194-205)

// ClusterConfig mirrors Dolt's cluster replication configuration. Cluster
// replication is not yet implemented in Doltgres; this is accepted so that
// cluster config files parse (the integration tests for cluster
// replication are ported but skipped until the feature lands). The shape
// mirrors Dolt's ClusterYAMLConfig.
ClusterCfg *DoltgresClusterConfig `yaml:"cluster,omitempty" minver:"TBD"`
}

// DoltgresClusterConfig mirrors Dolt's cluster replication configuration so
// that ported cluster config files parse under UnmarshalStrict.

servercfg/cfgdetails/config.go (lines 567-569)

func (cfg *DoltgresConfig) ClusterConfig() doltservercfg.ClusterConfig {
    return nil
}

integration-tests/go-sql-server-driver/tests/sql-server-remotesapi.yaml (lines 4-8)

# The remotes API (used to serve/clone databases between servers) is not yet
# implemented in Doltgres, so this test is skipped. The config, remotesapi
# port wiring, and dolt_clone/dolt_commit flow are translated faithfully so it
# can be enabled once the feature lands.
skip: "cluster/remotesapi replication not yet implemented in Doltgres"
🟠 Retry flow splits restart semantics
  • What failed: The retry loop reruns only DB connect and query assertions, while server restart runs outside the retried closure; expected single coherent replay behavior is split between pre-restart retries and a one-time restart path.
  • Impact: Retry-based integration coverage for restart scenarios becomes unreliable and can report contradictory pass/fail outcomes. This can hide regressions or raise false failures in CI for core server lifecycle tests.
  • Steps to reproduce:
    1. Configure a connection with retry_attempts > 1 and multiple queries.
    2. Add restart_server on that same connection.
    3. Trigger a first-attempt assertion failure, then observe retry behavior and final state assertions.
  • Stub / mock context: A test-only authentication bypass was enabled by forcing server/authentication_scram.go to disable SCRAM authentication, and temporary Batch 5 YAML fixtures were used to exercise skip/retry edge cases under deterministic local setup.
  • Code analysis: I traced the connection execution flow and found that RetryTestRun wraps only the query loop, while restart_server is executed afterward in non-retried control flow; this conflicts with the documented intent that retry attempts apply to the entire connection assertion.
  • Why this is likely a bug: The implementation retries only a subset of connection behavior despite documentation saying retries cover the entire connection assertion, producing inconsistent semantics by code design.

Relevant code:

integration-tests/go-sql-server-driver/testdef.go (lines 497-510)

if c.RetryAttempts > 1 {
	RetryTestRun(t, c.RetryAttempts, func(t TestingT) {
		db, err := server.DB(c)
		require.NoError(t, err)
		defer db.Close()

		conn, err := db.Conn(context.Background())
		require.NoError(t, err)
		defer conn.Close()

		for _, q := range c.Queries {
			RunQueryAttempt(t, conn, q, &ports)
		}
	})
}

integration-tests/go-sql-server-driver/testdef.go (lines 526-538)

if c.RestartServer != nil {
	args := c.RestartServer.Args
	if args != nil {
		tmplArgs := make([]string, len(*args))
		for i := range tmplArgs {
			tmplArgs[i] = ports.ApplyTemplate((*args)[i])
		}
		prepared := prepareDoltgresServerArgs(t, server.Cmd.Dir, server.Name, server.Port, tmplArgs)
		args = &prepared
	}
	err := server.Restart(args, c.RestartServer.Envs)
	require.NoError(t, err)
}

integration-tests/go-sql-server-driver/driver/server.go (lines 37-42)

// Rarely needed, allows the entire connection assertion to be retried
// on an assertion failure. Use this is only for idempotent connection
// interactions and only if the sql-server is prone to tear down the
// connection based on things that are happening, such as cluster role
// transitions.
RetryAttempts int `yaml:"retry_attempts"`
🟠 Malformed query step passes silently
  • What failed: Malformed query-step input is accepted as a no-op because no validation guard rejects the empty operation shape; expected behavior is to fail fast on invalid test definition.
  • Impact: Invalid test definitions can silently pass, reducing confidence in suite results and allowing regressions to escape detection. Teams may treat green runs as trustworthy when assertions were actually skipped.
  • Steps to reproduce:
    1. Define a YAML query step that sets neither query nor exec.
    2. Execute the test through RunTestsFile.
    3. Observe that the step completes without error and the test can still pass.
  • Stub / mock context: A test-only authentication bypass was enabled by forcing server/authentication_scram.go to disable SCRAM authentication, and temporary Batch 5 YAML fixtures were used to exercise skip/retry edge cases under deterministic local setup.
  • Code analysis: The Query type contract says a step should set query or exec, but RunQueryAttempt only handles the two non-empty branches and has no else that errors when both are empty.
  • Why this is likely a bug: The code path allows structurally invalid query steps to succeed silently, directly contradicting the declared query contract and causing false-positive test outcomes.

Relevant code:

integration-tests/go-sql-server-driver/driver/server.go (lines 235-237)

// The primary interaction of a |Connection|. Either |Query| or |Exec| should
// be set, not both.
type Query struct {

integration-tests/go-sql-server-driver/testdef.go (lines 635-678)

if q.Query != "" {
	ctx, c := context.WithTimeout(context.Background(), timeout)
	defer c()
	rows, err := conn.QueryContext(ctx, q.Query, args...)
	if err == nil {
		defer rows.Close()
	}
	if q.ErrorMatch != "" {
		require.Error(t, err, "expected error running query %s", q.Query)
		require.Regexp(t, q.ErrorMatch, err.Error())
		return
	}
	require.NoError(t, err)
	// ... assertion path omitted ...
} else if q.Exec != "" {
	ctx, c := context.WithTimeout(context.Background(), timeout)
	defer c()
	exec := q.Exec
	exec = ports.ApplyTemplate(exec)
	_, err := conn.ExecContext(ctx, exec, args...)
	if q.ErrorMatch == "" {
		require.NoError(t, err, "error running query %s: %v", q.Exec, err)
	} else {
		require.Error(t, err)
		require.Regexp(t, q.ErrorMatch, err.Error())
	}
}
✅ Passed (12)
Category Summary Screenshot
Cluster Supported cluster schema parsed and server accepted a successful query. CLUSTER-1
Cluster Unknown cluster field was rejected by strict parsing and server did not stay up. CLUSTER-2
Cluster Default config.yaml loaded without --config and query succeeded. CLUSTER-3
Concurrency TestConcurrentDropDatabase passed under contention (GOMAXPROCS=8) with successful create/drop completion and no ghost transient database failures. CONCURRENCY-1
Concurrency TestCommitConcurrency passed; stale transaction path produced serialization conflict/restart guidance while winning transaction path remained valid. CONCURRENCY-2
Concurrency TestConcurrentWrites passed under high concurrency; row and commit totals remained consistent (512 writes, 515 commits), indicating no masked lost-commit scenario in this run. CONCURRENCY-3
Credential Targeted YAML case failed fast with explicit file open error, confirming host-path read attempt is surfaced deterministically and initialization did not continue. CREDENTIAL-1
Credential Connection using password_file=/tmp/credential2-pass.txt passed and SELECT 1 returned expected value, confirming file-derived secret was used successfully. CREDENTIAL-2
Credential Targeted config subtest passed with connection fields omitted, confirming fallback password path authenticated and query returned current_user. CREDENTIAL-3
Credential Ran TLS fixture subtest with driver_params.sslmode=require; connection and query succeeded under TLS configuration. CREDENTIAL-4
Skip Skip reason correctly short-circuits case execution. SKIP-2
Skip Invalid query passes when returned error matches configured regex. SKIP-3

Commit: eb2e505

View Full Run


Tell us how we did: Give Ito Feedback

@itoqa

itoqa Bot commented Jun 10, 2026

Copy link
Copy Markdown

Ito QA test results
Ito Diff Reporteb2e5052d71e73: 16 test cases ran, 0 regressions ❌, 0 new failures ❌, 1 still failing ❌, 2 fixed ✅, 12 passing ✅, 1 additional findings ⚠️.

Diff Summary

This diff run surfaced

Overall, 15 of 17 test cases passed, indicating strong suite health across cluster compatibility signaling, skip-gated coverage behavior, and amend-concurrency stability (including repeated and resource-constrained race runs). The two medium-severity failures were that malformed YAML query steps can pass silently without validation (risking hidden missing assertions) and concurrent CREATE/DROP database activity can leave residual sk7_* schemas, consistent with a real non-atomic schema update race.

Tests run by Ito

View full run

Result State Severity Type Description
❌->❌ Still Failing Medium severity Skip The runner accepted an empty operation step and passed the case, but expected behavior is to reject or fail malformed steps so missing assertions are visible.
❌->✅ Fixed Cluster Cluster configuration parsing completed and the run clearly reported unsupported cluster replication behavior, matching expected signaling.
❌->✅ Fixed Skip Cluster restart/retry coverage is clearly surfaced as unsupported through explicit skip-gating.
Passing Concurrency High-contention winner tracking matched persisted commit and table state checks in the run.
Passing Concurrency Stale amend attempts were rejected with restart guidance while the winning amend remained committed.
Passing Concurrency The racing amend path completed with exactly one winning transaction and consistent final state checks.
Passing Concurrency Losing amend attempts rolled back through expected error paths while the overall race still finished successfully.
Passing Concurrency Repeated racing amend runs maintained single-winner behavior with no duplicate-winner assertion failures.
Passing Concurrency SQL-like commit-message input was stored as literal text and did not execute unintended SQL.
Passing Concurrency Three constrained race runs completed without deadlock or timeout and preserved expected one-winner behavior.
Passing Concurrency Twenty-five repeated race runs passed with no multi-success anomaly detected.
Passing Skip Unsupported-feature signaling remained visible after skip-file cleanup, with skip entries and compatibility parsing both passing.
Passing Skip Deleted skip-only tests are absent from discovery and package-level tests still run successfully.
Passing Skip Verbose suite output still shows explicit unsupported-feature skip breadcrumbs after cleanup.
Passing Skip Skip-gated subtests still allowed adjacent restart-capable scenarios to run, so lifecycle execution ordering remained intact.
⏸️ Skipped Cluster Supported cluster schema parsed and server accepted a successful query.
⏸️ Skipped Cluster Unknown cluster field was rejected by strict parsing and server did not stay up.
⏸️ Skipped Cluster Default config.yaml loaded without --config and query succeeded.
⏸️ Skipped Concurrency TestConcurrentWrites passed under high concurrency; row and commit totals remained consistent (512 writes, 515 commits), indicating no masked lost-commit scenario in this run.
⏸️ Skipped Credential Targeted YAML case failed fast with explicit file open error, confirming host-path read attempt is surfaced deterministically and initialization did not continue.
⏸️ Skipped Credential Connection using password_file=/tmp/credential2-pass.txt passed and SELECT 1 returned expected value, confirming file-derived secret was used successfully.
⏸️ Skipped Credential Targeted config subtest passed with connection fields omitted, confirming fallback password path authenticated and query returned current_user.
⏸️ Skipped Credential Ran TLS fixture subtest with driver_params.sslmode=require; connection and query succeeded under TLS configuration.
⏸️ Skipped Skip Skip reason correctly short-circuits case execution.
⏸️ Skipped Skip Invalid query passes when returned error matches configured regex.
⚠️ Additional Finding Medium severity Skip After concurrent create/drop operations, transient sk7_* databases remained instead of being fully removed. Expected behavior is a clean catalog state with no leftover transient schemas.
Tests that are no longer relevant

Below are tests that previously ran and are no longer relevant:

Type Test Description
Concurrency Concurrent database create and drop remains clean Dropped because Deleted integration-tests/go-sql-server-driver/concurrent_drop_database_test.go that defined TestConcurrentDropDatabase.
Additional Findings Details

These findings are unrelated to the current changes but were observed during testing.

🟡 Concurrent database cleanup leaves residual schemas
  • Severity: Medium Medium severity
  • Description: After concurrent create/drop operations, transient sk7_* databases remained instead of being fully removed. Expected behavior is a clean catalog state with no leftover transient schemas.
  • Impact: Concurrent schema churn can leave stale databases behind, which can pollute environments and cause follow-on failures in database-management workflows. Teams running concurrency-heavy automation may need manual cleanup between runs.
  • Steps to Reproduce:
    1. Run concurrent create/drop database activity against the same server instance.
    2. After workers complete, query the catalog for transient names with the sk7_ prefix.
    3. Observe transient databases remain instead of fully cleaning up.
  • Stub / mock content: Authentication checks were relaxed for this recorded run so the local server could accept test connections (authentication was disabled in server/authentication_scram.go and missing public role handling was tolerated in server/auth/auth_handler.go). No feature-specific API or route mock was applied to this test.
  • Code Analysis: I inspected schema create/drop code paths in core/rootvalue.go and verified both operations use unsynchronized read-modify-write updates (GetSchemas then SetSchemas). With concurrent create/drop calls, interleaving writes can preserve stale entries, which matches the observed leftover sk7_* databases. No test-specific mock or bypass targeted this feature path.

Tip

Reply with @itoqa to send us feedback on this test run.

@zachmu zachmu requested a review from fulghum June 13, 2026 00:10
@itoqa

itoqa Bot commented Jun 13, 2026

Copy link
Copy Markdown

Ito QA test results
Ito Diff Report2d71e7355c7601: 16 test cases ran, 0 regressions ❌, 0 new failures ❌, 1 still failing ❌, 0 fixed ✅, 12 passing ✅, 3 additional findings ⚠️.

Diff Summary

This diff run surfaced

Coverage focused on core database behavior and safety boundaries, including read-only protections, cluster and replication gating, metrics authentication/config handling, startup readiness, and recovery after garbage-collection-related reconnects. The run exercised both normal workflows and edge-case or adversarial conditions around unsupported or misconfigured paths, with overall product behavior remaining stable for the areas touched by this change.

Safe to merge — the issues observed are pre-existing and tied to unsupported or known edge-path behavior rather than regressions introduced by this PR. The changed code appears low risk because critical runtime flows and recovery paths remained healthy, with no new PR-attributable failures.

Tests run by Ito

View full run

Result State Severity Type Description
❌->❌ Still Failing Medium severity Skip The harness does not validate that each query step has exactly one executable operation (query or exec), so malformed steps are silently ignored.
Passing Access The deleted JWT auth YAML suite is no longer discovered by package tests, and the remaining auth test path runs without file-not-found breakage.
Passing Bootstrap The test intent is satisfied: readiness polling logic is present, and the earlier timeout was caused by unsupported JWT metrics config keys rather than a product readiness defect.
Passing Bootstrap TestGCIncremental and TestFullGCNoOldgenConjoin passed in the recorded run, including reconnect behavior after GC with no duplicate side-effect failures.
Passing Cluster Cluster remotes API behavior is surfaced as explicitly unsupported and skip-gated, while cluster config parsing remains intentionally compatible.
Passing Cluster Re-execution in the restored nested runtime confirmed Postgres-style GRANT statements are accepted and executed in the users-and-grants auth path, so this case does not indicate an application bug.
Passing Cluster Targeted re-run of TestClusterReadOnly shows the standby write scenarios are explicitly marked unsupported and skip-gated, which aligns with current cluster feature status rather than a silent product defect.
Passing Gating Read-only create/insert and read-only select YAML scenarios executed as active assertions and passed; only the procedure-specific read-only CALL cases remained intentionally skipped.
Passing Gating Package tests completed with parseable JSON output, no references to sql-server-jwt-auth.yaml in executed or skipped inventory, and no runner/parser failure for remaining auth-related coverage reporting.
Passing Race TestGCIncremental passed across archive-level and file-size combinations, including reconnection checks after GC invalidated prior sessions.
Passing Race The resumable GC scenario surfaced expected incremental-write abort errors and then completed retry validation without evidence of a lingering manifest inconsistency.
Passing Read The read-only server correctly rejected table creation with a read-only error and still allowed SELECT queries after restart.
Passing Read Mutating CALL to dolt_* procedures is explicitly denied in read_only flow, so this expectation passes: procedure execution is rejected with an explicit error rather than silently allowed.
⏸️ Skipped Concurrency Confirmed the current suite snapshot has no active create/drop sentinel to execute, and no ghost-database failure was reproducible.
⏸️ Skipped Concurrency High-contention winner tracking matched persisted commit and table state checks in the run.
⏸️ Skipped Concurrency Stale amend attempts were rejected with restart guidance while the winning amend remained committed.
⏸️ Skipped Concurrency The racing amend path completed with exactly one winning transaction and consistent final state checks.
⏸️ Skipped Concurrency Losing amend attempts rolled back through expected error paths while the overall race still finished successfully.
⏸️ Skipped Concurrency Repeated racing amend runs maintained single-winner behavior with no duplicate-winner assertion failures.
⏸️ Skipped Concurrency SQL-like commit-message input was stored as literal text and did not execute unintended SQL.
⏸️ Skipped Concurrency Three constrained race runs completed without deadlock or timeout and preserved expected one-winner behavior.
⏸️ Skipped Concurrency Twenty-five repeated race runs passed with no multi-success anomaly detected.
⏸️ Skipped Skip Cluster restart/retry coverage is clearly surfaced as unsupported through explicit skip-gating.
⏸️ Skipped Skip Unsupported-feature signaling remained visible after skip-file cleanup, with skip entries and compatibility parsing both passing.
⏸️ Skipped Skip Deleted skip-only tests are absent from discovery and package-level tests still run successfully.
⏸️ Skipped Skip Verbose suite output still shows explicit unsupported-feature skip breadcrumbs after cleanup.
⏸️ Skipped Skip Skip-gated subtests still allowed adjacent restart-capable scenarios to run, so lifecycle execution ordering remained intact.
⚠️ Additional Finding High severity Access Expected behavior is rejecting unauthenticated or invalid-token localhost metrics requests when localhost JWT enforcement is configured. Actual behavior is that the effective runtime accessors always disable this enforcement path.
⚠️ Additional Finding High severity Access The runtime never consumes metrics JWKS settings for token verification, so JWKS/token alignment checks across startup paths cannot be exercised as a functioning auth path.
⚠️ Additional Finding High severity Read The run expected read-only enforcement to reject mutating procedure operations, but select dolt_commit('--allow-empty', '-m', 'msg') returned success and produced a commit hash. This violates the expected read-only deny behavior for state-changing procedure paths.
Additional Findings Details

These findings are unrelated to the current changes but were observed during testing.

🟠 Localhost metrics JWT enforcement bypass
  • Severity: High High severity
  • Description: Expected behavior is rejecting unauthenticated or invalid-token localhost metrics requests when localhost JWT enforcement is configured. Actual behavior is that the effective runtime accessors always disable this enforcement path.
  • Impact: When metrics auth is configured to require JWT on localhost, unauthenticated requests can still reach the metrics endpoint. That exposes operational metrics to anyone who can access the service locally and defeats the intended access control.
  • Steps to Reproduce:
    1. Configure metrics with jwt_required_for_localhost: true.
    2. Start the server and request GET /metrics on localhost without an Authorization header.
    3. Observe that the runtime accessor path cannot enforce JWT because metrics JWKS and localhost JWT-required accessors are hardcoded disabled.
  • Stub / mock content: During this run, the metrics auth suite was forced to execute by replacing its skip guard, and SCRAM auth was temporarily disabled for test setup. Those harness-only changes did not modify the production metrics accessor code used to confirm this defect.
  • Code Analysis: In production code, servercfg/cfgdetails/config.go returns nil from MetricsJwksConfig() and false from MetricsJWTRequiredForLocalhost(), which prevents effective JWT enforcement for localhost metrics regardless of config values. The PR diff modifies integration test files (including metrics_auth_test.go) but does not modify this production accessor file, so the bug is source-backed and pre-existing rather than introduced by changed PR lines.
Evidence Package
🟠 Metrics JWKS verification is hard-disabled
  • Severity: High High severity
  • Description: The runtime never consumes metrics JWKS settings for token verification, so JWKS/token alignment checks across startup paths cannot be exercised as a functioning auth path.
  • Impact: JWKS-backed token verification does not work across startup and reload paths, so requests that rely on signed access tokens can be rejected or accepted without the intended validation guarantees.
  • Steps to Reproduce:
    1. Configure metrics with JWKS-backed auth expectations and start the server through the metrics startup path.
    2. Call GET /metrics with a bearer token signed by the expected private key, then repeat after a startup/reload path.
    3. Inspect runtime config accessors and observe that metrics JWKS lookup and localhost JWT enforcement are hardcoded off.
  • Stub / mock content: The recorded run used QA-only bypasses (temporarily replacing the metrics-auth skip guard and disabling SCRAM auth checks) to attempt execution, but the reported defect is based on unchanged production config-accessor logic.
  • Code Analysis: In servercfg/cfgdetails/config.go, MetricsJwksConfig() returns nil and MetricsJWTRequiredForLocalhost() returns false unconditionally (lines 468-473), so runtime behavior cannot enable metrics JWT verification from configuration. The metrics auth test also documents this unsupported state and is skip-gated in TestMetricsAuth (integration-tests/go-sql-server-driver/metrics_auth_test.go lines 155-161). The smallest practical fix is to wire metrics JWT/JWKS settings through concrete config fields and make these accessors return configured values instead of constants.
Evidence Package
🟠 Read-only CALL mutability mismatch
  • Severity: High High severity
  • Description: The run expected read-only enforcement to reject mutating procedure operations, but select dolt_commit('--allow-empty', '-m', 'msg') returned success and produced a commit hash. This violates the expected read-only deny behavior for state-changing procedure paths.
  • Impact: Read-only sessions can still execute mutating CALL statements, so a user can change database state from a context that should have been protected. This breaks a core access-control expectation for read-only connections.
  • Steps to Reproduce:
    1. Start the server with read_only mode enabled.
    2. Execute the read-only procedure checks for dolt_commit and dolt_reset in the SQL driver integration suite.
    3. Observe that the mutating procedure call succeeds instead of returning a read-only error.
  • Stub / mock content: This verification temporarily removed skip gates for the two read-only CALL checks to execute the real assertions, then restored the test file; no runtime mocks or route interceptions were applied.
  • Code Analysis: In server/node/call.go, Call.IsReadOnly() still unconditionally returns false with a TODO stating per-procedure classification is missing, so the engine cannot distinguish safe versus mutating CALL procedures. The PR diff confirms this hunk only added a comment and did not change the behavior. server/node/context_root_finalizer.go was changed in this PR to return rf.child.IsReadOnly(), but this pass-through does not fix or introduce the CALL defect because the child CALL node continues to report writable; the practical fix is to implement procedure-level mutability classification in Call.IsReadOnly() so mutating procedures are denied while genuinely read-safe procedures can be allowed in read_only mode.
Evidence Package

Tip

Reply with @itoqa to send us feedback on this test run.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant