Experimental postgres query (PR 2/4): provisioned + JSON/CSV streaming + types#5136
Open
simonfaltum wants to merge 8 commits intosimonfaltum/postgres-query-pr1-scaffoldfrom
Open
Conversation
This is PR 2 of the experimental postgres query stack. Builds on PR 1's
scaffold to fill in the rest of the single-input output story.
Provisioned support: --target accepts both autoscaling resource paths
(starts with "projects/") and provisioned instance names (everything
else). Granular --project/--branch/--endpoint targeting stays
autoscaling-only. resolveProvisioned validates the instance is in the
AVAILABLE state and has read/write DNS before issuing a token.
Output renderers are now sinks fed by executeOne row-by-row instead
of buffering. textSink keeps buffering (tabwriter needs the widest
cell to align); jsonSink and csvSink stream. jsonSink uses
separator-before-element writing throughout so a mid-stream error can
close the array cleanly via OnError, leaving stdout as parseable JSON
with a partial result.
JSON value rendering follows the typed mapping: numbers stay numeric
inside +- 2^53, become strings outside; NaN/Inf become "NaN"/"Infinity"/
"-Infinity"; timestamps render in RFC3339; jsonb passes through as
json.RawMessage so e.g. {"id": 9007199254740993} keeps its digits;
bytea base64-encodes; everything else falls back to canonical Postgres
text. CSV and text use Postgres' canonical text representation, with
NULL rendered as the literal "NULL" in text and as empty in CSV
(matches psql --csv).
Output mode auto-selection mirrors aitools query: --output text on a
non-TTY stdout falls back to JSON. DATABRICKS_OUTPUT_FORMAT is honoured
when --output is not explicitly set; invalid env values are silently
ignored. Duplicate column names are deterministically renamed (id,
id__2, id__3) with a stderr warning.
Acceptance: argument-errors loses the now-obsolete "provisioned not
yet supported" case; new provisioned-targeting test exercises
not-AVAILABLE / no-DNS / 404 paths via the SDK testserver mock.
Co-authored-by: Isaac
4 tasks
- Fix non-finite float text: textValue had no float branch and fell
through to fmt.Sprintf, which emits +Inf/-Inf instead of Postgres'
Infinity/-Infinity. Added the explicit float case + tests.
- Emit JSON object keys in column order, not alphabetical. The map
approach inadvertently sorted keys; switched to manual ordered
emission (write '{', encode key:value pairs in column order, write
'}'). Added a regression test with non-alphabetical column names.
- Honor --output text on a pipe instead of silently rewriting to JSON.
Repo rule says "reject incompatible inputs early; never silently
ignore a flag the current mode can't honor". Auto-fallback now only
fires when the flag was not explicitly set (or not pinned by env).
- Trim impossible Go types from jsonValue (pgx never decodes int8 /
uint8/16/32 / uint64 from PG columns).
- Drop the redundant ReadWriteDns guard in resolveProvisioned; an
AVAILABLE Lakebase instance is documented to have DNS, and cmd/psql
doesn't carry the same guard.
- Build the unsupported-format error from allOutputFormats so the
message stays in sync if a fourth format is added.
- Update execute.go's QueryExecModeExec doc to acknowledge that we now
call rows.Values() (not RawValues), so all sinks see Go-typed input.
- Collapse empty rows-producing JSON to "[\n]\n" and matching OnError.
- Add stderr warning helper (commandTagRowCount now covered for
MERGE/COPY/FETCH/MOVE).
- Test gaps: text +Inf, text finite float, JSON column order, OnError
for csv/text sinks, CSV with embedded newline + quote.
Co-authored-by: Isaac
- Doc fix: textSink.OnError doc said "prints whatever rows have been collected" but text mode buffers everything to End. New doc states the buffered partial result is discarded on iteration error. - Doc fix: textValue float comment overstated psql parity. Tightened to acknowledge Go's 'g' format may differ from psql in exponential vs fixed notation around the boundary. - Tighten OnError contract: explicitly states it is NOT called when Begin itself errors. - Replace switch-by-format in isKnownOutputFormat with slices.Contains on allOutputFormats so adding a fourth format is one edit. - Tighten command-only JSON tests from JSONEq (key-order ignored) to byte-equal so a future field addition is caught. - Tighten JSONSink_OnError tests to byte-equal; add the Begin-but-no-rows case which exercises the rowsWritten==0 branch. Co-authored-by: Isaac
This was referenced Apr 30, 2026
…um/postgres-query-pr2-streaming # Conflicts: # acceptance/cmd/experimental/postgres/query/argument-errors/output.txt # acceptance/cmd/experimental/postgres/query/argument-errors/script
Both aitools query and postgres query had near-identical output-mode
selection: same DATABRICKS_OUTPUT_FORMAT env var, same flag-vs-env
precedence, same staticTableThreshold=30, same Format type with
text/json/csv values.
Promote the shared bits to experimental/libs/sqlcli:
- sqlcli.EnvOutputFormat, sqlcli.StaticTableThreshold consts
- sqlcli.Format typedef + sqlcli.OutputText/JSON/CSV consts
- sqlcli.AllFormats slice (canonical order for completions)
- sqlcli.ResolveFormat: handles flag > env > default precedence with
the explicit-text-on-pipe-is-honoured rule
Both consumers now import sqlcli. The package lives under
experimental/libs/ rather than libs/ so it inherits the experimental-
stability guarantee of its consumers; when both commands graduate, the
package can be promoted alongside.
The aitools migration is a pure refactor (no behavior change). The
postgres command's output.go and output_test.go are deleted; tests
moved to experimental/libs/sqlcli.
Co-authored-by: Isaac
Contributor
|
No findings. 🔍 Reviewed by nitpicker |
…sion, control-char escape Three P2 findings from the nitpicker bot, all in code introduced or strengthened in this PR: - stdoutTTY now uses cmdio.IsOutputTTY (a new tiny public helper that wraps the existing private isTTY) instead of cmdio.SupportsColor. SupportsColor folds in NO_COLOR / TERM=dumb, which are colour preferences and have nothing to do with whether stdout is a pipe; using it for the auto-fall-back-to-JSON decision silently demoted interactive text output to JSON for users with NO_COLOR set on a real terminal. IsOutputTTY is the right primitive for this. - jsonSink dup-column rename: the previous logic generated id__2 for the second `id` without checking whether id__2 was already taken by the original column list. A query returning ["id", "id__2", "id"] produced two id__2 keys. Now we keep bumping the suffix until unique. - textSink escapes \t, \n, \r in cell values before tabwriter sees them. tabwriter uses \t as a column boundary and \n as a row boundary, so an embedded tab silently shifted subsequent columns and an embedded newline split a logical row across multiple output lines. psql does the same backslash-letter escape. Co-authored-by: Isaac
arsenyinfo
approved these changes
Apr 30, 2026
…um/postgres-query-pr2-streaming
PR 1's lint fix dropped the entire provisioned.go because PR 1 had no caller. Re-add a slim version with just GetProvisioned and ProvisionedCredential (the two functions resolveProvisioned actually calls). Drop ListProvisionedInstances and AutoSelectProvisioned: they were originally intended for cmd/psql interactive selection, but the cmd/psql refactor was reverted, so they have no caller anywhere. Co-authored-by: Isaac
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
PR Stack
experimental/libs/sqlclifor output handlingStacked on PR 1.
Why
Two things in this PR. The user-facing one: postgres query learns JSON/CSV streaming and provisioned-instance support. The architectural one: aitools query and postgres query had near-identical output-mode handling (same env var, same flag/env precedence, same threshold). Promote the duplication to a shared
experimental/libs/sqlclipackage before the second consumer ossifies the divergence.Changes
Architectural:
experimental/libs/sqlcli/is a new package underexperimental/libs/(notlibs/) so it inherits the experimental-stability guarantee of its consumers. Exposes:sqlcli.EnvOutputFormat,sqlcli.StaticTableThresholdconstants.sqlcli.Formattypedef +sqlcli.OutputText/JSON/CSVconsts +sqlcli.AllFormats.sqlcli.ResolveFormat— flag > env > default precedence with the explicit-text-on-pipe-is-honoured rule.aitools query migrates to use sqlcli (pure refactor, no behavior change). postgres query was about to add its own copy of all of this; instead it uses sqlcli from day one.
User-facing changes for postgres query:
--target my-instancenow resolves a provisioned instance.--output jsonstreams typed values: numbers stay numeric, jsonb stays nested, NaN/Inf/bigints-outside-2^53 become strings.--output csvstreams (no buffering).DATABRICKS_OUTPUT_FORMAThonoured.__Nsuffixes with a stderr warning.Also adds
cmdio.IsOutputTTY(a small public wrapper around the existing privateisTTY) so commands can ask "is stdout a terminal?" without folding inNO_COLOR/TERM=dumb(both of whichcmdio.SupportsColorANDs in for the colour-rendering decision).Test plan
go test ./experimental/aitools/... ./experimental/postgres/... ./experimental/libs/...(unit, sinks, value mapping, format selection, aitools tests still pass after migration)go tool ... golangci-lint run ./experimental/...(0 issues)