Skip to content

Experimental postgres query (PR 2/4): provisioned + JSON/CSV streaming + types#5136

Open
simonfaltum wants to merge 8 commits intosimonfaltum/postgres-query-pr1-scaffoldfrom
simonfaltum/postgres-query-pr2-streaming
Open

Experimental postgres query (PR 2/4): provisioned + JSON/CSV streaming + types#5136
simonfaltum wants to merge 8 commits intosimonfaltum/postgres-query-pr1-scaffoldfrom
simonfaltum/postgres-query-pr2-streaming

Conversation

@simonfaltum
Copy link
Copy Markdown
Member

@simonfaltum simonfaltum commented Apr 30, 2026

PR Stack

  1. #5135 — PR 1: scaffold + autoscaling targeting + text output
  2. PR 2 (this PR)#5136 — provisioned + JSON/CSV streaming + typed values + experimental/libs/sqlcli for output handling
  3. #5138 — PR 3: multi-input + multi-statement rejection + error formatting
  4. #5143 — PR 4: cancellation + timeout + TUI

Stacked on PR 1.

Why

Two things in this PR. The user-facing one: postgres query learns JSON/CSV streaming and provisioned-instance support. The architectural one: aitools query and postgres query had near-identical output-mode handling (same env var, same flag/env precedence, same threshold). Promote the duplication to a shared experimental/libs/sqlcli package before the second consumer ossifies the divergence.

Changes

Architectural: experimental/libs/sqlcli/ is a new package under experimental/libs/ (not libs/) so it inherits the experimental-stability guarantee of its consumers. Exposes:

  • sqlcli.EnvOutputFormat, sqlcli.StaticTableThreshold constants.
  • sqlcli.Format typedef + sqlcli.OutputText/JSON/CSV consts + sqlcli.AllFormats.
  • sqlcli.ResolveFormat — flag > env > default precedence with the explicit-text-on-pipe-is-honoured rule.

aitools query migrates to use sqlcli (pure refactor, no behavior change). postgres query was about to add its own copy of all of this; instead it uses sqlcli from day one.

User-facing changes for postgres query:

  • --target my-instance now resolves a provisioned instance.
  • --output json streams typed values: numbers stay numeric, jsonb stays nested, NaN/Inf/bigints-outside-2^53 become strings.
  • --output csv streams (no buffering).
  • DATABRICKS_OUTPUT_FORMAT honoured.
  • Auto-fallback to JSON when stdout is piped.
  • Duplicate column names get deterministic __N suffixes with a stderr warning.

Also adds cmdio.IsOutputTTY (a small public wrapper around the existing private isTTY) so commands can ask "is stdout a terminal?" without folding in NO_COLOR / TERM=dumb (both of which cmdio.SupportsColor ANDs in for the colour-rendering decision).

Test plan

  • go test ./experimental/aitools/... ./experimental/postgres/... ./experimental/libs/... (unit, sinks, value mapping, format selection, aitools tests still pass after migration)
  • go tool ... golangci-lint run ./experimental/... (0 issues)

This is PR 2 of the experimental postgres query stack. Builds on PR 1's
scaffold to fill in the rest of the single-input output story.

Provisioned support: --target accepts both autoscaling resource paths
(starts with "projects/") and provisioned instance names (everything
else). Granular --project/--branch/--endpoint targeting stays
autoscaling-only. resolveProvisioned validates the instance is in the
AVAILABLE state and has read/write DNS before issuing a token.

Output renderers are now sinks fed by executeOne row-by-row instead
of buffering. textSink keeps buffering (tabwriter needs the widest
cell to align); jsonSink and csvSink stream. jsonSink uses
separator-before-element writing throughout so a mid-stream error can
close the array cleanly via OnError, leaving stdout as parseable JSON
with a partial result.

JSON value rendering follows the typed mapping: numbers stay numeric
inside +- 2^53, become strings outside; NaN/Inf become "NaN"/"Infinity"/
"-Infinity"; timestamps render in RFC3339; jsonb passes through as
json.RawMessage so e.g. {"id": 9007199254740993} keeps its digits;
bytea base64-encodes; everything else falls back to canonical Postgres
text. CSV and text use Postgres' canonical text representation, with
NULL rendered as the literal "NULL" in text and as empty in CSV
(matches psql --csv).

Output mode auto-selection mirrors aitools query: --output text on a
non-TTY stdout falls back to JSON. DATABRICKS_OUTPUT_FORMAT is honoured
when --output is not explicitly set; invalid env values are silently
ignored. Duplicate column names are deterministically renamed (id,
id__2, id__3) with a stderr warning.

Acceptance: argument-errors loses the now-obsolete "provisioned not
yet supported" case; new provisioned-targeting test exercises
not-AVAILABLE / no-DNS / 404 paths via the SDK testserver mock.

Co-authored-by: Isaac
- Fix non-finite float text: textValue had no float branch and fell
  through to fmt.Sprintf, which emits +Inf/-Inf instead of Postgres'
  Infinity/-Infinity. Added the explicit float case + tests.
- Emit JSON object keys in column order, not alphabetical. The map
  approach inadvertently sorted keys; switched to manual ordered
  emission (write '{', encode key:value pairs in column order, write
  '}'). Added a regression test with non-alphabetical column names.
- Honor --output text on a pipe instead of silently rewriting to JSON.
  Repo rule says "reject incompatible inputs early; never silently
  ignore a flag the current mode can't honor". Auto-fallback now only
  fires when the flag was not explicitly set (or not pinned by env).
- Trim impossible Go types from jsonValue (pgx never decodes int8 /
  uint8/16/32 / uint64 from PG columns).
- Drop the redundant ReadWriteDns guard in resolveProvisioned; an
  AVAILABLE Lakebase instance is documented to have DNS, and cmd/psql
  doesn't carry the same guard.
- Build the unsupported-format error from allOutputFormats so the
  message stays in sync if a fourth format is added.
- Update execute.go's QueryExecModeExec doc to acknowledge that we now
  call rows.Values() (not RawValues), so all sinks see Go-typed input.
- Collapse empty rows-producing JSON to "[\n]\n" and matching OnError.
- Add stderr warning helper (commandTagRowCount now covered for
  MERGE/COPY/FETCH/MOVE).
- Test gaps: text +Inf, text finite float, JSON column order, OnError
  for csv/text sinks, CSV with embedded newline + quote.

Co-authored-by: Isaac
- Doc fix: textSink.OnError doc said "prints whatever rows have been
  collected" but text mode buffers everything to End. New doc states
  the buffered partial result is discarded on iteration error.
- Doc fix: textValue float comment overstated psql parity. Tightened
  to acknowledge Go's 'g' format may differ from psql in exponential
  vs fixed notation around the boundary.
- Tighten OnError contract: explicitly states it is NOT called when
  Begin itself errors.
- Replace switch-by-format in isKnownOutputFormat with slices.Contains
  on allOutputFormats so adding a fourth format is one edit.
- Tighten command-only JSON tests from JSONEq (key-order ignored) to
  byte-equal so a future field addition is caught.
- Tighten JSONSink_OnError tests to byte-equal; add the
  Begin-but-no-rows case which exercises the rowsWritten==0 branch.

Co-authored-by: Isaac
…um/postgres-query-pr2-streaming

# Conflicts:
#	acceptance/cmd/experimental/postgres/query/argument-errors/output.txt
#	acceptance/cmd/experimental/postgres/query/argument-errors/script
Both aitools query and postgres query had near-identical output-mode
selection: same DATABRICKS_OUTPUT_FORMAT env var, same flag-vs-env
precedence, same staticTableThreshold=30, same Format type with
text/json/csv values.

Promote the shared bits to experimental/libs/sqlcli:

  - sqlcli.EnvOutputFormat, sqlcli.StaticTableThreshold consts
  - sqlcli.Format typedef + sqlcli.OutputText/JSON/CSV consts
  - sqlcli.AllFormats slice (canonical order for completions)
  - sqlcli.ResolveFormat: handles flag > env > default precedence with
    the explicit-text-on-pipe-is-honoured rule

Both consumers now import sqlcli. The package lives under
experimental/libs/ rather than libs/ so it inherits the experimental-
stability guarantee of its consumers; when both commands graduate, the
package can be promoted alongside.

The aitools migration is a pure refactor (no behavior change). The
postgres command's output.go and output_test.go are deleted; tests
moved to experimental/libs/sqlcli.

Co-authored-by: Isaac
@arsenyinfo
Copy link
Copy Markdown
Contributor

No findings.


🔍 Reviewed by nitpicker

…sion, control-char escape

Three P2 findings from the nitpicker bot, all in code introduced or
strengthened in this PR:

- stdoutTTY now uses cmdio.IsOutputTTY (a new tiny public helper that
  wraps the existing private isTTY) instead of cmdio.SupportsColor.
  SupportsColor folds in NO_COLOR / TERM=dumb, which are colour
  preferences and have nothing to do with whether stdout is a pipe;
  using it for the auto-fall-back-to-JSON decision silently demoted
  interactive text output to JSON for users with NO_COLOR set on a
  real terminal. IsOutputTTY is the right primitive for this.
- jsonSink dup-column rename: the previous logic generated id__2 for
  the second `id` without checking whether id__2 was already taken by
  the original column list. A query returning ["id", "id__2", "id"]
  produced two id__2 keys. Now we keep bumping the suffix until unique.
- textSink escapes \t, \n, \r in cell values before tabwriter sees
  them. tabwriter uses \t as a column boundary and \n as a row
  boundary, so an embedded tab silently shifted subsequent columns and
  an embedded newline split a logical row across multiple output lines.
  psql does the same backslash-letter escape.

Co-authored-by: Isaac
PR 1's lint fix dropped the entire provisioned.go because PR 1 had no
caller. Re-add a slim version with just GetProvisioned and
ProvisionedCredential (the two functions resolveProvisioned actually
calls). Drop ListProvisionedInstances and AutoSelectProvisioned: they
were originally intended for cmd/psql interactive selection, but the
cmd/psql refactor was reverted, so they have no caller anywhere.

Co-authored-by: Isaac
@simonfaltum simonfaltum temporarily deployed to test-trigger-is May 1, 2026 07:02 — with GitHub Actions Inactive
@simonfaltum simonfaltum temporarily deployed to test-trigger-is May 1, 2026 07:02 — with GitHub Actions Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants