
#114: pdo_pgsql 029 — drive a probing query in the pool-count poll loop#115

Merged
EdmondDantes merged 2 commits into `main` from `108-channel0-send-returns-without-a-waiting-receiver` on May 10, 2026

Conversation

@EdmondDantes
Contributor

The race in 029-pdo_pgsql_pool_killed_concurrent.phpt: when coroA's exec("SELECT pg_sleep(5)") happens to acquire the *other* idle slot (not the one whose backend pid we killed), the killed slot stays in the idle list and nothing ever marks it broken, so pool->count() reports 2 forever.

The driver-side fix (61e75f9 — pdo_pgsql_pool_before_acquire runs PQconsumeInput+PQstatus on every acquire from idle) is necessary but not sufficient on its own: the test polled count() in a loop that performed no pool operation, so the scrub never fired.

Adding a try { $pdo->query("SELECT 1")->fetch(); } per iteration drives an acquire→before_acquire→destroy sweep that evicts the orphaned slot.
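The revised poll loop can be sketched as follows. This is an illustrative reconstruction, not the test's literal code: the variable names (`$pool`, `$pdo`, `$deadline`) and the 5 s timeout are assumptions.

```php
<?php
// Hypothetical sketch of the fixed poll loop in
// 029-pdo_pgsql_pool_killed_concurrent.phpt. Names and timeouts are
// illustrative; only the probe-per-iteration shape matches the fix.
$deadline = microtime(true) + 5.0;
while ($pool->count() > 1 && microtime(true) < $deadline) {
    try {
        // Drive an acquire -> before_acquire -> destroy sweep: acquiring
        // from idle runs PQconsumeInput + PQstatus, so the killed slot is
        // detected as broken and evicted instead of sitting idle forever.
        $pdo->query("SELECT 1")->fetch();
    } catch (PDOException $e) {
        // The probe itself may land on the dead slot; that is fine --
        // the failed acquire is exactly what marks the slot broken.
    }
    usleep(10_000); // 10 ms between polls
}
```

Without the probe, the loop body touched nothing in the pool, so `before_acquire` had no occasion to run and `count()` could never change.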

Repro under -j4 --repeat 2 with opcache JIT, the observer, and 12× external CPU load: the failure rate drops from ~5% FAIL to 0/250 runs with the combined fix.

Also bumps CHANGELOG with a #114 entry.

…reads_connected

Under run-tests.php with 16 workers and --repeat 2, the test reported
"potential leak (N extra connections)" ~50% of the time. Cause:
SHOW STATUS LIKE 'Threads_connected' is server-global — it counts every
client of the MySQL server, including the parallel test workers running
other pdo_mysql tests at the same time.

Replace the global delta with a process-local check: collect the
connection IDs the test created via SELECT CONNECTION_ID() inside each
coroutine, then poll information_schema.PROCESSLIST until those exact
IDs disappear (up to ~1s). A real leak surfaces as "still live" with
the actual IDs; parallel workers' connections are ignored because their
IDs aren't in our list.
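The process-local check can be sketched like this. It is an illustrative reconstruction under stated assumptions: `$connections` holds the PDO handles the test opened, `$probe` is a separate monitoring connection, and the ~1 s poll budget follows the description above; none of these names are the test's actual identifiers.

```php
<?php
// Hypothetical sketch of the process-local leak check. Assumes
// $connections (the test's own PDO handles) and $probe (a separate
// monitoring connection) already exist.
$ids = [];
foreach ($connections as $conn) {
    // Record the server-side thread id of each connection this test opened.
    $ids[] = (int) $conn->query("SELECT CONNECTION_ID()")->fetchColumn();
}
unset($connections, $conn); // release our handles so the server can reap them

$deadline = microtime(true) + 1.0;
do {
    $in   = implode(',', $ids);
    $live = $probe->query(
        "SELECT ID FROM information_schema.PROCESSLIST WHERE ID IN ($in)"
    )->fetchAll(PDO::FETCH_COLUMN);
    if (!$live) {
        echo "no leaked connections\n";
        break;
    }
    usleep(50_000); // 50 ms between polls
} while (microtime(true) < $deadline);

if (!empty($live)) {
    // A real leak: only ids from OUR list can appear here, so parallel
    // workers' connections cannot produce a false positive.
    echo "still live: " . implode(',', $live) . "\n";
}
```

The key design point is that the check filters PROCESSLIST by the exact ids the test created, so the global server state that other `-jN` workers perturb never enters the comparison.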

Also drops the "final connections / connection difference" lines from
EXPECTF — the diff was only meaningful under the global counter.

Repro: -j$(nproc) --repeat 2 + opcache JIT under 12x CPU load
  before: 15/30 FAIL
  after:   0/30
@EdmondDantes EdmondDantes merged commit 6833f66 into main May 10, 2026
6 checks passed
@EdmondDantes EdmondDantes deleted the 108-channel0-send-returns-without-a-waiting-receiver branch May 10, 2026 19:56
