Improve concurrent request limit error messaging and documentation #3842
KyleAMathews wants to merge 6 commits into main from
…ntrol limits

Root cause: 200 clients × 13 collections = 2,600 concurrent long-poll connections, exceeding the default 1,000 existing-request admission control limit. The `Vary: Authorization` header in their proxy also defeats CDN request collapsing.

https://claude.ai/code/session_01ASiu2KpoheTmvkpJi1Di7c
- Change error code from "overloaded" to "concurrent_request_limit_exceeded" with a message that includes the actual limit, the request kind (initial vs existing), and pointers to ELECTRIC_MAX_CONCURRENT_REQUESTS and the troubleshooting docs
- Add troubleshooting section for 503 concurrent request limit exceeded
- Document ELECTRIC_MAX_CONCURRENT_REQUESTS and ELECTRIC_LONG_POLL_TIMEOUT in the config reference
- Remove investigation file (analysis folded into docs)

https://claude.ai/code/session_01ASiu2KpoheTmvkpJi1Di7c
✅ Deploy Preview for electric-next ready!
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #3842 +/- ##
===========================================
+ Coverage 75.75% 87.07% +11.32%
===========================================
Files 11 25 +14
Lines 693 2314 +1621
Branches 174 583 +409
===========================================
+ Hits 525 2015 +1490
- Misses 167 297 +130
- Partials 1 2 +1
The error code and limit value are enough for an engineer to find the docs. The env var name and URL are noise for end-user clients. https://claude.ai/code/session_01ASiu2KpoheTmvkpJi1Di7c
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
website/docs/api/config.md (outdated)

When the limit is exceeded, Electric responds with `503` and a `Retry-After` header. See the [troubleshooting guide](/docs/guides/troubleshooting#_503-mdash-concurrent-request-limit-exceeded) for details.

Each `live=true` long-poll request holds a permit for up to 20 seconds, so the effective limit is determined by the number of concurrent shape subscriptions across all connected clients. Putting a [CDN with request collapsing](/docs/api/http#collapsing-live-requests) in front of Electric is the recommended way to handle high connection counts.
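To make the sizing argument concrete, here is a rough sketch of the arithmetic from the root-cause analysis earlier in this PR (200 clients × 13 collections against a limit of 1,000). The function names are hypothetical, and the estimate assumes every subscribed shape keeps one live long-poll request open per client with no CDN request collapsing in front of Electric.

```python
def estimated_live_requests(clients: int, shapes_per_client: int) -> int:
    """Worst-case count of concurrent live long-poll requests.

    Assumes one open request per (client, shape) pair and no request
    collapsing upstream, so this is an upper bound rather than a measurement.
    """
    return clients * shapes_per_client


def exceeds_limit(clients: int, shapes_per_client: int, existing_limit: int) -> bool:
    """True when the estimated demand is above the configured 'existing' limit."""
    return estimated_live_requests(clients, shapes_per_client) > existing_limit


if __name__ == "__main__":
    # Numbers taken from the root-cause analysis in this PR.
    demand = estimated_live_requests(200, 13)
    print(demand, exceeds_limit(200, 13, existing_limit=1000))
```

With request collapsing in place the real number reaching Electric can be much lower, so treat the result as a ceiling when choosing a limit.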
> long-poll request holds a permit for up to 20 seconds

This phrase is mentioned twice in this PR. "Permit" looks out of place here. The following sounds more natural, IMO:

> long-poll request holds the connection open for up to 20 seconds.
ELECTRIC_MAX_CONCURRENT_REQUESTS='{"initial": 500, "existing": 3000}'
```

Live long-poll connections are lightweight Erlang processes, so most hardware can handle higher limits. The default of 1,000 is conservative.
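The value shown in the docs snippet is a JSON object with one limit per request kind. The sketch below only illustrates that shape and is not Electric's own parser (the sync service is written in Elixir). The function name is hypothetical, and requiring both keys is an assumption of this sketch that may be stricter than what Electric accepts.

```python
import json


def parse_request_limits(raw: str) -> dict[str, int]:
    """Parse a value such as '{"initial": 500, "existing": 3000}'.

    Illustrative only: this sketch requires both keys to be positive integers.
    """
    limits = json.loads(raw)
    if not isinstance(limits, dict):
        raise ValueError("expected a JSON object")
    for kind in ("initial", "existing"):
        value = limits.get(kind)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"{kind!r} must be a positive integer, got {value!r}")
    return {"initial": limits["initial"], "existing": limits["existing"]}


if __name__ == "__main__":
    print(parse_request_limits('{"initial": 500, "existing": 3000}'))
```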
Could be a discussion for a different PR, but this documentation doesn't explain why the limit is conservative. The rest of the narrative mentions things like "your cpu and mem could be fine" and "live connections are lightweight process". So why artificially limit the concurrency to 1000 then?

I think it should be 10k minimum.

- Partly because there's no logging on the server side when the limit is reached. The developer will only learn about it being hit if they have proper telemetry/alerting in place in their proxy between clients and Electric.
- Partly because having a CDN in front doesn't guarantee that Electric will be hit by fewer connections, when request collapsing cannot be applied to requests with different shape params.
- And partly because 10k simultaneous connections used to be a challenge, but at this point it should be table stakes even on modest HW.

> The term C10k was coined in 1999 by software engineer Dan Kegel, [...] serving 10,000 clients at once over 1 gigabit per second in that year.
>
> As of 2025, the problem has long since been solved, with the number of possible connections to a single computer being in the millions.
Address review feedback: increase the default concurrent existing request limit from 1,000 to 10,000 and replace "holds a permit" with "holds the connection open" in user-facing docs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@alco suggestions applied!
Improves the 503 error response when Electric's concurrent request limit is exceeded, replacing a generic "overloaded" message with specific, actionable information. Adds documentation for the `ELECTRIC_MAX_CONCURRENT_REQUESTS` config option and a troubleshooting guide.

Root Cause

When too many clients connect simultaneously and exhaust the admission control permits, Electric returns a 503 with `{"code": "overloaded", "message": "Server is currently overloaded, please retry"}`. This tells the operator nothing about which limit was hit, what the limit is, or how to fix it, which leads to unnecessary debugging.

Approach
Error message improvement: The error code is now `"concurrent_request_limit_exceeded"` and the message includes the request kind (`initial` or `existing`) and the configured limit value. This lets operators immediately understand which pool is saturated and what to change.

Documentation: Added two new sections:

- `ELECTRIC_MAX_CONCURRENT_REQUESTS` explaining the two-pool model (initial vs existing requests)

Key invariant: The error message format must match what the documentation shows. Review caught a mismatch from an intermediate commit and corrected it.
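As an illustration of how a client might consume the new error code, here is a minimal sketch of a retry-delay helper. It is not part of this PR: the function name and the one-second fallback are hypothetical, it assumes `Retry-After` carries a number of seconds rather than an HTTP date, and it treats the message text as opaque because only the code is meant to be matched programmatically.

```python
from typing import Mapping, Optional

# Error code introduced by this PR.
LIMIT_CODE = "concurrent_request_limit_exceeded"


def retry_delay_seconds(
    status: int,
    headers: Mapping[str, str],
    body: Mapping[str, object],
    default: float = 1.0,  # hypothetical fallback when no usable header is present
) -> Optional[float]:
    """Return the wait before retrying, or None if this is not a limit error."""
    if status != 503 or body.get("code") != LIMIT_CODE:
        return None
    raw = headers.get("Retry-After") or headers.get("retry-after")
    if raw is None:
        return default
    try:
        # Assumes a delay in seconds; an HTTP-date value falls back to the default.
        return max(0.0, float(raw))
    except ValueError:
        return default


if __name__ == "__main__":
    print(retry_delay_seconds(503, {"Retry-After": "5"}, {"code": LIMIT_CODE}))
```

Matching on the code instead of the message keeps clients working if the wording of the message changes again.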
Verification
The test verifies the new error code and that the message includes `"Concurrent existing request limit exceeded"`.

Files changed
- `packages/sync-service/lib/electric/plug/serve_shape_plug.ex`: Changed error code and message to include request kind and limit
- `packages/sync-service/test/electric/plug/router_test.exs`: Updated assertions for new error format
- `website/docs/api/config.md`: Added `ELECTRIC_MAX_CONCURRENT_REQUESTS` documentation
- `website/docs/guides/troubleshooting.md`: Added 503 concurrent request limit troubleshooting section

🤖 Generated with Claude Code