Memory: configure and enforce unfreed memory threshold#43783
Memory: configure and enforce unfreed memory threshold#43783jronak wants to merge 1 commit intoenvoyproxy:mainfrom
Conversation
Signed-off-by: Ronak Jain <ronakjainc@gmail.com>
|
CC @envoyproxy/api-shepherds: Your approval is needed for changes made to |
|
@nezdolik ping for "stalled" bit. |
adisuissa
left a comment
There was a problem hiding this comment.
Thanks!
Left an API comment.
| // Minimum unfreed memory bytes threshold that triggers memory release during admin handler. | ||
| // If the unfreed memory bytes is less than this value, no memory release will occur. | ||
| // If set to ``0``, the threshold is disabled and memory release will be attempted whenever unfreed | ||
| // memory is available (subject to allocator support). Defaults to 100 MB. |
There was a problem hiding this comment.
Question about the default value: why was it set to 100MB, and is this the actual default value that was used prior to this PR? If the current default value is different, then this is a semi-breaking-change, and should be clearly declared as such (in the release notes).
In general, it is preferred to keep the same default behavior (so Envoy users that upgrade their binaries don't see an unexpected change), unless the new default value makes more sense for all/most deployments.
There was a problem hiding this comment.
+1 It looks like this PR changes the prior behavior of heap shrinker (also the amount of bytes that will be released). Can we keep it aligned with past default behavior (when this config option is not enabled)?
There was a problem hiding this comment.
I re-checked the code this PR was based on, and the 100MB threshold was already the existing hardcoded behavior in the old shrink path, so that part of the change was intended to preserve the previous default.
I agree with @nezdolik’s point about the behavioral change on the tcmalloc path. The intent here was to keep the existing behavior backward compatible while moving the ownership/plumbing, but this refactor ended up switching the immediate shrink path away from the old fixed release behavior. That was not intentional.
|
|
||
| void AllocatorManager::releaseFreeMemory() { | ||
| #if defined(TCMALLOC) | ||
| tcmallocRelease(); |
There was a problem hiding this comment.
hm, with this new code path this will only release bytes_to_release_ amount of bytes, while before it would try to release maxUnfreedMemoryBytes().
| // Minimum unfreed memory bytes threshold that triggers memory release during admin handler. | ||
| // If the unfreed memory bytes is less than this value, no memory release will occur. | ||
| // If set to ``0``, the threshold is disabled and memory release will be attempted whenever unfreed | ||
| // memory is available (subject to allocator support). Defaults to 100 MB. |
There was a problem hiding this comment.
+1 It looks like this PR changes the prior behavior of heap shrinker (also the amount of bytes that will be released). Can we keep it aligned with past default behavior (when this config option is not enabled)?
|
could the latest heapshrinker achieve similar target? See #43321 |
|
/wait |
Given the overlap with the newer changes, I don’t see value in continuing this PR in its current shape. Right now we have multiple memory-release paths with slightly different ownership and semantics, which is part of why If maintainers think that direction still makes sense, I can cut a smaller follow-up focused only on that refactor. |
Commit Message: Currently xds and admin can trigger memory release from either of the allocators by invoking the utils static APIs. These APIs have static thresholds today, making it difficult to configure the limits using bootstrap configuration.
In production, our xds configs have large footprints causing memory churns over 800MB. Repeated churns cause tcmalloc central collector stalls due to large cleanup operations (especially gperftools), impacting worker threads waiting for memory from central free list due to thread cache exhaustion.
This PR introduces a configurable minimum_unfreed_memory_bytes_threshold to the MemoryAllocatorManager bootstrap configuration. Memory release is only triggered when unfreed memory exceeds this threshold, defaulting to 100MB. Setting it to 0 disables the threshold check.
This is the first PR in a stack, scoped to core threshold configuration and enforcement. Follow-up PRs will pass
MemoryAllocatorManagerthrough the subscription factory, gRPC mux, and xDS manager, and remove the remaining static Memory::Utils calls.Split from #41892 for easier review.
Additional Description: NA
Risk Level: Low
Testing: Unit test
Docs Changes: NA
Release Notes: NA
Platform Specific Features: NA