
fix: add depth limit to EventSerializer to prevent hangs on complex objects #1577

Open
BKDDFS wants to merge 2 commits into langfuse:main from BKDDFS:fix/serializer-depth-limit

Conversation

@BKDDFS commented Mar 26, 2026

Summary

EventSerializer.default() recursively traverses __dict__ and __slots__ of arbitrary objects without a depth limit. When @observe() captures function arguments containing objects like google.genai.Client (which hold aiohttp sessions, connection pools, and threading locks), json.dumps can block indefinitely.

Root cause

_serialize() in create_span_attributes calls json.dumps(obj, cls=EventSerializer). The serializer's __dict__ fallback recurses into every attribute of every object. For object graphs like aiohttp.ClientSession → TCPConnector → internal locks, this either hangs (lock contention / blocking attribute access) or enters extremely deep recursion.

Fix

Add a _MAX_DEPTH = 20 depth counter to EventSerializer:

  • default() increments/decrements _depth on each call
  • When _depth >= _MAX_DEPTH, returns a <TypeName> placeholder instead of recursing
  • Serialization logic moved to _default_inner() — no behavior changes
  • encode() resets _depth at the start of each serialization

Workaround (current)

@observe(capture_input=False) — disables automatic input capture entirely.

Tests

3 new tests:

  • test_deeply_nested_object_does_not_hang — objects with connection pools/locks
  • test_max_depth_returns_type_name — __dict__ deep nesting truncation
  • test_deeply_nested_slots_object_is_truncated — __slots__ deep nesting truncation

All 21 tests pass (18 existing + 3 new).

Disclaimer: Experimental PR review

Greptile Summary

This PR fixes a real and reproducible hang in EventSerializer by introducing a _MAX_DEPTH = 20 counter that short-circuits recursion when serializing deeply nested objects (e.g. google.genai.Client holding aiohttp session internals). The core mechanism — incrementing _depth in default() and decrementing it in a finally block — is correct and well-structured.

Key changes:

  • default() is now a thin depth-guard that delegates to _default_inner(), preserving all existing serialization behavior unchanged.
  • encode() resets _depth = 0 before each top-level call, which is a good defensive measure.
  • Three new tests cover the main scenarios: hang prevention, __dict__ depth truncation, and __slots__ depth truncation.

Issues found:

  • The __slots__ branch at lines 153–156 calls self.default({...}) for the intermediate dict it constructs. Because default() also increments _depth, each __slots__ object consumes 2 depth units per level instead of 1, cutting the effective limit from 20 to ~10 for __slots__-based objects. Changing that single call to _default_inner({...}) would make behavior consistent.
  • The test_deeply_nested_slots_object_is_truncated assertion bound (_MAX_DEPTH + 5 = 25) is too loose to catch regressions in truncation depth for the slots case.

Confidence Score: 4/5

Safe to merge — the hang is fixed and no existing behavior is regressed; the remaining issues are a minor depth inconsistency for __slots__ objects and a loose test assertion.

The fix correctly solves the stated problem (infinite recursion / hang on complex objects) and all 21 tests pass. The only issues are a double-depth-counting side-effect in the __slots__ path (effective limit ~10 instead of 20) and an overly permissive test assertion — neither affects the primary fix or causes correctness failures in normal usage.

The affected locations are langfuse/_utils/serializer.py lines 153–156 (__slots__ handling) and the corresponding test assertion at tests/test_serializer.py:267.

Important Files Changed

Filename Overview
langfuse/_utils/serializer.py Adds _MAX_DEPTH = 20 depth limiter to prevent infinite recursion/hangs; logic is correct for __dict__-based objects but __slots__ objects incur double depth consumption (consuming 2 units per level instead of 1), halving their effective limit to ~10.
tests/test_serializer.py Adds 3 new tests covering hang prevention, __dict__ truncation, and __slots__ truncation; the slots test has an overly loose assertion (_MAX_DEPTH + 5) that doesn't tightly verify the truncation depth.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["encode(obj)"] --> B["_depth = 0\nseen.clear()"]
    B --> C["default(obj)"]
    C --> D{{"_depth >= _MAX_DEPTH?"}}
    D -- Yes --> E["return '<TypeName>'"]
    D -- No --> F["_depth += 1\n_default_inner(obj)"]
    F --> G{{"Type of obj?"}}
    G -- "datetime / UUID / bytes / etc." --> H["Return primitive"]
    G -- "dict" --> I["self.default(k)\nself.default(v)\nfor each item"]
    G -- "list / Sequence" --> J["self.default(item)\nfor each item"]
    G -- "__slots__" --> K["self.default(dict of slots)\n⚠️ extra depth unit"]
    G -- "__dict__" --> L{{"id(obj) in seen?"}}
    L -- Yes --> M["return TypeName (circular ref)"]
    L -- No --> N["seen.add(id)\nself.default(v)\nfor each attr\nseen.remove(id)"]
    G -- "other" --> O["return '<TypeName>'"]
    I --> P["_depth -= 1\n(finally)"]
    J --> P
    K --> P
    N --> P
    H --> P
    M --> P
    O --> P
    P --> Q["return result to encode()"]
    Q --> R["super().encode(result)"]

Comments Outside Diff (1)

  1. langfuse/_utils/serializer.py, line 153-156 (link)

    P2 __slots__ path consumes double the depth budget per object level

    Because the __slots__ branch calls self.default({...}) for the intermediate dict, each __slots__ object consumes 2 depth units (one for the object itself, one for the constructed dict), whereas a __dict__-based object consumes only 1. In practice this halves the effective depth limit for __slots__ objects: they truncate at approximately 10 levels of nesting rather than the intended 20.

    Tracing for a chain of SlotLevel objects:

    • default(SlotLevel_N) runs at _depth = 2*(N-1)
    • default({"child": SlotLevel_{N+1}}) runs at _depth = 2*(N-1)+1
    • Cutoff: 2*(N-1) ≥ 20 → N ≥ 11 (truncates at level 11, not 21)

    A lightweight fix is to call _default_inner directly for the constructed dict rather than routing it back through default():

    if hasattr(obj, "__slots__"):
        return self._default_inner(
            {slot: getattr(obj, slot, None) for slot in obj.__slots__}
        )

    This avoids the redundant depth check and gives __slots__ objects the same effective depth limit as __dict__ objects.

Reviews (1): Last reviewed commit: "fix: add depth limit to EventSerializer ..."

Greptile also left 1 inline comment on this PR.


@claude claude bot left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@CLAassistant

CLAassistant commented Mar 26, 2026

CLA assistant check
All committers have signed the CLA.

BKDDFS force-pushed the fix/serializer-depth-limit branch from a895aed to 087052e on March 26, 2026 14:23
…bjects

EventSerializer.default() recursively traverses __dict__ and __slots__
of arbitrary objects without a depth limit. When @observe() captures
function arguments containing objects like google.genai.Client (which
hold aiohttp sessions, connection pools, and threading locks),
json.dumps blocks indefinitely on the second invocation.

Add a _MAX_DEPTH=20 counter that returns a <TypeName> placeholder
when exceeded, preventing infinite recursion into complex object graphs
while preserving all existing serialization behavior.
BKDDFS force-pushed the fix/serializer-depth-limit branch from 087052e to 43114a6 on March 26, 2026 14:29