feat(voice): surface unrecoverable error message with a spoken fallback by chenghao-mou · Pull Request #6152 · livekit/agents

chenghao-mou · 2026-06-18T13:45:17Z

An unrecoverable error (e.g. out of LiveKit Inference credits) used to leave the agent silently connected but mute. Now AgentSession:

Closes on the first terminal error instead of waiting out max_unrecoverable_errors identical failures.
Speaks unrecoverable_error_message before closing, so the agent isn't silent — a str (synthesized via TTS, or played as-is if it's a file path), an AudioSource / BuiltinAudioClip (played directly, so it survives a dead TTS), or None to disable. Defaults to a generic message on terminal errors.

Inference 429 quota errors are now a typed InferenceQuotaExceededError (livekit.agents.inference) you can catch from the error event, carrying the gateway's quota_type / category / hint. Quota details are never spoken to end users.

Supersedes #6012 (consolidated; original author credited as co-author). Closes #6009.

🤖 Generated with Claude Code

…en fallback An unrecoverable error (e.g. depleted LiveKit Inference credits) used to leave the agent silent: it would join, publish its track, and never speak. AgentSession now surfaces a terminal error on the first occurrence instead of absorbing it under max_unrecoverable_errors, and speaks a configurable `unrecoverable_error_message` (or a generic default) before closing. Inference 429 quota bodies are parsed into a typed InferenceQuotaExceededError (in livekit.agents.inference) carrying the gateway quota_type/category/hint, with retryable/terminal derived from the category. The general `terminal` flag lives on APIError; the inference-specific quota schema stays in the inference plugin. Closes #6009 Co-authored-by: Shawn Feldman <shawn.feldman@livekit.io> Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…cision The session's _on_error was registered as an `error` listener in __init__, so it ran before any user-registered handler and made the close/tolerate decision before a handler could set `ev.error.recoverable = True`. Drop the self-listener and have AgentActivity emit the event (user handlers run) and then call _on_error, restoring the original ordering. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

shawnfeldman

PR Review Summary

Reviewed with the standard agent panel (code-reviewer, silent-failure-hunter, pr-test-analyzer, type-design-analyzer, comment-analyzer, code-simplifier). All findings below were verified against the full post-change files.

Verdict: solid, no blockers. The PR does exactly what it claims — a terminal error now closes on the first occurrence and speaks a fallback before teardown, with the error/close events firing on every path (verified: TTS-failure, no-audio-output, and realtime paths all still emit close). The try/except/finally in _close_on_error is best-effort-before-teardown done right (CancelledError is not swallowed since it's BaseException; the primary error is already logged at error level before the close task is created). Test coverage is strong (33 hermetic tests across both the exception layer and the session layer). No Critical issues.

Important (worth addressing, non-blocking)

terminal is invisible in error reprs. APIError.__repr__ (livekit-agents/livekit/agents/_exceptions.py:57) and APIStatusError.__str__/__repr__ (:95, :107) print retryable but omit the new terminal field — which now decides whether a session dies on turn 1. When someone debugs "why did my session close immediately," repr(err) won't show terminal=True. Fix: add terminal={self.terminal!r} to both reprs and the __str__.
Test gap: the 16s playout timeout is never exercised (agent_session.py:1615). wait_for_playout() is await asyncio.shield(self._done_fut) (voice/speech_handle.py:182), so the asyncio.wait_for(..., timeout=_ERROR_MESSAGE_PLAYOUT_TIMEOUT) cancels only the outer wait; teardown's interrupt(force=True) + drain() in _aclose_impl is what actually reclaims a stuck playout. This is load-bearing safety (a wedged playout must not strand teardown) and is untested. Suggest a test where the audio sink never signals playback-finished, asserting close still completes within a bound. Worth a one-line comment at :1615 noting the timeout is a soft bound, not a hard cap on playout.
Test gap: no "second unrecoverable error arrives while _closing_task is in flight." The if self._closing_task or error.recoverable: return guard (agent_session.py:1550) is the only thing preventing a double-close / double say() during the speak-then-close window (realistic for a depleted gateway firing back-to-back errors). Suggest firing _on_error twice before awaiting and asserting exactly one close event.

Suggestions (nits / polish)

AudioSource's str arm is interpreted differently by its two consumers. AgentSession resolves a str as file-path-if-os.path.isfile-else-TTS-text (agent_session.py:1606), but BackgroundAudioPlayer._normalize_builtin_audio (voice/background_audio.py:207) treats str as a file path only (no TTS). Same type alias (voice/audio_source.py:30), opposite outcomes for "goodbye". code-simplifier confirmed a shared resolver isn't worth it (the contracts genuinely differ), so the cheap fix is a one-line note on the AudioSource alias documenting the divergence.
terminal ⇒ not retryable is enforced only by docstring. APIError.__init__ (_exceptions.py:39) accepts both flags independently, so APIStatusError("x", status_code=429, terminal=True) yields retryable=True, terminal=True (429 is transient). InferenceQuotaExceededError derives both from one boolean and can't self-contradict, but the base explicit-override path can. Consider deriving self.retryable = retryable and not terminal, or asserting in __init__.
Empty-string error message is a silent no-op. unrecoverable_error_message="" is "given" (not NOT_GIVEN/None), so _resolve_error_message (agent_session.py:1581) returns "", which isn't a file → say("", audio=NOT_GIVEN) speaks nothing and suppresses the default. Unlikely, but a one-line doc note (or treating "" like None) would avoid the surprise.
inference/tts.py:739 (the _run/streaming-response path) has the same bodyless create_api_error_from_http as :496 but lacks the explanatory "aiohttp discards the body / typing is future work" comment that :496 carries. Add a back-reference so a future maintainer knows the omission is deliberate.
Minor doc redundancy: the unrecoverable_error_message docstring (agent_session.py:~481) says "str ... or an AudioSource," but AudioSource already includes str. The split is pedagogically useful (str has two sub-behaviors) — optionally clarify with "(str is itself an AudioSource; called out because its behavior depends on whether it's a file path)."
Optional refactor (code-simplifier): the 429 re-parse in inference/llm.py:466 reassigns body = e.body on the except path (the value it already held). Extracting a tiny _response_body(e) helper removes that ambiguity. Behavior-identical; take it or leave it.

Not introduced by this PR (context, no action)

STT/realtime errors are not counted against max_unrecoverable_errors — there's no _stt_error_counts, and _on_error only branches on "llm_error"/"tts_error" (agent_session.py:1556). A non-terminal unrecoverable STT/realtime error closes on first occurrence. This matches the base code; it's orthogonal to this PR. One downstream effect worth being aware of: STT/TTS quota 429s stay untyped (the handshake body is discarded by aiohttp, per the comments at inference/stt.py:884/tts.py:496), so the new "out of credits" spoken-fallback UX is effectively LLM/TTS-only today.

Strengths

_str_or_none coercion of untrusted gateway JSON (inference/_exceptions.py:13) is exemplary — it coerces non-str values to None before the category in _TERMINAL_QUOTA_CATEGORIES frozenset check, so an unhashable category: [...] can't crash membership, and malformed fields can't leak. Pinned by test_quota_error_non_str_body_fields_are_dropped.
The gateway hint is never spoken — terminal errors always fall back to the generic DEFAULT_UNRECOVERABLE_ERROR_MESSAGE. Good privacy hygiene; tested over both a real hint and None.
Terminal-vs-transient split is tested at both layers (exception construction and session behavior), and await_count assertions prove credit-exhaustion hits the endpoint exactly once while rate-limits retry.
Clean type extraction: moving BuiltinAudioClip/AudioSource into voice/audio_source.py breaks the import cycle with no duplicate atexit/ExitStack registration, and the removed background_audio.py imports are genuinely unused.
InferenceQuotaExceededError derives terminal/retryable from a single boolean (can't self-contradict via the body path), with an appropriate Optional-returning from_response factory that keeps create_api_error_from_http provider-agnostic.
Comments are accurate — the comment-analyzer found zero factually-incorrect comments; the ev.error.error attribute path, the terminal-category list, and the emit-before-decide ordering all cross-check against the code.
Backward compatible: terminal defaults to False (existing errors unaffected) and the new params are additive keyword-only.

Reviewed by Claude Code.

devin-ai-integration

Devin Review found 1 new potential issue.

devin-ai-integration · 2026-06-19T15:12:24Z

+    async def _close_on_error(
+        self, error: llm.LLMError | stt.STTError | tts.TTSError | llm.RealtimeModelError
+    ) -> None:
+        # best-effort: speak a fallback message before teardown so the agent isn't silent.
+        # A failing or unavailable source just falls through to close — the error/close
+        # events fire either way.
+        try:
+            message = self._resolve_error_message(error.error)
+            if message is None or self._activity is None or self.output.audio is None:
+                return
+
+            # pre-recorded audio is played as-is, so it survives a dead/exhausted TTS; a
+            # non-file str falls back to TTS synthesis.
+            text = ""
+            audio: NotGivenOr[AsyncIterable[rtc.AudioFrame]] = NOT_GIVEN
+            if isinstance(message, BuiltinAudioClip):
+                audio = audio_frames_from_file(message.path())
+            elif isinstance(message, str):
+                if os.path.isfile(message):
+                    audio = audio_frames_from_file(message)
+                else:
+                    text = message
+            else:
+                audio = message
+
+            handle = self.say(text, audio=audio, allow_interruptions=False, add_to_chat_ctx=False)
+            await asyncio.wait_for(
+                handle.wait_for_playout(), timeout=_ERROR_MESSAGE_PLAYOUT_TIMEOUT
+            )
+        except Exception:
+            logger.warning("failed to play the unrecoverable-error message", exc_info=True)
+        finally:
+            await self._aclose_impl(error=error, reason=CloseReason.ERROR)


🚩 Extended window where _closing is False during error-message playout

In the old code, _on_error directly created _aclose_impl as a task, which set self._closing = True almost immediately (first thing after acquiring the lock at agent_session.py:991). In the new code, _close_on_error first speaks the error message (up to 16s timeout at agent_session.py:1615-1616) before calling _aclose_impl in the finally block. During this window, self._closing remains False. Code that gates on self._closing (e.g. _update_activity at line 1480: if self._closing and new_activity == "start") won't know a close is pending. The _closing_task field IS set, which guards re-entrant _on_error calls, but doesn't guard activity updates. In practice this is unlikely to cause issues because (a) activity updates require async scheduling and the error occurs in a failing component, (b) _aclose_impl still runs in the finally and cleans everything up, and (c) the _started flag prevents double-close. But it's worth being aware of the extended window.

Was this helpful? React with 👍 or 👎 to provide feedback.

chenghao-mou requested a review from a team as a code owner June 18, 2026 13:45

This comment was marked as resolved.

Sign in to view

chenghao-mou force-pushed the cm/agents-issue-6009 branch 2 times, most recently from 4125ee7 to 4be6763 Compare June 18, 2026 14:08

chenghao-mou force-pushed the cm/agents-issue-6009 branch from 4be6763 to 5b029dd Compare June 18, 2026 14:16

chenghao-mou changed the title ~~feat(voice): surface unrecoverable inference quota errors with a spoken fallback~~ feat(voice): surface unrecoverable error message with a spoken fallback Jun 18, 2026

chenghao-mou requested a review from shawnfeldman June 18, 2026 15:15

shawnfeldman reviewed Jun 18, 2026

View reviewed changes

minor refactor and address comments

b494f42

devin-ai-integration Bot reviewed Jun 19, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(voice): surface unrecoverable error message with a spoken fallback#6152

feat(voice): surface unrecoverable error message with a spoken fallback#6152
chenghao-mou wants to merge 3 commits into
mainfrom
cm/agents-issue-6009

chenghao-mou commented Jun 18, 2026 •

edited

Loading

Uh oh!

This comment was marked as resolved.

Uh oh!

shawnfeldman left a comment

Uh oh!

devin-ai-integration Bot left a comment

Uh oh!

devin-ai-integration Bot Jun 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

chenghao-mou commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

This comment was marked as resolved.

Uh oh!

shawnfeldman left a comment

Choose a reason for hiding this comment

PR Review Summary

Important (worth addressing, non-blocking)

Suggestions (nits / polish)

Not introduced by this PR (context, no action)

Strengths

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

chenghao-mou commented Jun 18, 2026 •

edited

Loading