Skip to content

Ensure exceptions are logged through standard logging mechanism#645

Open
RBrid wants to merge 6 commits into
openclaw:masterfrom
RBrid:user/rbrid/ExceptionsLogging1
Open

Ensure exceptions are logged through standard logging mechanism#645
RBrid wants to merge 6 commits into
openclaw:masterfrom
RBrid:user/rbrid/ExceptionsLogging1

Conversation

@RBrid
Copy link
Copy Markdown
Contributor

@RBrid RBrid commented Jun 2, 2026

Audit and fix silent exception swallows across the OpenClaw Windows Companion app, and sanitize sensitive details from tray logs. Bare catch (Exception) { } and similar patterns now log via IOpenClawLogger, OpenClawTray Services.Logger, or System.Diagnostics.Trace (for self-referential/static contexts) before swallowing or rethrowing. Log messages include type-name prefixes (e.g. DataModelStore:, SettingsPage:, ChatWindow:) for greppability.

Consolidates two related efforts:

  • This PR (RBrid: exception logging audit + log-prefix conventions).
  • Log previously swallowed exceptions #671 by Christine Yan (tray log sanitization). Merged in as commit 0429f49a with co-authorship preserved.

Exception logging

Patterns applied:

  • Typed catches with appropriate severity (Error for real bugs, Warn for user-visible degradation, Debug for routine teardown/shutdown/cleanup/dispatcher breadcrumbs).
  • Atomic-write catches (McpAuthToken, DeviceIdentity) capture the original exception, Trace, attempt cleanup with its own typed+Traced catch, then rethrow.
  • OCE/ODE during shutdown narrowed to Debug to avoid noise.
  • Logger.cs self-referential catches use Trace.WriteLine to avoid recursion.
  • App.xaml.cs OnProcessExit/LogCrash use belt-and-suspenders Trace + Console.Error with documented nothing-left-to-call innermost guards.
  • Type-name log prefixes added/restored across audited sites for consistency with existing conventions.

Log sanitization (from #671)

Centralized in OpenClaw.Shared.TokenSanitizer with a mirror in CommandCenterTextHelper for tray support context:

  • Bearer tokens, API keys, and credential-shaped substrings redacted.
  • IPv4 and IPv6 addresses (including bracketed form and RFC 6874 zone identifiers like eth0.1, br-1234, wlan_0) redacted to <ipv6> / <ipv4>.
  • Regex evaluation is timeout-guarded (1s); on timeout, output is replaced with a public SanitizerTimeoutSentinel constant so callers fail safe rather than leak.
  • Outer try/catch on all sanitizer entry points; sanitizer never throws.
  • IpV6Pattern / RedactIfValidIpV6 exposed via InternalsVisibleTo so the tray helper reuses the same canonical regex.

Review

Iterated through six rounds of dual-model adversarial code review (Opus + Codex via the Hanselman skill). All HIGH-consensus findings addressed; final round converged on a single LOW finding both reviewers explicitly recommended accepting as the necessary trade-off for VLAN-style zone identifiers.

Validation

  • dotnet test OpenClaw.Shared.Tests — 2083 passed / 29 skipped (3 new sanitizer theory rows for interface name varieties).
  • dotnet test OpenClaw.Tray.Tests — 938 passed.
  • TokenSanitizer suite — 68 passed.

christineyan4 and others added 2 commits June 2, 2026 17:37
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Audit and fix silent exception swallows across the OpenClaw Windows Companion app. Bare `catch (Exception) { }` and similar patterns now log via IOpenClawLogger, OpenClawTray Services.Logger, or System.Diagnostics.Trace (for self-referential/static contexts) before swallowing or rethrowing.

Patterns applied:

- Typed catches with appropriate severity (Error for real bugs, Warn for user-visible degradation, Debug for routine teardown/shutdown/cleanup/dispatcher breadcrumbs).

- Atomic-write catches (McpAuthToken, DeviceIdentity) capture the original exception, Trace, attempt cleanup with its own typed+Traced catch, then rethrow.

- OCE/ODE during shutdown narrowed to Debug to avoid noise.

- Logger.cs self-referential catches use Trace.WriteLine to avoid recursion.

- App.xaml.cs OnProcessExit/LogCrash use belt-and-suspenders Trace + Console.Error with documented `nothing-left-to-call` innermost guards.

Validated with per-project dotnet build (Tray.WinUI x64 clean), Shared.Tests (2045 passed / 29 skipped), Tray.Tests (934 passed).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@clawsweeper
Copy link
Copy Markdown

clawsweeper Bot commented Jun 2, 2026

Codex review: needs real behavior proof before merge. Reviewed June 4, 2026, 7:18 PM ET / 23:18 UTC.

Summary
The PR adds Logger/Trace diagnostics around swallowed exceptions across connection, setup, shared, MCP/MXC, tray, chat, voice, and window code while extending tray log sanitization and sanitizer tests.

Reproducibility: yes. for the blocking findings from source inspection: current main logs failed setup steps at Error, while PR head makes non-exception failures Warn-only; the IPv6 and debug-bundle paths are also source-reproducible. I did not run validation because this is a read-only review.

Review metrics: 3 noteworthy metrics.

  • Diff surface: 57 files, +1209/-244. The broad runtime logging and sanitizer surface raises the bar for representative proof and validation before merge.
  • Setup severity hunk: 1 Error-to-Warn change. This is the reviewed hunk that changes existing diagnostic severity rather than only adding missing logging.
  • Required validation reported: 2 test commands, 0 build command in this PR body. AGENTS.md requires ./build.ps1 plus shared and tray tests before completion.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🦪 silver shellfish
Patch quality: 🧂 unranked krab
Result: blocked until stronger real behavior proof is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • [P1] Add redacted final-head runtime proof showing a representative formerly swallowed exception reaching Logger or Trace.
  • Restore Error-level setup failure visibility for non-exception StepResult.Fail outcomes.
  • [P1] Sanitize debug-bundle read-failure exception text and fix the bracketed IPv6 redaction/test mismatch.

Proof guidance:

  • [P1] Needs stronger real behavior proof before merge: The linked related PR contains copied gateway log output, but this combined final head still needs redacted runtime log, terminal output, recording, or similar proof for a representative formerly swallowed exception; redact IPs, API keys, phone numbers, non-public endpoints, and other private details before updating the PR body or asking for @clawsweeper re-review. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Risk before merge

  • [P1] Non-exception setup failures would lose their existing Error-level summary and become Warn-only, which makes failed setup runs less visible in logs and console output.
  • [P1] The debug-bundle log-read failure path can leak unsanitized exception messages even though this PR is meant to sanitize copied support diagnostics.
  • [P1] The IPv6 sanitizer helper appears inconsistent with its own added bracketed-address test, so the shared test suite should fail or the expected URI shape is wrong.
  • [P1] Related proof from Log previously swallowed exceptions #671 covers a gateway smoke path, but this PR still lacks final-head proof for the broader exception logging audit.
  • [P1] The PR body reports shared/tray tests but not the full AGENTS.md validation command set, including ./build.ps1.

Maintainer options:

  1. Repair privacy and severity blockers (recommended)
    Fix the unsanitized debug-bundle exception text, preserve Error-level setup failure summaries, and align bracketed IPv6 redaction with the added tests before merge.
  2. Accept the severity downgrade explicitly
    Maintainers could choose Warn-only summaries for non-exception setup failures, but that should be an explicit logging contract decision with updated coverage.
  3. Pause until final-head proof exists
    Keep the PR open but do not merge until the contributor adds redacted runtime proof for a representative swallowed exception on this combined branch.

Next step before merge

  • [P1] Contributor-supplied final-head runtime proof and code repairs are needed before normal maintainer review; the proof gate makes this unsuitable for ClawSweeper repair automation right now.

Security
Needs attention: The PR improves log redaction overall, but the debug-bundle log-read failure path can still emit unsanitized exception text into support output.

Review findings

  • [P1] Keep failed setup steps at Error level — src/OpenClaw.SetupEngine/SetupPipeline.cs:163
  • [P2] Preserve brackets when redacting bracketed IPv6 — src/OpenClaw.Shared/TokenSanitizer.cs:167
  • [P1] Sanitize debug-bundle read-failure messages — src/OpenClaw.Tray.WinUI/Helpers/CommandCenterTextHelper.cs:359-363
Review details

Best possible solution:

Keep the diagnostics cleanup direction, but preserve setup Error visibility, make support-bundle sanitizer paths fail closed, align the IPv6 implementation with its tests, then provide final-head runtime proof and AGENTS.md validation.

Do we have a high-confidence way to reproduce the issue?

Yes for the blocking findings from source inspection: current main logs failed setup steps at Error, while PR head makes non-exception failures Warn-only; the IPv6 and debug-bundle paths are also source-reproducible. I did not run validation because this is a read-only review.

Is this the best way to solve the issue?

No as written; centralizing exception detail is maintainable, but it should not downgrade failed setup outcomes or bypass the sanitizer guarantees this PR is adding.

Full review comments:

  • [P1] Keep failed setup steps at Error level — src/OpenClaw.SetupEngine/SetupPipeline.cs:163
    This changes the pipeline failure summary from Error to Warn, but the new StepCompleted Error entry only fires when result.Error is set. Many setup steps return StepResult.Fail(...) for command exits, preflight failures, gateway states, and validation failures without an exception, so those failures would no longer produce any Error-level log entry.
    Confidence: 0.9
  • [P2] Preserve brackets when redacting bracketed IPv6 — src/OpenClaw.Shared/TokenSanitizer.cs:167
    IpV6Pattern matches bracketed literals including the brackets, but RedactIfValidIpV6 always returns only <ipv6>. That makes the added bracketed endpoint test expecting [<ipv6>]:8443 fail and also changes the URI-like shape of raw bracketed addresses.
    Confidence: 0.88
  • [P1] Sanitize debug-bundle read-failure messages — src/OpenClaw.Tray.WinUI/Helpers/CommandCenterTextHelper.cs:359-363
    The log tail builder redacts logPath but appends ex.Message directly for IOException and UnauthorizedAccessException. Those exception messages commonly include the original path, so a copied debug bundle can leak the user profile path despite this PR's sanitization goal.
    Confidence: 0.84

Overall correctness: patch is incorrect
Overall confidence: 0.88

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 99efc50cbc22.

Label changes

Label changes:

  • add merge-risk: 🚨 security-boundary: The debug-bundle failure path can expose unsanitized exception text in a support artifact intended to redact sensitive local details.

Label justifications:

  • P2: This is a normal-priority diagnostics and sanitization improvement with limited direct user impact but broad runtime touch points.
  • merge-risk: 🚨 security-boundary: The debug-bundle failure path can expose unsanitized exception text in a support artifact intended to redact sensitive local details.
  • merge-risk: 🚨 other: Merging as-is could reduce setup failure visibility by downgrading non-exception failed setup steps from Error to Warn.
  • rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🦪 silver shellfish and patch quality is 🧂 unranked krab.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs stronger real behavior proof before merge: The linked related PR contains copied gateway log output, but this combined final head still needs redacted runtime log, terminal output, recording, or similar proof for a representative formerly swallowed exception; redact IPs, API keys, phone numbers, non-public endpoints, and other private details before updating the PR body or asking for @clawsweeper re-review. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.
Evidence reviewed

Security concerns:

  • [medium] Unsanitized exception text in debug bundle — src/OpenClaw.Tray.WinUI/Helpers/CommandCenterTextHelper.cs:359
    BuildRecentTrayLogTail returns ex.Message directly after a failed tray-log read; file access exceptions can include the unredacted path, so the support bundle may leak local user details.
    Confidence: 0.84

What I checked:

Likely related people:

  • Scott Hanselman: Commit 281656f contains the setup pipeline/logger surface and the Error-level failed-step summary affected by this PR. (role: introduced setup logging surface; confidence: high; commits: 281656fc017e; files: src/OpenClaw.SetupEngine/SetupPipeline.cs, src/OpenClaw.SetupEngine/SetupLogger.cs)
  • christineyan4: Current-main blame/log history for setup, sanitizer, and diagnostics helpers runs through commit 85445c7, and the related open sanitization PR contributed commit 0429f49 now included here. (role: recent area contributor and related implementation author; confidence: high; commits: 85445c78066b, 0429f49af86f; files: src/OpenClaw.SetupEngine/SetupPipeline.cs, src/OpenClaw.Shared/TokenSanitizer.cs, src/OpenClaw.Tray.WinUI/Helpers/CommandCenterTextHelper.cs)
  • RBrid: Prior merged history shows work on the companion application and CommandCenterTextHelper area also touched by this diagnostics cleanup, so the PR author has relevant current-main area history beyond this proposal. (role: adjacent feature contributor; confidence: medium; commits: f2fa038bd080; files: src/OpenClaw.Tray.WinUI/Helpers/CommandCenterTextHelper.cs)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels Jun 2, 2026
@RBrid RBrid marked this pull request as ready for review June 2, 2026 23:06
@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. P2 Normal priority bug or improvement with limited blast radius. and removed rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. labels Jun 2, 2026
christineyan4 and others added 3 commits June 3, 2026 12:47
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ging1

# Conflicts:
#	src/OpenClaw.Tray.WinUI/Services/VoiceService.cs
@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. merge-risk: 🚨 other 🚨 Merging this PR has meaningful risk outside the owned taxonomy. and removed rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. labels Jun 4, 2026
Combines the exception-logging effort from PR openclaw#645 (this branch) with
Christine's parallel PR (commit 0429f49) into a unified solution.

From both PRs (kept the stronger parts of each):
* TokenSanitizer: extended detection to cover IPv6 addresses (compressed,
  full-form, IPv4-mapped, dual notation), email addresses, user@host,
  Windows user-profile paths, gateway tokens/credentials. Per-message
  redaction wired into the tray Logger and shared diagnostics paths.
* CommandCenterTextHelper: redacts support-context paths and values for
  diagnostics bundles using the same patterns as TokenSanitizer.
* Type-prefixed log lines reinstated across DataModelStore, SettingsPage,
  ChatWindow, ConnectionPage, NodeService, AppCrashLogger, DeepLinkHandler,
  AudioPipeline, DiagnosticsJsonlService, Logger, ThemeHelper,
  LocalizationHelper, OpenClawChatRoot, ChatExplorationsPanelWindow,
  WebSocketClientBase, WindowsNodeClient, App.xaml.cs, App.ToastActivation,
  and the SetupEngine pipeline. Format: "TypeName: message".

Hardening applied during review iterations:
* IPv6 regex uses negative lookahead (?![A-Fa-f0-9:]|\.\d) so sentence
  punctuation (e.g. "Server at fe80::1.") still redacts while invalid
  partial matches (e.g. a::ffff:192.0.2.1b) do not leak.
* Bracketed-IPv6 alternative uses a structural lookahead (:: or 5+
  colons) plus IPAddress.TryParse validation so non-IPv6 bracketed
  colon-soup is left intact rather than incorrectly redacted as <ipv6>.
* Zone-id capture covers RFC 6874 unreserved chars [-A-Za-z0-9._~]+
  so real interface names (br-1234, eth0.1, wlan_0) are fully consumed
  instead of leaking the suffix after <ipv6>.
* RedactIfValidIpV6 strips %scope before IPAddress.TryParse so textual
  and numeric zone-ids redact consistently.
* Sanitize and SanitizeLogMessage wrap every regex pass in try/catch
  with a 1s RegexMatchTimeout, returning a stable public sentinel
  TokenSanitizer.SanitizerTimeoutSentinel ("[REDACTED_SANITIZER_TIMEOUT]")
  on timeout so adversarial input never tears down the logging or crash
  pipelines. SanitizeLogMessage short-circuits if any pass yields the
  sentinel so subsequent passes cannot re-introduce sensitive data.
* CommandCenterTextHelper.RedactSupportPath / RedactSupportValue mirror
  the same try/catch + sentinel pattern, and share TokenSanitizer's
  IpV6Pattern + RedactIfValidIpV6 via [InternalsVisibleTo] to keep the
  two surfaces in lockstep.
* New TokenSanitizerTests cover invalid bracketed forms, partial-leak
  resistance, sentence-punctuation redaction, RFC 6874 zone-ids,
  sentinel contract, and adversarial input stability.

Validation: Shared.Tests 2083 passed / 29 skipped, Tray.Tests 938 passed.

Co-authored-by: Christine Yan <chrisyan@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@clawsweeper clawsweeper Bot added the merge-risk: 🚨 security-boundary 🚨 Merging this PR could weaken sandboxing, authorization, credentials, or sensitive data. label Jun 4, 2026
@kmahone
Copy link
Copy Markdown
Collaborator

kmahone commented Jun 5, 2026

@RBrid please take a look at these items from clawsweeper:

[P1] Non-exception setup failures would lose their existing Error-level summary and become Warn-only, which makes failed setup runs less visible in logs and console output.
[P1] The debug-bundle log-read failure path can leak unsanitized exception messages even though this PR is meant to sanitize copied support diagnostics.
[P1] The IPv6 sanitizer helper appears inconsistent with its own added bracketed-address test, so the shared test suite should fail or the expected URI shape is wrong.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merge-risk: 🚨 other 🚨 Merging this PR has meaningful risk outside the owned taxonomy. merge-risk: 🚨 security-boundary 🚨 Merging this PR could weaken sandboxing, authorization, credentials, or sensitive data. P2 Normal priority bug or improvement with limited blast radius. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants