feat: resolve all gRPC Phase 2 debt — full bridge fidelity + KernelService implementation by bkrabach · Pull Request #36 · microsoft/amplifier-core

bkrabach · 2026-03-05T10:36:48Z

Summary

Resolves all gRPC Phase 2 technical debt — 15 code TODO(grpc-v2) markers eliminated, all 9 KernelService RPCs implemented, full data fidelity across the gRPC bridge layer. Remote cross-language orchestrators can now make callbacks to the kernel.

What's included

Proto Schema Fixes (Layer 1)

Added optional keyword to 5 proto fields: Usage.reasoning_tokens, Usage.cache_read_tokens, Usage.cache_creation_tokens, ApprovalRequest.timeout, HookResult.approval_timeout
Wire-compatible change — old readers unaffected

Bidirectional Conversions (Layer 2 — bulk of the work)

Role enum ↔ proto Role conversion helpers
Message ↔ proto Message with all 7 ContentBlock variants (text, thinking, redacted_thinking, tool_call, tool_result, image, reasoning)
ChatRequest ↔ proto ChatRequest (with ToolSpec, ResponseFormat, ToolChoice)
ChatResponse ↔ proto ChatResponse (with ToolCall, Usage, Degradation)
HookResult native → proto (reverse of existing proto → native)

Bridge Fixes (Layer 3)

GrpcContextBridge: Full message fidelity — role, name, tool_call_id, metadata, BlockContent all preserved. Provider name populated in get_messages_for_request().
GrpcProviderBridge::complete(): No longer a stub — uses ChatRequest/ChatResponse conversions
GrpcOrchestratorBridge: session_id stored at construction and transmitted in requests. 5 discarded params documented as by-design (remote orchestrators use KernelService callbacks).

Session Change (Layer 3)

Session now holds Arc<Coordinator> internally (was owned by value)
New coordinator_shared() method returns Arc<Coordinator> for sharing with KernelService
Existing coordinator() / coordinator_mut() APIs preserved

KernelService RPCs (Layer 4 — all 8 stubs now implemented)

GetCapability / RegisterCapability — capability store read/write
GetMountedModule — module registry query (tools + providers)
AddMessage / GetMessages — context manager access with full Message conversion
EmitHook / EmitHookAndCollect — hook registry with HookResult conversion
CompleteWithProvider — provider lookup + full ChatRequest/ChatResponse roundtrip
CompleteWithProviderStreaming — one-shot stream wrap of provider.complete() (true streaming deferred to Provider trait extension)

Cleanup

Zero TODO(grpc-v2) markers remain in source code
Updated audit design doc to mark all findings as RESOLVED

Design docs

docs/plans/2026-03-04-grpc-v2-debt-fix-design.md
docs/plans/2026-03-04-grpc-v2-debt-fix-implementation.md

Key design decisions

KernelServiceImpl stays per-session (single Arc, not a session registry)
session_id stored on bridge struct at construction (not passed through Orchestrator trait — would be breaking)
Session stores Arc internally — minimal change, existing API preserved
CompleteWithProviderStreaming wraps single complete() as one-shot stream (true streaming requires Provider trait change — separate future PR)
Type-safe message parsing via serde_json::from_value::() (not hand-parsing JSON)

Test Plan

391 total tests passing (366 unit + 12 E2E + 13 doc tests)
cargo clippy clean (0 warnings with -D warnings)
Zero TODO(grpc-v2) markers in source code
Zero Status::unimplemented stubs in KernelService
5 proto optional fields verified
cargo check --features wasm compiles clean
CI workflow runs successfully

Covers all 15 TODO(grpc-v2) markers and 8 stubbed KernelService RPCs. 4-layer approach: proto schema → conversions → bridge fixes → KernelService. Single PR, layered commits.

Make the fields Usage.reasoning_tokens, Usage.cache_read_tokens, Usage.cache_creation_tokens, ApprovalRequest.timeout, and HookResult.approval_timeout optional in the proto schema and generated Rust code. This fixes the Some(0)/None ambiguity where zero values were previously indistinguishable from absent values. - Proto schema: add `optional` keyword to 5 fields - Generated code: fields become Option<i32>/Option<f64> - conversions.rs: use .map() for Option mapping both directions - grpc_approval.rs: remove map_approval_timeout, pass directly - grpc_hook.rs: unwrap_or(300.0) for approval_timeout default - Add usage_some_zero_roundtrips_correctly test - Replace 2 approval timeout tests with 3 optional-semantics tests

…ole_to_native)

…e conversions

Add Message conversion functions to conversions.rs: - Private helpers: native_content_block_to_proto and proto_content_block_to_native handling all 7 ContentBlock variants (Text, Thinking, RedactedThinking, ToolCall, ToolResult, Image, Reasoning) with visibility mapping - Private helpers: native_visibility_to_proto and proto_visibility_to_native - Public: native_message_to_proto mapping role, content, name, tool_call_id, metadata_json - Public: proto_message_to_native returning Result<Message, String> with error on None content, empty-string→None for name/tool_call_id, JSON parse for metadata - 4 tests: message_text_content_roundtrip, message_block_content_text_roundtrip, message_with_tool_call_id_roundtrip, message_none_content_returns_error

…ters

…roto content block converter

… block types

Add native_chat_request_to_proto() and proto_chat_request_to_native() in crates/amplifier-core/src/generated/conversions.rs. Mappings: - messages: Vec<Message> via native_message_to_proto / proto_message_to_native - tools: Option<Vec<ToolSpec>> ↔ Vec<ToolSpecProto> (name, description, parameters_json) - response_format: Option<ResponseFormat> ↔ proto ResponseFormat oneof (Text/Json/JsonSchema) - temperature, top_p, timeout: Option<f64> ↔ f64 (0.0 sentinel = not set) - max_output_tokens: Option<i64> ↔ i32 (0 sentinel = not set, with overflow clamp) - stream: Option<bool> ↔ bool (false sentinel = not set) - conversation_id, model, reasoning_effort, stop: Option ↔ string/repeated (empty = not set) - metadata: Option<HashMap> ↔ metadata_json string - tool_choice: Option<ToolChoice> ↔ String (JSON object strings parsed back to ToolChoice::Object) Tests: 7 roundtrip tests covering minimal, full-fields, tools, all three ResponseFormat variants, ToolChoice::Object, and multiple-message requests. Task 3 of 17.

- Add native_chat_response_to_proto() mapping Vec<ContentBlock> to JSON string, ToolCall arguments to arguments_json, Usage/Degradation via existing From impls, and finish_reason/metadata with empty-string sentinels - Add proto_chat_response_to_native() reversing all mappings - Add 4 roundtrip tests: minimal, full fields, tool_calls, empty content

Add native_hook_result_to_proto() in generated/conversions.rs that maps all 14 HookResult fields from native models to proto types: - action: HookAction enum → proto i32 (all 5 variants) - context_injection_role: ContextInjectionRole → proto i32 (all 3 variants) - approval_default: Allow/Deny → proto Approve/Deny i32 - user_message_level: UserMessageLevel → proto i32 (all 3 variants) - approval_timeout: f64 → Option<f64> (always Some, preserves value) - approval_options: Option<Vec<String>> → Vec<String> (None → empty) - All Option<String> fields → String (None → empty string) - data: Option<HashMap> → data_json String (None → empty) - Bool fields: ephemeral, suppress_output, append_to_last_tool_result (direct) - extensions: dropped (proto has no extensions field) Also expose proto_to_native_hook_result as pub(crate) on GrpcHookBridge to enable roundtrip tests in conversions.rs. Tests added (12 total): - Default field mapping verification - All HookAction variant mappings - All ContextInjectionRole variant mappings - All ApprovalDefault variant mappings (Allow→Approve, Deny→Deny) - All UserMessageLevel variant mappings - String option fields → proto string fields - Bool fields mapping - approval_options Some/None → vec/empty vec - approval_timeout → Some(f64) - data_json Some/None → JSON string/empty - Full roundtrip via GrpcHookBridge::proto_to_native_hook_result

- Replace value_to_proto_message() hand-rolled lossy conversion with serde_json::from_value::<Message>() + native_message_to_proto(), preserving role, name, tool_call_id, metadata, and all ContentBlock variants. Non-parseable values fall back to text-only with a warning. - Replace proto_message_to_value() text-only mapping (BlockContent → Null) with proto_message_to_native() + serde_json::to_value(), producing a fully-typed roundtrip for all ContentBlock variants. - Fix get_messages_for_request() to populate provider_name from provider.name() instead of always sending an empty string. - Remove all TODO(grpc-v2) markers — replaced with proper doc-comments. - Update tests: replace old tests that documented the lossy behaviour (proto_message_to_value_block_content_is_null → not null; proto_message_to_value_text_content_roundtrip → uses typed Message path) and add full-fidelity tests for role/name/tool_call_id preservation and BlockContent round-trip.

…ChatResponse conversion Replace the Phase-2 stub that returned 'not yet implemented' with a working implementation that: 1. Converts the native ChatRequest to proto using native_chat_request_to_proto() 2. Calls self.client.lock().await.complete(proto_request).await via tonic 3. Maps gRPC Status errors to ProviderError::Other { message: "gRPC call failed: ..." } 4. Converts the proto ChatResponse back to native using proto_chat_response_to_native() 5. Returns Ok(native_response) Also adds: - GrpcProviderBridge::new_for_testing() (#[cfg(test)]) to allow constructing the bridge with a pre-built tonic client (lazy channel) without a live server - complete_attempts_grpc_call_not_stub async test that verifies the method no longer returns the stub error message (fails RED before fix, passes GREEN after) Task 7 of 17.

- Add session_id: String field to GrpcOrchestratorBridge struct - Update connect() to accept session_id parameter for callback routing - Populate session_id in OrchestratorExecuteRequest from self.session_id - Replace TODO(grpc-v2) markers with clear doc comments explaining that remote orchestrators access context/providers/tools/hooks/coordinator via KernelService callbacks using session_id routing - Add load_grpc_orchestrator() to transport.rs (parallel to load_grpc_tool) - Update tests: replace old TODO-presence assertions with new tests that verify session_id field declaration, self.session_id usage, and absence of the String::new() placeholder All 329 tests passing, clippy clean.

…rnelService - Change Session.coordinator field from Coordinator to Arc<Coordinator> - Add coordinator_shared() method returning Arc<Coordinator> for sharing with KernelServiceImpl (prerequisite for Tasks 10-15) - Update coordinator() to deref through Arc (Arc<T>: Deref<Target=T>) - Update coordinator_mut() to use Arc::get_mut() with lifecycle docs (panics if called after Arc has been shared — setup-only API) - Add import: use std::sync::Arc in session.rs module scope - Add 3 tests: coordinator_shared_returns_arc_to_same_coordinator, coordinator_and_coordinator_mut_still_work, coordinator_shared_reflects_mounted_modules All 332 tests pass, clippy clean.

- Add proto_chat_request_to_native and native_chat_response_to_proto imports - Implement complete_with_provider: look up provider by name, convert proto ChatRequest to native, call provider.complete(), convert native ChatResponse back to proto - Return Status::not_found when provider is not mounted - Return Status::invalid_argument when request field is missing - Return Status::internal on ProviderError Tests added: - complete_with_provider_returns_response_from_mounted_provider - complete_with_provider_not_found_returns_not_found_status - complete_with_provider_missing_request_returns_invalid_argument - complete_with_provider_records_call_in_provider

…shot wrap)

- Source code was already clean (all markers removed in Tasks 0-15) - Update audit-fix-design.md: mark S-1 through S-4 as RESOLVED with fix details - Confirmed zero TODO(grpc-v2) markers remain in amplifier-core/crates/ All 15+ TODO(grpc-v2) markers and 8 stubbed KernelService RPCs fixed: - Context bridge: full bidirectional conversions for all message fields - Orchestrator bridge: session_id routing, KernelService RPC callbacks - Approval bridge: optional double timeout in proto - Conversions: optional int32 for token counts - All 9 KernelService RPCs implemented 366 tests passing, clippy clean.

bkrabach · 2026-03-07T11:32:48Z

Closing — this work has been merged into the dev/cross-language-sdk integration branch. Will be delivered to main via a single PR when all cross-language SDK work is dogfooted and ready for release.

bkrabach added 27 commits March 4, 2026 22:08

docs: add gRPC Phase 2 debt fix design

bb257a0

Covers all 15 TODO(grpc-v2) markers and 8 stubbed KernelService RPCs. 4-layer approach: proto schema → conversions → bridge fixes → KernelService. Single PR, layered commits.

docs: add gRPC Phase 2 debt fix implementation plan

b04653f

feat: add Role enum conversion helpers (native_role_to_proto, proto_r…

bf5fcd1

…ole_to_native)

style: hoist repeated test imports and add negative edge case for rol…

de8a6bf

…e conversions

style: hoist role conversion imports to module level

156a3ed

style: align logging and reduce verbose paths in content block conver…

5884586

…ters

style: add deserialization warn logging and hoist HashMap import in p…

aca0fb9

…roto content block converter

test: add per-variant ContentBlock roundtrip tests for all 6 non-Text…

a3de5ea

… block types

feat: implement GetCapability and RegisterCapability KernelService RPCs

9c12849

feat: implement GetMountedModule KernelService RPC

ed47bf7

feat: implement AddMessage and GetMessages KernelService RPCs

4eb297b

feat: implement EmitHook and EmitHookAndCollect KernelService RPCs

884461a

feat: implement CompleteWithProviderStreaming KernelService RPC (one-…

6887594

…shot wrap)

ci: split Node.js binding tests into separate CI job

31b7425

revert: restore original CI workflow — node bindings not on this branch

c976080

style: apply cargo fmt to fix formatting drift

98b8952

bkrabach force-pushed the feat/grpc-v2-debt-fix branch from bcb509a to 98b8952 Compare March 5, 2026 10:49

bkrabach mentioned this pull request Mar 6, 2026

feat: Cross-Language SDK — Phases 1-4 (gRPC, WASM Bridges, Module Resolver) #38

Closed

5 tasks

bkrabach closed this Mar 7, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: resolve all gRPC Phase 2 debt — full bridge fidelity + KernelService implementation#36

feat: resolve all gRPC Phase 2 debt — full bridge fidelity + KernelService implementation#36
bkrabach wants to merge 27 commits intomainfrom
feat/grpc-v2-debt-fix

bkrabach commented Mar 5, 2026

Uh oh!

bkrabach commented Mar 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

bkrabach commented Mar 5, 2026

Summary

What's included

Design docs

Key design decisions

Test Plan

Uh oh!

bkrabach commented Mar 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant