
Conversation

@shellmayr (Member) commented Jan 15, 2026

Add instrumentation for the claude-agent-sdk package, which provides
a Python interface for interacting with the Claude Code CLI.

The integration captures (see the sketch after the list):

  • gen_ai.system: "claude-agent-sdk-python"
  • gen_ai.operation.name: "chat"
  • gen_ai.request.model / gen_ai.response.model
  • gen_ai.request.messages (when PII enabled)
  • gen_ai.response.text (when PII enabled)
  • gen_ai.request.available_tools
  • gen_ai.response.tool_calls
  • gen_ai.usage.input_tokens / output_tokens / total_tokens
  • gen_ai.usage.input_tokens.cached
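
For illustration, this is roughly the data a finished query leaves on the chat span. The attribute keys are the ones listed above; the explicit start_span/set_data calls and the values are only a sketch, since the integration records this data internally:

import sentry_sdk

# Illustration only: the integration creates the gen_ai.chat span and sets
# this data itself; the values here are made up.
with sentry_sdk.start_span(op="gen_ai.chat", name="claude-agent-sdk query claude-sonnet-4-5") as span:
    span.set_data("gen_ai.system", "claude-agent-sdk-python")
    span.set_data("gen_ai.operation.name", "chat")
    span.set_data("gen_ai.request.model", "claude-sonnet-4-5")
    span.set_data("gen_ai.response.model", "claude-sonnet-4-5")
    span.set_data("gen_ai.usage.input_tokens", 1200)
    span.set_data("gen_ai.usage.output_tokens", 350)
    span.set_data("gen_ai.usage.total_tokens", 1550)
    span.set_data("gen_ai.usage.input_tokens.cached", 800)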

Instrumented methods (see the usage sketch below):

  • query() async generator function for one-shot queries
  • ClaudeSDKClient.query() and receive_response() for interactive sessions
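
A minimal usage sketch, assuming the integration is importable as sentry_sdk.integrations.claude_agent_sdk.ClaudeAgentSDKIntegration and that prompt/response capture is gated on send_default_pii; the module path, option names, and model string are assumptions, not confirmed by this PR:

import asyncio

import sentry_sdk
from claude_agent_sdk import ClaudeAgentOptions, query
from sentry_sdk.integrations.claude_agent_sdk import ClaudeAgentSDKIntegration  # assumed path

sentry_sdk.init(
    dsn="...",
    traces_sample_rate=1.0,
    send_default_pii=True,  # assumed switch for gen_ai.request.messages / gen_ai.response.text
    integrations=[ClaudeAgentSDKIntegration()],
)

async def main():
    # One-shot query(): the wrapper collects each yielded message for the span output data.
    options = ClaudeAgentOptions(model="claude-sonnet-4-5")
    async for message in query(prompt="Summarize this repo", options=options):
        print(message)

asyncio.run(main())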

Closes TET-1743
Co-Authored-By: Claude Sonnet 4.5 [email protected]

github-actions bot (Contributor) commented Jan 15, 2026

Semver Impact of This PR

🟡 Minor (new features)

📋 Changelog Preview

This is how your changes will appear in the changelog.
Entries from this PR are highlighted with a left border (blockquote style).


New Features ✨

Span Streaming

  • feat(span-streaming): Add spans to telemetry pipeline, add span name and attributes (3) by sentrivana in #5399
  • feat(span-streaming): Add span batcher (2) by sentrivana in #5398
  • feat(span-streaming): Add experimental trace_lifecycle switch (1) by sentrivana in #5397

Other

  • feat(ai): add claude code agents sdk integration by shellmayr in #5316
  • feat(integration): add gen_ai.conversation.id if available by constantinius in #5307

Bug Fixes 🐛

Google Genai

  • fix(google-genai): Token reporting by alexander-alderman-webb in #5404
  • fix(google-genai): deactivate google genai when langchain is used by shellmayr in #5389

Span Streaming

  • fix(span-streaming): Always preserialize attributes by sentrivana in #5407
  • fix(span-streaming): Warn about thread usage if any batcher is active by sentrivana in #5408

Other

  • fix(anthropic): Token reporting by alexander-alderman-webb in #5403
  • fix(arq): handle settings_cls passed as keyword argument by nc9 in #5393
  • fix(dramatiq): cleanup isolated scope and transaction when message is skipped by frankie567 in #5346
  • fix(openai): Token reporting by alexander-alderman-webb in #5406
  • fix(openai-agents): Inject propagation headers for HostedMCPTool when streaming by alexander-alderman-webb in #5405
  • fix: Adapt to new packaging in toxgen by sentrivana in #5382

Internal Changes 🔧

Fastmcp

  • test(fastmcp): Wrap prompt in Message by alexander-alderman-webb in #5411
  • test(fastmcp): Remove test_fastmcp_without_request_context() by alexander-alderman-webb in #5412
  • test(fastmcp): Use AsyncClient for SSE by alexander-alderman-webb in #5400
  • test(fastmcp): Use TestClient for Streamable HTTP by alexander-alderman-webb in #5384
  • test(fastmcp): Simulate stdio transport with memory streams by alexander-alderman-webb in #5333

Mcp

  • test(mcp): Use AsyncClient for SSE by alexander-alderman-webb in #5396
  • test(mcp): Use TestClient for Streamable HTTP by alexander-alderman-webb in #5383
  • test(mcp): Remove unused stdio helpers by alexander-alderman-webb in #5409
  • test(mcp): Simulate stdio transport with memory streams by alexander-alderman-webb in #5329

Other

  • ci: 🤖 Update test matrix with new releases (02/02) by github-actions in #5413
  • ci: Update tox and pin packaging version for tox by alexander-alderman-webb in #5381
  • ci: migration to the new codecov action by MathurAditya724 in #5392

🤖 This preview updates automatically when you update the PR.

@constantinius (Contributor) commented:

Excited about this!
Additional things we could capture:


linear bot commented Jan 23, 2026

Comment on lines 372 to 401
chat_span = get_start_span_function()(
    op=OP.GEN_AI_CHAT,
    name=f"claude-agent-sdk query {model}".strip(),
    origin=ClaudeAgentSDKIntegration.origin,
)
chat_span.__enter__()

with capture_internal_exceptions():
    _set_span_input_data(chat_span, prompt, options, integration)

collected_messages = []
try:
    async for message in original_func(
        prompt=prompt, options=options, **kwargs
    ):
        collected_messages.append(message)
        yield message
except Exception as exc:
    _capture_exception(exc)
    raise
finally:
    with capture_internal_exceptions():
        _set_span_output_data(chat_span, collected_messages, integration)
    chat_span.__exit__(None, None, None)

    with capture_internal_exceptions():
        _process_tool_executions(collected_messages, integration)

    with capture_internal_exceptions():
        _end_invoke_agent_span(invoke_span, collected_messages, integration)
Contributor

does your agent know how to use a context manager 😄

Suggested change

chat_span = get_start_span_function()(
    op=OP.GEN_AI_CHAT,
    name=f"claude-agent-sdk query {model}".strip(),
    origin=ClaudeAgentSDKIntegration.origin,
)
with chat_span as span:
    with capture_internal_exceptions():
        _set_span_input_data(span, prompt, options, integration)
    collected_messages = []
    try:
        async for message in original_func(
            prompt=prompt, options=options, **kwargs
        ):
            collected_messages.append(message)
            yield message
    except Exception as exc:
        _capture_exception(exc)
        raise
    finally:
        with capture_internal_exceptions():
            _set_span_output_data(span, collected_messages, integration)
        with capture_internal_exceptions():
            _process_tool_executions(collected_messages, integration)
        with capture_internal_exceptions():
            _end_invoke_agent_span(invoke_span, collected_messages, integration)

model = getattr(options, "model", "") if options else ""
invoke_span = _start_invoke_agent_span(prompt, options, integration)

chat_span = get_start_span_function()(
Contributor

There's an invoke_agent span active here, why not just always start a span?

Comment on lines 439 to 445
messages = self._sentry_query_context.get("messages", [])
with capture_internal_exceptions():
    _set_span_output_data(chat_span, messages, integration)
chat_span.__exit__(None, None, None)
with capture_internal_exceptions():
    _end_invoke_agent_span(invoke_span, messages, integration)
self._sentry_query_context = {}
Contributor

Exiting spans has caused many uncaught exceptions already 😄

Suggested change

messages = self._sentry_query_context.get("messages", [])
with capture_internal_exceptions():
    _set_span_output_data(chat_span, messages, integration)
    chat_span.__exit__(None, None, None)
    _end_invoke_agent_span(invoke_span, messages, integration)
self._sentry_query_context = {}

tool_uses = {}
tool_results = {}

for message in messages:
Contributor

narrower than Anthropic message types, so this looks suspicious.


def _extract_message_data(messages: list) -> dict:
    """Extract relevant data from a list of messages."""
    data = {
Contributor

Use a class instead of a dictionary for passing info between functions; it gives stronger typing.
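
For instance, a small dataclass could replace the dict (a sketch of this suggestion; the field names are illustrative, not taken from the PR):

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ExtractedMessageData:
    # Illustrative fields only; the real set should mirror the keys of the current dict.
    response_model: Optional[str] = None
    response_texts: list = field(default_factory=list)
    tool_calls: list = field(default_factory=list)
    input_tokens: int = 0
    output_tokens: int = 0
    cached_input_tokens: int = 0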

)

if _should_include_prompts(integration):
    messages = []
Contributor

System prompts go in their own attribute.
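
Roughly: keep the system prompt out of gen_ai.request.messages and record it on a dedicated attribute. In the sketch below, the attribute key and options.system_prompt are placeholders for illustration, not names confirmed by this PR:

# Sketch only: "gen_ai.request.system_instructions" and options.system_prompt
# are placeholder names.
system_prompt = getattr(options, "system_prompt", None)
if system_prompt:
    span.set_data("gen_ai.request.system_instructions", system_prompt)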

async for message in original_func(
    prompt=prompt, options=options, **kwargs
):
    collected_messages.append(message)
Contributor

You're collecting user-held references here, which could lead to race conditions.

Comment on lines +99 to +102
def test_extract_text_returns_none_for_non_assistant():
    result = make_result_message(usage=None)
    assert _extract_text_from_message(result) is None

Contributor

We don't need all these unit tests; they create more work in the long run.

Instead, call the userspace API and parameterize over as many input schemas as possible to achieve good test coverage.
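
Something along these lines (a sketch of that approach; sentry_init and capture_events are the usual sentry-python test fixtures, while mock_claude_messages and the message factory helpers other than make_result_message are hypothetical):

import pytest

# Parameterize over message shapes and drive the public query() path instead of
# unit-testing internal helpers. The fixtures and factories below are placeholders.
@pytest.mark.parametrize(
    "messages",
    [
        [],                                                           # no output at all
        [make_assistant_message("hi")],                               # plain text response
        [make_tool_use_message(), make_result_message(usage=None)],   # tool call + result
    ],
)
@pytest.mark.asyncio
async def test_query_emits_chat_span(sentry_init, capture_events, mock_claude_messages, messages):
    sentry_init(integrations=[ClaudeAgentSDKIntegration()], traces_sample_rate=1.0)
    events = capture_events()
    mock_claude_messages(messages)

    async for _ in query(prompt="hello"):
        pass

    (event,) = events
    assert any(span["op"] == "gen_ai.chat" for span in event["spans"])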
