
Commit b830a24

lievan authored and alyshawang committed
feat(llmobs): track prompt caching for anthropic sdk (#13757)
Tracks the number of tokens read from and written to the prompt cache for Anthropic: https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching

Anthropic returns `cache_creation_input_tokens` and `cache_read_input_tokens` in the `usage` field of its responses. We map these to the `cache_write_input_tokens` and `cache_read_input_tokens` keys in our `metrics` field.

Testing is blocked on DataDog/dd-apm-test-agent#217.

### Implementation note

Right now, we use `get_llmobs_metrics_tags` to set metrics for Anthropic, which depends on `set_metric` and `get_metric`. We do not want to continue this pattern for prompt caching, so we instead extract the cache token counts directly from the `response.usage` field. The caveat is that in the streamed case, the `usage` field is a dictionary that we construct manually while parsing the streamed chunks.

### Follow-ups

1. Move all the unit tests to use the `llmobs_events` fixture
2. Decouple `metrics` parsing from set/get metrics completely

## Checklist

- [x] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist

- [x] Reviewer has checked that all the criteria below are met
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
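The implementation note says the `usage` value may be either an SDK object (non-streamed) or a hand-built dictionary (streamed). A minimal sketch of reading the cache counters from both shapes follows. `get_attr` and `read_cache_usage` are hypothetical stand-ins written for this illustration, not the library's own helpers, and `SimpleNamespace` only mimics an attribute-style usage object.

```python
from types import SimpleNamespace
from typing import Any, Dict


def get_attr(obj: Any, name: str, default: Any = None) -> Any:
    """Hypothetical helper: read `name` from a dict key or an object attribute."""
    if isinstance(obj, dict):
        return obj.get(name, default)
    return getattr(obj, name, default)


def read_cache_usage(usage: Any) -> Dict[str, int]:
    """Return only the cache counters that are actually present on `usage`."""
    counters: Dict[str, int] = {}
    for field in ("cache_creation_input_tokens", "cache_read_input_tokens"):
        value = get_attr(usage, field, None)
        if value is not None:
            counters[field] = value
    return counters


# Non-streamed case (assumed shape): an object exposing counters as attributes.
object_usage = SimpleNamespace(input_tokens=12, output_tokens=30, cache_read_input_tokens=2048)
# Streamed case (assumed shape): a plain dict assembled while parsing chunks.
dict_usage = {"input_tokens": 12, "output_tokens": 30, "cache_creation_input_tokens": 2048}

print(read_cache_usage(object_usage))
print(read_cache_usage(dict_usage))
```

Defaulting to `None` instead of `0` keeps "field absent" distinguishable from "zero tokens cached", which matches the `is not None` checks in the diff.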
1 parent 5d56f69 commit b830a24

8 files changed, +1226 -1 lines changed

ddtrace/contrib/internal/anthropic/_streaming.py

Lines changed: 16 additions & 0 deletions
@@ -198,6 +198,14 @@ def _on_message_start_chunk(chunk, message):
             message["role"] = chunk_role
         if chunk_usage:
             message["usage"] = {"input_tokens": _get_attr(chunk_usage, "input_tokens", 0)}
+
+            cache_write_tokens = _get_attr(chunk_usage, "cache_creation_input_tokens", None)
+            cache_read_tokens = _get_attr(chunk_usage, "cache_read_input_tokens", None)
+            if cache_write_tokens is not None:
+                message["usage"]["cache_creation_input_tokens"] = cache_write_tokens
+            if cache_read_tokens is not None:
+                message["usage"]["cache_read_input_tokens"] = cache_read_tokens
+
     return message


@@ -250,6 +258,14 @@ def _on_message_delta_chunk(chunk, message):
     if chunk_usage:
         message_usage = message.get("usage", {"output_tokens": 0, "input_tokens": 0})
         message_usage["output_tokens"] = _get_attr(chunk_usage, "output_tokens", 0)
+
+        cache_creation_tokens = _get_attr(chunk_usage, "cache_creation_input_tokens", None)
+        cache_read_tokens = _get_attr(chunk_usage, "cache_read_input_tokens", None)
+        if cache_creation_tokens is not None:
+            message_usage["cache_creation_input_tokens"] = cache_creation_tokens
+        if cache_read_tokens is not None:
+            message_usage["cache_read_input_tokens"] = cache_read_tokens
+
         message["usage"] = message_usage

     return message
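
To illustrate how the two handlers cooperate, here is a rough, self-contained sketch of accumulating a `usage` dictionary across a `message_start` chunk and a `message_delta` chunk. The function names, the `copy_cache_fields` helper, and the dictionary-shaped chunks are simplifications assumed for this example; the real handlers use `_get_attr` and receive SDK chunk objects.

```python
from typing import Any, Dict

CACHE_FIELDS = ("cache_creation_input_tokens", "cache_read_input_tokens")


def copy_cache_fields(chunk_usage: Dict[str, Any], message_usage: Dict[str, Any]) -> None:
    """Copy cache counters onto the accumulated usage, skipping absent fields."""
    for field in CACHE_FIELDS:
        value = chunk_usage.get(field)
        if value is not None:
            message_usage[field] = value


def on_message_start(chunk: Dict[str, Any], message: Dict[str, Any]) -> Dict[str, Any]:
    """First chunk: carries the input token count and possibly cache counters."""
    chunk_usage = chunk.get("message", {}).get("usage")
    if chunk_usage:
        message["usage"] = {"input_tokens": chunk_usage.get("input_tokens", 0)}
        copy_cache_fields(chunk_usage, message["usage"])
    return message


def on_message_delta(chunk: Dict[str, Any], message: Dict[str, Any]) -> Dict[str, Any]:
    """Final delta: carries the output token count and possibly cache counters."""
    chunk_usage = chunk.get("usage")
    if chunk_usage:
        message_usage = message.get("usage", {"output_tokens": 0, "input_tokens": 0})
        message_usage["output_tokens"] = chunk_usage.get("output_tokens", 0)
        copy_cache_fields(chunk_usage, message_usage)
        message["usage"] = message_usage
    return message


# Simplified, made-up chunks for illustration only.
start_chunk = {
    "type": "message_start",
    "message": {"role": "assistant", "usage": {"input_tokens": 8, "cache_creation_input_tokens": 2048}},
}
delta_chunk = {"type": "message_delta", "usage": {"output_tokens": 25}}

message: Dict[str, Any] = {}
message = on_message_start(start_chunk, message)
message = on_message_delta(delta_chunk, message)
print(message["usage"])
```

Because the delta handler only overwrites a cache counter when the chunk actually carries one, values recorded at `message_start` survive a delta that omits them.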

ddtrace/llmobs/_integrations/anthropic.py

Lines changed: 8 additions & 0 deletions
@@ -7,6 +7,8 @@
 from typing import Union

 from ddtrace.internal.logger import get_logger
+from ddtrace.llmobs._constants import CACHE_READ_INPUT_TOKENS_METRIC_KEY
+from ddtrace.llmobs._constants import CACHE_WRITE_INPUT_TOKENS_METRIC_KEY
 from ddtrace.llmobs._constants import INPUT_MESSAGES
 from ddtrace.llmobs._constants import INPUT_TOKENS_METRIC_KEY
 from ddtrace.llmobs._constants import METADATA
@@ -184,6 +186,8 @@ def _extract_usage(self, span: Span, usage: Dict[str, Any]):
             return
         input_tokens = _get_attr(usage, "input_tokens", None)
         output_tokens = _get_attr(usage, "output_tokens", None)
+        cache_write_tokens = _get_attr(usage, "cache_creation_input_tokens", None)
+        cache_read_tokens = _get_attr(usage, "cache_read_input_tokens", None)

         metrics = {}
         if input_tokens is not None:
@@ -192,6 +196,10 @@ def _extract_usage(self, span: Span, usage: Dict[str, Any]):
             metrics[OUTPUT_TOKENS_METRIC_KEY] = output_tokens
         if input_tokens is not None and output_tokens is not None:
             metrics[TOTAL_TOKENS_METRIC_KEY] = input_tokens + output_tokens
+        if cache_write_tokens is not None:
+            metrics[CACHE_WRITE_INPUT_TOKENS_METRIC_KEY] = cache_write_tokens
+        if cache_read_tokens is not None:
+            metrics[CACHE_READ_INPUT_TOKENS_METRIC_KEY] = cache_read_tokens
         return metrics

     def _get_base_url(self, **kwargs: Dict[str, Any]) -> Optional[str]:
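
The mapping performed in `_extract_usage` can be sketched as a standalone function. The string values given to the key constants below are assumptions for illustration (the diff only shows the constant names imported from `ddtrace.llmobs._constants`), and `usage_to_metrics` is a hypothetical name.

```python
from typing import Any, Dict, Optional

# Assumed values; the real constants live in ddtrace.llmobs._constants.
INPUT_TOKENS_KEY = "input_tokens"
OUTPUT_TOKENS_KEY = "output_tokens"
TOTAL_TOKENS_KEY = "total_tokens"
CACHE_WRITE_INPUT_TOKENS_KEY = "cache_write_input_tokens"
CACHE_READ_INPUT_TOKENS_KEY = "cache_read_input_tokens"


def usage_to_metrics(usage: Optional[Dict[str, Any]]) -> Dict[str, int]:
    """Translate an Anthropic-style usage dict into a metrics dict."""
    if not usage:
        return {}
    input_tokens = usage.get("input_tokens")
    output_tokens = usage.get("output_tokens")
    cache_write_tokens = usage.get("cache_creation_input_tokens")
    cache_read_tokens = usage.get("cache_read_input_tokens")

    metrics: Dict[str, int] = {}
    if input_tokens is not None:
        metrics[INPUT_TOKENS_KEY] = input_tokens
    if output_tokens is not None:
        metrics[OUTPUT_TOKENS_KEY] = output_tokens
    if input_tokens is not None and output_tokens is not None:
        # As in the diff, the total is input + output only.
        metrics[TOTAL_TOKENS_KEY] = input_tokens + output_tokens
    if cache_write_tokens is not None:
        metrics[CACHE_WRITE_INPUT_TOKENS_KEY] = cache_write_tokens
    if cache_read_tokens is not None:
        metrics[CACHE_READ_INPUT_TOKENS_KEY] = cache_read_tokens
    return metrics


print(usage_to_metrics({"input_tokens": 10, "output_tokens": 5, "cache_read_input_tokens": 2048}))
```

Note that only the provider's `cache_creation_input_tokens` is renamed (to a "write" key); the "read" counter keeps the same name on both sides.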
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+---
+features:
+  - |
+    LLM Observability: This introduces capturing the number of input tokens read and written to the cache for Anthropic prompt caching use cases.
+
Lines changed: 217 additions & 0 deletions
@@ -0,0 +1,217 @@
+interactions:
+- request:
+    body: '{"max_tokens": 100, "messages": [{"role": "user", "content": "What is a
+      system"}], "model": "claude-sonnet-4-20250514", "system": [{"type": "text",
+      "text": "Hardware engineering best practices guide: farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell farewell farewell farewell farewell farewell
+      farewell farewell farewell farewell ", "cache_control": {"type": "ephemeral"}}],
+      "temperature": 0.1}'
+    headers:
+      accept:
+      - application/json
+      accept-encoding:
+      - gzip, deflate
+      anthropic-beta:
+      - prompt-caching-2024-07-31
+      anthropic-version:
+      - '2023-06-01'
+      connection:
+      - keep-alive
+      content-length:
+      - '9480'
+      content-type:
+      - application/json
+      host:
+      - api.anthropic.com
+      user-agent:
+      - Anthropic/Python 0.28.1
+      x-stainless-arch:
+      - arm64
+      x-stainless-async:
+      - 'false'
+      x-stainless-lang:
+      - python
+      x-stainless-os:
+      - MacOS
+      x-stainless-package-version:
+      - 0.28.1
+      x-stainless-runtime:
+      - CPython
+      x-stainless-runtime-version:
+      - 3.10.13
+    method: POST
+    uri: https://api.anthropic.com/v1/messages
+  response:
+    body:
+      string: !!binary |
+        H4sIAAAAAAAAAwAAAP//ZFJdaxtBDPwrQnkoHOf0bOJS7i02KWmhJfSD0DTFqLuyb/Ge9rqrdZya
+        /PdyZ7t12ieBpBnNDNqhs1hjm1aLavz1y5v86nY+e313fTf7MH/PrX03mWGJ+thxv8Up0YqxxBh8
+        36CUXFISxRLbYNljjcZTtjxKQYR1dDGaVJNpNR1fYIkmiLIo1t92R0rlbQ8eSo2XUBTpMSm3RQEu
+        AQmEuCJxv9iCCd6zURcEwhKcKEfTHzE6DNsuCIsm0IYUHkJcg4YVa8MRNACZxvGGgSB1bNzSGehy
+        7EJiCBGWWQbmc7jmyECRQRuGNT+CaSiSUY4uqTOpvpd7OTuDeYgMV57b4WZYAsGnQXl9LyMoivkf
+        QS9vKGoqihreinUbZzN54CNyUNvSmiF3w829/z3JR/bUy0qN6waG+d5w3wISu0+BDo0frA/McpLF
+        nuVm77PHX/51vwrkT62fHIfEccMH9CxksRQdDwJuG9IXCZwYny33AmCTziFkTc7yfwauZONikN5q
+        j/7cMPBWOQp5GL5hq+AEn76XmDR0i8iUgvSvRtuFhjVLwsMo8c/MYhhryd6XmIdXrHfopMt6XK7H
+        4xINmYYXJvIQ3uL5QnWcRyb7z2xSTaclhqzPGKuqxD4RZ3ihjiPW2D+9pWjx6ek3AAAA//8DAOcF
+        +ytCAwAA
+    headers:
+      CF-RAY:
+      - 95986143ad1fde0e-EWR
+      Connection:
+      - keep-alive
+      Content-Encoding:
+      - gzip
+      Content-Type:
+      - application/json
+      Date:
+      - Thu, 03 Jul 2025 18:17:33 GMT
+      Server:
+      - cloudflare
+      Transfer-Encoding:
+      - chunked
+      X-Robots-Tag:
+      - none
+      anthropic-organization-id:
+      - 4257e925-ee99-4ee8-9c62-8e53716d5203
+      anthropic-ratelimit-input-tokens-limit:
+      - '20000000'
+      anthropic-ratelimit-input-tokens-remaining:
+      - '20000000'
+      anthropic-ratelimit-input-tokens-reset:
+      - '2025-07-03T18:17:30Z'
+      anthropic-ratelimit-output-tokens-limit:
+      - '2500000'
+      anthropic-ratelimit-output-tokens-remaining:
+      - '2500000'
+      anthropic-ratelimit-output-tokens-reset:
+      - '2025-07-03T18:17:33Z'
+      anthropic-ratelimit-tokens-limit:
+      - '22500000'
+      anthropic-ratelimit-tokens-remaining:
+      - '22500000'
+      anthropic-ratelimit-tokens-reset:
+      - '2025-07-03T18:17:30Z'
+      cf-cache-status:
+      - DYNAMIC
+      request-id:
+      - req_011CQkYXWEaYLtjyRLVfhAKc
+      strict-transport-security:
+      - max-age=31536000; includeSubDomains; preload
+      via:
+      - 1.1 google
+    status:
+      code: 200
+      message: OK
+version: 1
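
For readers who want to produce a request shaped like the one recorded in this cassette, the following is a small sketch that only builds the JSON body and headers; it makes no network call. `build_cached_request` is a hypothetical helper, and the repeat count of 1024 is an assumption based on counting the filler words in the recorded body.

```python
import json
from typing import Any, Dict


def build_cached_request(system_text: str, question: str) -> Dict[str, Any]:
    """Build a Messages API body whose system block is marked for prompt caching."""
    return {
        "max_tokens": 100,
        "messages": [{"role": "user", "content": question}],
        "model": "claude-sonnet-4-20250514",
        "system": [
            {
                "type": "text",
                "text": system_text,
                # Marks this block as cacheable, as in the recorded request.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "temperature": 0.1,
    }


# Headers copied from the recorded request (API key header omitted on purpose).
HEADERS = {
    "anthropic-version": "2023-06-01",
    "anthropic-beta": "prompt-caching-2024-07-31",
    "content-type": "application/json",
}

# A long, repetitive system prompt; 1024 repeats is an assumed count.
system_text = "Hardware engineering best practices guide: " + "farewell " * 1024
payload = build_cached_request(system_text, "What is a system")
body = json.dumps(payload)
print(len(body))
```

The long filler prompt is presumably there so the system block is large enough to be cached at all; a short prompt would likely yield no cache counters in the response.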
