# feat(llmobs): track prompt caching for anthropic sdk #13757
## Conversation
**Bootstrap import analysis**

Comparison of import times between this PR and base.

Summary: the average import time from this PR is 275 ± 4 ms; the average import time from base is 281 ± 4 ms; the import time difference between this PR and base is -5.1 ± 0.2 ms.

Import time breakdown: the following import paths have shrunk:
**Benchmarks**

Benchmark execution time: 2025-07-04 19:17:10. Comparing candidate commit 43deda5 in PR branch. Found 0 performance improvements and 1 performance regression. Performance is the same for 546 metrics, 3 unstable metrics.

scenario: `iastaspectsospath-ospathsplitdrive_aspect`
…/dd-trace-py into evan.li/anthropic-prompt-caching
Tracks the number of tokens read from and written to the prompt cache for Anthropic: https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching

Anthropic returns `cache_creation_input_tokens` and `cache_read_input_tokens` in its `usage` field. We map these to the `cache_write_input_tokens` and `cache_read_input_tokens` keys in our `metrics` field (see the sketch below).

Testing is blocked on DataDog/dd-apm-test-agent#217
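A minimal sketch of that mapping, assuming a hypothetical `extract_cache_metrics` helper; the helper name and the `metrics` dict shape below are illustrative, not the actual dd-trace-py internals:

```python
# Illustrative only: helper name and metrics-dict shape are assumptions,
# not the actual dd-trace-py implementation.
def extract_cache_metrics(usage):
    metrics = {}
    # Anthropic reports tokens written to the cache as cache_creation_input_tokens;
    # we surface them under the cache_write_input_tokens key.
    cache_write = getattr(usage, "cache_creation_input_tokens", None)
    if cache_write is not None:
        metrics["cache_write_input_tokens"] = cache_write
    # Tokens read from the cache keep the cache_read_input_tokens name.
    cache_read = getattr(usage, "cache_read_input_tokens", None)
    if cache_read is not None:
        metrics["cache_read_input_tokens"] = cache_read
    return metrics
```

In the streamed case (see the implementation note below), `usage` is a plain dict, so the lookups would use `usage.get(...)` rather than attribute access.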
### Implementation note

Right now we use `get_llmobs_metrics_tags` to set metrics for Anthropic, which depends on `set_metric` and `get_metric`. We do not want to continue this pattern for prompt caching, so we instead extract the values directly from the `response.usage` field. The caveat is that in the streamed case, the `usage` field is a dictionary we construct manually while parsing the streamed chunks, as sketched below.
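A rough sketch of how that dictionary could be assembled, assuming Anthropic's streaming event types (`message_start` carries the input-side usage, including the cache fields; `message_delta` carries the running output token count); the accumulator itself is hypothetical:

```python
# Hypothetical accumulator: the streamed usage is a plain dict we build
# ourselves, so downstream code reads it with .get() rather than getattr().
def accumulate_usage(chunks):
    usage = {}
    for chunk in chunks:
        if chunk.type == "message_start":
            # Input-side usage arrives once, on the message_start event.
            start_usage = chunk.message.usage
            for key in ("input_tokens", "cache_creation_input_tokens", "cache_read_input_tokens"):
                value = getattr(start_usage, key, None)
                if value is not None:
                    usage[key] = value
        elif chunk.type == "message_delta":
            # Output tokens arrive incrementally on message_delta events.
            usage["output_tokens"] = chunk.usage.output_tokens
    return usage
```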
### Follow ups

1. Move all the unit tests to use the `llmobs_events` fixture
2. Fully decouple `metrics` parsing from `set_metric`/`get_metric`

## Checklist

- [x] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist

- [x] Reviewer has checked that all the criteria below are met
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)