
fix(bedrock): resolve cache strategy for ARN-based application inference profiles#2257

Open
Zelys-DFKH wants to merge 5 commits into strands-agents:main from Zelys-DFKH:fix/cache-strategy-application-inference-profile

Conversation

@Zelys-DFKH
Contributor

Description

CacheConfig(strategy="auto") silently disabled caching for ARN-based application inference profiles like arn:aws:bedrock:us-east-1:123456:application-inference-profile/abc123. Application inference profiles use opaque resource IDs that don't contain "claude" or "anthropic", so _cache_strategy's string-match returns None and logs a spurious warning, even when the profile is backed by Claude Sonnet.

The fix detects the application-inference-profile ARN pattern and calls GetInferenceProfile on the Bedrock management API to find the underlying foundation model. That model ARN does contain "anthropic" for Claude-backed profiles, so the existing string-match works correctly from there.

Two implementation notes:

  • GetInferenceProfile is on the bedrock management client, not bedrock-runtime. The fix stores the boto session in __init__ so the management client can be created on demand using the same credentials and region. It requires the bedrock:GetInferenceProfile IAM permission; on error it falls back to None and logs a debug message pointing to CacheConfig(strategy="anthropic") as an explicit workaround.
  • The resolved strategy is cached on the instance (one API call per model) and cleared by update_config when model_id changes.
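The memoization in the second note can be sketched like this (class and attribute names are assumptions, not the PR's exact code):

```python
# Sketch of per-instance caching of the resolved strategy, invalidated
# when update_config changes model_id. Names are illustrative.
class CachedStrategySketch:
    def __init__(self, model_id: str):
        self.model_id = model_id
        self._resolved_strategy = None   # cached lookup result
        self._resolved = False
        self.lookups = 0                 # counter for illustration only

    def _lookup(self):
        self.lookups += 1                # stands in for the API round-trip
        return "anthropic" if "anthropic" in self.model_id.lower() else None

    @property
    def cache_strategy(self):
        if not self._resolved:           # at most one lookup per model
            self._resolved_strategy = self._lookup()
            self._resolved = True
        return self._resolved_strategy

    def update_config(self, model_id: str):
        if model_id != self.model_id:    # invalidate on model change
            self.model_id = model_id
            self._resolved = False
            self._resolved_strategy = None

m = CachedStrategySketch("us.anthropic.claude-sonnet-4-20250514-v1:0")
assert m.cache_strategy == "anthropic"
assert m.cache_strategy == "anthropic"
assert m.lookups == 1                    # second access served from cache
m.update_config("meta.llama3-70b-instruct-v1:0")
assert m.cache_strategy is None          # invalidated and re-resolved
assert m.lookups == 2
```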

_should_include_tool_result_status has the same string-match issue for application inference profiles. Lower impact, but I can address that in a follow-up if useful.

Related Issues

Closes #2233

Documentation PR

No docs change needed — the public API is unchanged.

Type of Change

Bug fix

Testing

Six new tests in test_bedrock.py:

  • Claude-backed profile resolves to "anthropic"; verifies the bedrock management client is used (not bedrock-runtime)
  • Non-Claude-backed profile returns None
  • GetInferenceProfile failure handled gracefully with no exception raised
  • GetInferenceProfile called only once per instance across multiple _cache_strategy accesses
  • Cached result cleared when update_config changes model_id
  • End-to-end: cachePoint injected into messages when profile resolves to Claude with strategy="auto"
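A self-contained sketch of the first test above, using a mocked management client instead of real AWS calls (the resolver here is a stand-in, not the package's real code):

```python
from unittest.mock import MagicMock

# Stand-in resolver mirroring the behavior under test (illustrative only).
def resolve_strategy(model_id, bedrock_client):
    if ":application-inference-profile/" not in model_id:
        return "anthropic" if "anthropic" in model_id.lower() else None
    try:
        resp = bedrock_client.get_inference_profile(
            inferenceProfileIdentifier=model_id)
        model_arn = resp["models"][0]["modelArn"]
        return "anthropic" if "anthropic" in model_arn.lower() else None
    except Exception:
        return None

def test_claude_backed_profile_resolves_to_anthropic():
    client = MagicMock()
    client.get_inference_profile.return_value = {
        "models": [{"modelArn": "arn:aws:bedrock:us-east-1::foundation-model/"
                                "anthropic.claude-sonnet-4-20250514-v1:0"}]}
    arn = "arn:aws:bedrock:us-east-1:123456:application-inference-profile/abc123"
    assert resolve_strategy(arn, client) == "anthropic"
    # Verify the management operation was hit with the profile ARN.
    client.get_inference_profile.assert_called_once_with(
        inferenceProfileIdentifier=arn)

test_claude_backed_profile_resolves_to_anthropic()
```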

163 bedrock tests pass. Full suite: 2873 passed, 4 skipped. hatch run prepare is clean.

  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Zelys-DFKH and others added 5 commits May 6, 2026 15:19
…nce profiles

CacheConfig(strategy="auto") silently skipped caching when model_id was an ARN-based
application inference profile (e.g. arn:aws:bedrock:...:application-inference-profile/...)
because the opaque profile ID contains neither "claude" nor "anthropic".

Fix: call GetInferenceProfile on first use to discover the underlying foundation model ARN,
then run the existing detection on that ARN. Result is cached on the instance to avoid
repeated API calls per request; invalidated when update_config changes the model_id.

Errors from GetInferenceProfile (e.g. missing IAM permission) are caught and logged at
DEBUG level so the caller can fall back to CacheConfig(strategy="anthropic") explicitly.

Closes strands-agents#2233

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
bedrock-runtime does not expose GetInferenceProfile; it is only available
on the bedrock management API. Store the boto session in __init__ so the
management client can be created on demand with the correct region and
credentials. Add a test that verifies the management service name is used.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ation_inference_profile_strategy

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace fragile string-parsing with direct kwargs.get("service_name") check.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


Development

Successfully merging this pull request may close these issues.

[BUG] CacheConfig(strategy="auto") does not detect Claude models when using ARN-based inference profiles
