fix(bedrock): resolve cache strategy for ARN-based application inference profiles #2257
Open
Zelys-DFKH wants to merge 5 commits into strands-agents:main
…nce profiles
CacheConfig(strategy="auto") silently skipped caching when model_id was an ARN-based application inference profile (e.g. arn:aws:bedrock:...:application-inference-profile/...) because the opaque profile ID contains neither "claude" nor "anthropic".
Fix: call GetInferenceProfile on first use to discover the underlying foundation model ARN, then run the existing detection on that ARN. The result is cached on the instance to avoid repeated API calls per request, and is invalidated when update_config changes the model_id. Errors from GetInferenceProfile (e.g. missing IAM permission) are caught and logged at DEBUG level so the caller can fall back to CacheConfig(strategy="anthropic") explicitly.
Closes strands-agents#2233
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
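The detection flow this commit message describes could be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `match_strategy`, `detect_cache_strategy`, and the `resolve_profile` callable are hypothetical names, and the case-insensitive match is an assumption.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)

# Substring that marks an application inference profile ARN.
PROFILE_MARKER = "application-inference-profile"


def match_strategy(identifier: str) -> Optional[str]:
    """Stand-in for the existing string match (assumed case-insensitive here)."""
    lowered = identifier.lower()
    if "claude" in lowered or "anthropic" in lowered:
        return "anthropic"
    return None


def detect_cache_strategy(
    model_id: str,
    resolve_profile: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return "anthropic" or None for a model ID or profile ARN.

    resolve_profile is a hypothetical callable that maps a profile ARN to
    the underlying foundation model ARN (e.g. via GetInferenceProfile).
    """
    if PROFILE_MARKER not in model_id:
        return match_strategy(model_id)
    try:
        underlying_arn = resolve_profile(model_id)
    except Exception as exc:  # e.g. missing IAM permission
        logger.debug(
            "could not resolve %s (%s); consider CacheConfig(strategy='anthropic')",
            model_id,
            exc,
        )
        return None
    return match_strategy(underlying_arn) if underlying_arn else None
```

Passing the resolver in as a callable keeps the sketch testable without AWS credentials.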
bedrock-runtime does not expose GetInferenceProfile; it is only available on the bedrock management API. Store the boto session in __init__ so the management client can be created on demand with the correct region and credentials. Add a test that verifies the management service name is used.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
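As a rough sketch of the lookup this commit describes, assuming `session` is a `boto3.Session` (or anything exposing the same `client(...)` method). The function name `resolve_underlying_model_arn` is hypothetical, and taking the first entry of `models` is an assumption about how the PR picks the model.

```python
from typing import Any, Optional


def resolve_underlying_model_arn(
    session: Any,
    profile_arn: str,
    region_name: Optional[str] = None,
) -> Optional[str]:
    """Look up the foundation model ARN behind an application inference profile.

    The client is created for the "bedrock" management service, because
    "bedrock-runtime" does not expose GetInferenceProfile.
    """
    client = session.client(service_name="bedrock", region_name=region_name)
    response = client.get_inference_profile(inferenceProfileIdentifier=profile_arn)
    models = response.get("models", [])
    # Assumption: the first listed model is representative of the profile.
    return models[0].get("modelArn") if models else None
```

With a real session, the caller needs the bedrock:GetInferenceProfile IAM permission for this call to succeed.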
…ation_inference_profile_strategy Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace fragile string-parsing with direct kwargs.get("service_name") check.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Description
CacheConfig(strategy="auto") silently disabled caching for ARN-based application inference profiles like arn:aws:bedrock:us-east-1:123456:application-inference-profile/abc123. Application inference profiles use opaque resource IDs that don't contain "claude" or "anthropic", so _cache_strategy's string match returns None and logs a spurious warning, even when the profile is backed by Claude Sonnet.
The fix detects the application-inference-profile ARN pattern and calls GetInferenceProfile on the Bedrock management API to find the underlying foundation model. That model ARN does contain "anthropic" for Claude-backed profiles, so the existing string match works correctly from there.
Two implementation notes:
- GetInferenceProfile is on the bedrock management client, not bedrock-runtime. The fix stores the boto session in __init__ so the management client can be created on demand using the same credentials and region. It requires the bedrock:GetInferenceProfile IAM permission; on error it falls back to None and logs a debug message pointing to CacheConfig(strategy="anthropic") as an explicit workaround.
- The resolved strategy is cached on the instance, and the cache is invalidated in update_config when model_id changes.
_should_include_tool_result_status has the same string-match issue for application inference profiles. Lower impact, but I can address that in a follow-up if useful.
Related Issues
Closes #2233
Documentation PR
No docs change needed — the public API is unchanged.
Type of Change
Bug fix
Testing
Six new tests in test_bedrock.py:
- Claude-backed profile resolves to "anthropic"; verifies the bedrock management client is used (not bedrock-runtime)
- Non-Claude profile resolves to None
- GetInferenceProfile failure handled gracefully with no exception raised
- GetInferenceProfile called only once per instance across multiple _cache_strategy accesses
- Cached result invalidated when update_config changes model_id
- cachePoint injected into messages when profile resolves to Claude with strategy="auto"
163 bedrock tests pass. Full suite: 2873 passed, 4 skipped.
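The once-per-instance and invalidation checks could be exercised along these lines. This is a simplified sketch rather than the tests in the PR: `ProfileStrategyCache` is a hypothetical stand-in for the real model class, and the second profile ARN is made up for illustration.

```python
from typing import Callable, Optional
from unittest.mock import Mock

_UNRESOLVED = object()  # sentinel, because None is itself a valid cached result


class ProfileStrategyCache:
    """Hypothetical stand-in for the caching behaviour, not the real model class."""

    def __init__(self, model_id: str, resolver: Callable[[str], Optional[str]]):
        self.model_id = model_id
        self._resolver = resolver
        self._resolved = _UNRESOLVED

    @property
    def cache_strategy(self) -> Optional[str]:
        # Resolve on first access only; later accesses reuse the stored value.
        if self._resolved is _UNRESOLVED:
            self._resolved = self._resolver(self.model_id)
        return self._resolved

    def update_config(self, model_id: str) -> None:
        # Drop the cached result only when the model ID actually changes.
        if model_id != self.model_id:
            self.model_id = model_id
            self._resolved = _UNRESOLVED


def count_resolver_calls() -> int:
    """Access the strategy twice, change the model ID, access once more."""
    resolver = Mock(return_value="anthropic")
    cache = ProfileStrategyCache(
        "arn:aws:bedrock:us-east-1:123456:application-inference-profile/abc123",
        resolver,
    )
    cache.cache_strategy
    cache.cache_strategy
    assert resolver.call_count == 1  # second access hit the cache
    # Hypothetical second profile ARN, used only to trigger invalidation.
    cache.update_config(
        "arn:aws:bedrock:us-east-1:123456:application-inference-profile/def456"
    )
    cache.cache_strategy
    return resolver.call_count
```

A sentinel is used instead of None so that a profile resolving to None (a non-Claude model) is also cached rather than looked up again on every access.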
hatch run prepare clean.
I ran hatch run prepare
Checklist