Problem Statement
When using BedrockModel with an Application Inference Profile (AIP) ARN and cache_points enabled, caching is silently disabled because the SDK cannot determine the underlying model from the AIP ARN.
```python
from strands.models import BedrockModel

model = BedrockModel(
    model_id="arn:aws:bedrock:eu-west-1:123456789012:application-inference-profile/my-profile-id",
    cache_points=["system", "tools"],
)
```

```
WARNING strands.models.bedrock:bedrock.py:449
model_id=arn:aws:bedrock:eu-west-1:123456789012:application-inference-profile/my-profile-id
  | cache_config is enabled but this model does not support caching
```
The same configuration works correctly with a cross-region inference profile because the model identifier is embedded in the ARN:
```python
model = BedrockModel(
    model_id="arn:aws:bedrock:us-east-1:123456789012:inference-profile/us.anthropic.claude-haiku-4-5-20251001-v1:0",
    cache_points=["system", "tools"],
)
# Works: caching is active
```
Root Cause
The caching strategy implemented in PR #1438 (as part of #1432) requires identifying the model family before enabling caching. The model detection logic parses the model_id string to extract the model identifier. This works for base model IDs and cross-region inference profiles where the model name is visible in the ARN, but fails for Application Inference Profiles where the ARN contains only a custom profile identifier (e.g., application-inference-profile/my-profile-id).
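To illustrate why string parsing is a dead end for AIPs, here is a sketch of the general approach (not the SDK's actual detection code; `extract_model_name` is an illustrative name):

```python
def extract_model_name(model_id: str) -> str | None:
    """Best-effort model-name extraction from a Bedrock model_id (illustrative)."""
    if not model_id.startswith("arn:"):
        # Plain base model ID, e.g. "anthropic.claude-haiku-4-5-20251001-v1:0"
        return model_id
    resource_type, _, resource_id = model_id.split(":", 5)[5].partition("/")
    if resource_type in ("foundation-model", "inference-profile"):
        # Cross-region profile: the model name is embedded in the resource ID,
        # e.g. "us.anthropic.claude-haiku-4-5-20251001-v1:0"
        return resource_id
    # application-inference-profile/<opaque-id>: nothing to extract
    return None
```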
Proposed Solution
When the model_id is an Application Inference Profile ARN and cannot be resolved through string parsing, the SDK should call the Bedrock GetInferenceProfile API to resolve the underlying model:
```python
import boto3

def _resolve_model_from_aip(aip_arn: str, region: str) -> str:
    """Resolve the underlying model from an Application Inference Profile ARN."""
    bedrock = boto3.client("bedrock", region_name=region)
    response = bedrock.get_inference_profile(inferenceProfileIdentifier=aip_arn)
    models = response.get("models", [])
    if models:
        # The first entry is enough to identify the underlying model family.
        return models[0].get("modelArn", "")
    return ""
```
The resolved model ARN would be used solely for the caching capability check. The original AIP ARN would continue to be passed as the modelId in Converse API calls to preserve cost attribution.
API reference: [GetInferenceProfile](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetInferenceProfile.html)
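A minimal sketch of how this could be wired in, assuming a hypothetical capability-check helper (`_model_supports_caching` and `model_id_for_capability_check` below are illustrative names, not existing SDK symbols):

```python
AIP_MARKER = ":application-inference-profile/"

def model_id_for_capability_check(model_id: str, region: str) -> str:
    """Return the identifier used only for the caching capability check."""
    if AIP_MARKER in model_id:
        resolved = _resolve_model_from_aip(model_id, region)
        if resolved:
            return resolved  # foundation-model ARN with the model name visible
    return model_id

# Capability detection sees the resolved model...
caching_supported = _model_supports_caching(  # hypothetical helper
    model_id_for_capability_check(model_id, region)
)
# ...while Converse keeps the original AIP ARN as modelId, preserving
# cost attribution and per-profile quotas.
```

Since the AIP-to-model mapping is stable, the lookup could be done once per BedrockModel instance and cached, avoiding an extra API call per request.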
Use Case
Why This Matters
Application Inference Profiles are used for granular cost attribution, per-team IAM access control, and per-profile quota management. They are essential for organizations that need to track inference costs by team, project, or customer. The current behavior forces users to choose between:
- AIP (cost tracking, access control) — but no prompt caching
- Cross-region inference profile (prompt caching works) — but no granular cost tracking
I have verified that the Amazon Bedrock Converse API fully supports prompt caching with AIP ARNs when cachePoint blocks are included directly in the request. The limitation is solely in the SDK's client-side model detection logic, which cannot extract the model family from an AIP ARN and therefore defaults to disabling caching.
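For reference, a minimal sketch of that verification using boto3 directly (the ARN is the placeholder from above, and the system text must exceed the model's minimum cacheable prefix length for a cache write to occur):

```python
import boto3

runtime = boto3.client("bedrock-runtime", region_name="eu-west-1")
system_prompt = "You are a helpful assistant. " * 200  # pad past the caching minimum

response = runtime.converse(
    # The AIP ARN is passed straight through as modelId.
    modelId="arn:aws:bedrock:eu-west-1:123456789012:application-inference-profile/my-profile-id",
    system=[
        {"text": system_prompt},
        {"cachePoint": {"type": "default"}},  # cache everything before this block
    ],
    messages=[{"role": "user", "content": [{"text": "Hello"}]}],
)
# Non-zero cacheWriteInputTokens (first call) or cacheReadInputTokens
# (subsequent calls) in the usage block confirm the cache point was applied.
print(response["usage"])
```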
Alternative Solutions
No response
Additional Context
No response