Skip to content

Add support to specify service_tier keyword for supported models #157

@khanrubd

Description

@khanrubd

Is your feature request related to a problem? Please describe.
Currently, token consumption is on-demand at the default tier, and users cannot specify flex (slower, cheaper), priority (faster, pricier) or provisioned tiers. The request is to add an option to specify service tiers.

Describe the solution you'd like
Ideally, end users should be able to select from the UI (or as cli option) the service tier they want to use. This can also be be turned into a deployment time selection, although that would run into issues around not all models being compatible with all service tiers.

Describe alternatives you've considered
Looked into whether this can be set at Bedrock service level or at AWS account level (e.g., all token consumption defaults to flex tier where available). Does not look like it is viable yet. Only docs I could find talk about limiting access to certain tiers etc. (https://docs.aws.amazon.com/bedrock/latest/userguide/security_iam_id-based-policy-examples-agent.html#security_iam_id-based-policy-examples-service-tiers)

Additional context
Amazon Bedrock runtime APIs now support specifying am optional service_tier keyword (https://aws.amazon.com/about-aws/whats-new/2025/11/amazon-bedrock-priority-flex-inference-service-tiers/). Specifically for batch mode IDP jobs, the "Flex Tier" (https://docs.aws.amazon.com/bedrock/latest/userguide/service-tiers-inference.html#w2aac21c29c11)" may be an attractive option, and it makes a real difference. For example, Nova 2 Pro at the default (standard) tier is pricier than Nova 1 Pro, but is cheaper at the Flex Tier. Likewise, Nova 2 Lite price varies by 2x (https://aws.amazon.com/nova/pricing/) across standard and flex tiers.

(So far I can tell, service_tier is an optional keyword for the Bedrock runtime SDKs / APIs, unsure about BDA though.)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions