Commit 33b4fcb: Update SageMaker to SageMaker AI
Parent: c02a604

1 file changed: +3 −3 lines


README.md: 3 additions & 3 deletions
```diff
@@ -35,7 +35,7 @@ If you are unfamiliar with LiteLLM, it provides a consistent interface to access
 1. Tenants and client applications access the LiteLLM gateway proxy API through the [Amazon Route 53](https://aws.amazon.com/route53/) URL endpoint or [Amazon CloudFront](https://aws.amazon.com/cloudfront/) distribution, which is protected against common web exploits and bots using [AWS Web Application Firewall (WAF)](https://aws.amazon.com/waf/).
 2. AWS WAF forwards requests to an [Application Load Balancer (ALB)](https://aws.amazon.com/elasticloadbalancing/application-load-balancer/) to automatically distribute incoming application traffic to [Amazon Elastic Container Service (ECS)](https://aws.amazon.com/ecs/) tasks or [Amazon Elastic Kubernetes Service (EKS)](https://aws.amazon.com/eks/) pods running generative AI gateway containers. TLS/SSL encryption secures traffic using a certificate issued by [AWS Certificate Manager (ACM)](https://aws.amazon.com/certificate-manager/).
 3. Container images for the API/middleware and LiteLLM applications are built during guidance deployment and pushed to [Amazon Elastic Container Registry (ECR)](http://aws.amazon.com/ecr/). They are used for deployment to Amazon ECS on AWS Fargate or Amazon EKS clusters, which run these applications as containers in ECS tasks or EKS pods, respectively. LiteLLM provides a unified application interface for configuring and interacting with LLM providers. The API/middleware integrates natively with [Amazon Bedrock](https://aws.amazon.com/bedrock/) to enable features not supported by the [LiteLLM open-source project](https://docs.litellm.ai/).
-4. Models hosted on [Amazon Bedrock](https://aws.amazon.com/bedrock/) and [Amazon Nova](https://aws.amazon.com/ai/generative-ai/nova/) provide model access, guardrails, prompt caching, and routing to enhance the AI gateway and additional controls for clients through a unified API. Model access is also available for models deployed on [Amazon SageMaker](https://aws.amazon.com/sagemaker/). [Access to required Amazon Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) must be properly configured.
+4. Models hosted on [Amazon Bedrock](https://aws.amazon.com/bedrock/) and [Amazon Nova](https://aws.amazon.com/ai/generative-ai/nova/) provide model access, guardrails, prompt caching, and routing to enhance the AI gateway and additional controls for clients through a unified API. Model access is also available for models deployed on [Amazon SageMaker AI](https://aws.amazon.com/sagemaker-ai/). [Access to required Amazon Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) must be properly configured.
 5. External model providers (such as OpenAI, Anthropic, or Vertex AI) are configured using the LiteLLM Admin UI to enable additional model access through LiteLLM's unified application interface. Integrate pre-existing configurations of third-party providers into the gateway using LiteLLM APIs.
 6. LiteLLM integrates with [Amazon ElastiCache (Redis OSS)](https://aws.amazon.com/elasticache/), [Amazon Relational Database Service (RDS)](https://aws.amazon.com/rds/), and [AWS Secrets Manager](https://aws.amazon.com/secrets-manager/). Amazon ElastiCache enables multi-tenant distribution of application settings and prompt caching. Amazon RDS persists virtual API keys and other configuration settings provided by LiteLLM. Secrets Manager securely stores external model provider credentials and other sensitive settings.
 7. LiteLLM and the API/middleware applications send logs to a dedicated [Amazon S3](https://aws.amazon.com/s3) bucket for troubleshooting and access analysis.
```
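The unified interface described in steps 3 and 4 is driven by LiteLLM's model-list configuration, where Bedrock models and SageMaker AI endpoints can be registered side by side. A minimal sketch of such a config (the model IDs, aliases, and endpoint name below are placeholders, not values shipped with this guidance):

```yaml
# Hypothetical LiteLLM proxy config fragment; aliases and the
# SageMaker AI endpoint name are illustrative placeholders.
model_list:
  - model_name: claude-sonnet              # alias clients request via the gateway
    litellm_params:
      model: bedrock/anthropic.claude-3-sonnet-20240229-v1:0
  - model_name: my-sagemaker-model         # alias for a SageMaker AI-hosted model
    litellm_params:
      model: sagemaker/my-endpoint-name    # "sagemaker/" prefix routes to a SageMaker AI endpoint
```

Clients then address either model by its alias through the same OpenAI-compatible gateway API, regardless of which backend serves it.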
```diff
@@ -192,7 +192,7 @@ This provides a robust defense against direct ALB access even if someone discove
 | **AWS Service** | **Role** | **Description** |
 | --------------- | -------- | --------------- |
 | [Amazon Bedrock](https://aws.amazon.com/bedrock/) | Core service | Manages single-API access to multiple foundation models |
-| [Amazon SageMaker](https://aws.amazon.com/sagemaker/) | Core service | Manages access to any foundation model deployed on Amazon SageMaker |
+| [Amazon SageMaker AI](https://aws.amazon.com/sagemaker-ai/) | Core service | Manages access to any foundation model deployed on Amazon SageMaker AI |
 | [Amazon Elastic Container Service](https://aws.amazon.com/ecs/) (ECS) | Core service | Manages application platform and on-demand infrastructure for LiteLLM container orchestration. |
 | [Amazon Elastic Kubernetes Service](https://aws.amazon.com/eks/) (EKS) | Core service | Manages Kubernetes control plane and compute nodes for LiteLLM container orchestration. |
 | [Amazon Elastic Compute Cloud](https://aws.amazon.com/ec2/) (EC2) | Core service | Provides compute instances for EKS compute nodes and runs containerized applications. |
```
```diff
@@ -223,7 +223,7 @@ When implementing this guidance on AWS, it's important to understand the various
 
 The total cost of running this solution can be broadly categorized into two main components:
 
-1. **LLM Provider Costs**: These are the charges incurred for using services from LLM providers such as Amazon Bedrock, Amazon SageMaker, Anthropic, and others. Each provider has its own pricing model, typically based on factors like the number of tokens processed, model complexity, and usage volume.
+1. **LLM Provider Costs**: These are the charges incurred for using services from LLM providers such as Amazon Bedrock, Amazon SageMaker AI, Anthropic, and others. Each provider has its own pricing model, typically based on factors like the number of tokens processed, model complexity, and usage volume.
 
 2. **AWS Infrastructure Costs**: These are the costs associated with running the Gen AI Gateway proxy server on AWS infrastructure. This includes various AWS services and resources used to host and operate the solution.
```