diff --git a/README.md b/README.md
index f0ab977..09682fd 100644
--- a/README.md
+++ b/README.md
@@ -1,11 +1,24 @@
 # Agentic AI Foundation - Generative AI Customer Experience Platform

-## Introduction
-
 The CX Agent is an intelligent customer experience platform built on **LangGraph** and designed for deployment on **AWS Bedrock AgentCore Runtime**. This agentic AI solution leverages multiple generative AI foundations - including an LLM gateway, observability, and guardrails - to deliver sophisticated customer service capabilities through a conversational interface.

-![alt text](assets/agent01.png)
+![](assets/agent01.png "Screenshot of the Streamlit-based chat UI, showing controls for configuring where the backend agent is hosted and for sending feedback on AI responses received in the conversation.")
+
+
+## Table of Contents
+
+1. [Overview](#overview)
+1. [Foundational Components](#foundational-components)
+1. [Prerequisites](#prerequisites)
+1. [Deployment Steps](#deployment-steps)
+1. [Deployment Validation](#deployment-validation)
+1. [Running the Sample](#running-the-sample)
+1. [Next Steps](#next-steps)
+    1. [Local Development Workflow](#local-development-workflow)
+    1. [Evaluation](#evaluation)
+1. [Cleanup](#cleanup)
+
 ## High Level Architecture

@@ -13,14 +26,13 @@ For many of these AI platform capabilities, there are multiple alternative techn
 In this sample we've tried to choose tools that are popular with our customers, and to keep the code simple (avoiding extra abstraction layers) - so switching components where needed would require some effort, but should be reasonably straightforward. The overall architecture is shown below:

-![alt text](assets/platform_arch.jpg)
-
+![](assets/platform_arch.jpg "Architecture overview diagram. 
Components include a (local) Streamlit application; Langfuse; Amazon Cognito; Amazon Bedrock AgentCore; Amazon Bedrock Guardrails; Amazon Bedrock Knowledge Base (backed by OpenSearch Serverless, loaded with data from Amazon S3); Amazon Bedrock Foundation Models called via a GenAI Gateway; Amazon CloudWatch for observability; and third-party external services including Tavily and Zendesk.")

**A Sample End-to-End User Interaction Flow**

-![End-2-end user flow](assets/sample_sequence_diagram.png)
+![](assets/sample_sequence_diagram.png "Sequence diagram: Users log in and enter a question in the Streamlit frontend app, which sends a POST request to the '/invocations' endpoint of an AgentCore Runtime-deployed LangGraph agent. The agent starts a tracing span in Langfuse, then validates the input with Amazon Bedrock Guardrails before starting initial LLM processing of the prompt via the GenAI Gateway. The LLM returns tool call request(s), which the agent orchestrates via the Bedrock AgentCore Gateway to the relevant provider: Bedrock Knowledge Base, Tavily Web Search, Zendesk API, or AWS Lambda. After the tool call(s), the agent makes another LLM call to the GenAI Gateway to generate the final response. This final response is again checked with Amazon Bedrock Guardrails, and the agent ends the tracing span in Langfuse with metadata and metrics - before finally returning the response to the Streamlit app and thereby the user.")

-## Generative AI Foundations
+## Foundational Components

Strong foundational or "platform" capabilities increase the speed and success rate of generative and agentic AI projects. This sample demonstrates a customer service agent integrating several of these capabilities:

@@ -37,7 +49,7 @@ Centralized model management and routing system that provides:

Refer to the prerequisites section to deploy your Generative AI Gateway on AWS. 
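The gateway exposes an OpenAI-compatible API (as LiteLLM proxies do), so once it's deployed you can smoke-test it with a plain HTTP request before wiring it into the agent. A minimal sketch using only the Python standard library - the gateway URL, API key, and model name below are placeholders for your own values, and the request is only actually sent when a real `GATEWAY_URL` environment variable is set:

```python
import json
import os
import urllib.request


def build_chat_request(gateway_url: str, api_key: str, model: str, prompt: str):
    """Build an OpenAI-style chat-completions request for the LiteLLM proxy."""
    url = f"{gateway_url.rstrip('/')}/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, headers, body


if __name__ == "__main__":
    # Placeholder values - substitute your own gateway URL, `sk-...` key, and model.
    url, headers, body = build_chat_request(
        os.environ.get("GATEWAY_URL", "https://example.cloudfront.net"),
        os.environ.get("GATEWAY_API_KEY", "sk-placeholder"),
        "anthropic.claude-3-haiku-20240307-v1:0",
        "Hello!",
    )
    if "GATEWAY_URL" in os.environ:  # only call out when a real gateway is configured
        req = urllib.request.Request(url, data=json.dumps(body).encode(), headers=headers)
        with urllib.request.urlopen(req) as resp:
            print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

If the call succeeds, the response follows the OpenAI chat-completions shape, with the generated text under `choices[0].message.content`.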
-![alt text](assets/llmgateway01.png)
+![](assets/llmgateway01.png "Screenshot of the LiteLLM dashboard's tracking capabilities, showing graphs of token usage over time; requests per day; spend per day; and successful vs. failed requests over time.")

### Observability
We combine [Amazon Bedrock AgentCore Observability](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/observability-configure.html) with [Langfuse](https://langfuse.com/) (Open Source Edition deployed on AWS as shown [here](https://github.com/aws-samples/amazon-bedrock-samples/tree/main/evaluation-observe)) to collect and analyze detailed telemetry from the agent as it runs. This integration provides:

@@ -49,7 +61,7 @@ We combine [Amazon Bedrock AgentCore Observability](https://docs.aws.amazon.com/
 Refer to the prerequisites section to self-host your Langfuse platform.

-![alt text](assets/langfuse01.png)
+![](assets/langfuse01.png "Screenshot of the Langfuse UI, showing a nested trace of a multi-step agentic answer generation process including LLM calls, tool calls, and logic steps. 
The overall parent span is selected, and a detail pane shows the overall input and output of the request as well as estimated cost, latency, and other metadata.")

### Guardrails
We use [Amazon Bedrock Guardrails](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html) to detect and intervene on issues with incoming user messages and outgoing agent responses before they're processed - including:

@@ -59,75 +71,229 @@ We use [Amazon Bedrock Guardrails](https://docs.aws.amazon.com/bedrock/latest/us
 Configure your own Bedrock guardrail and apply it to your agentic application with this [workshop](https://catalog.workshops.aws/bedrockguard/en-US).

-![alt text](assets/guardrail01.png)
+![](assets/guardrail01.png "Screenshot of a chat showing the agent declining to provide investment advice when asked for it by the user.")
+

## Prerequisites

-Explore the README.md file under the **infra** directory for more details on the deployment.
+### AWS Account Requirements

+You'll need access to your target AWS account with broad/administrative permissions - including, for example, permission to create IAM roles and policies, and to deploy and manage all the types of AWS resources used in the architecture.

-## Usage
+### Separately-developed and Third-party tools

-### Test locally
This sample connects with multiple separately-developed samples and solutions from AWS, as detailed in the deployment steps below.

-1. **Initialize the project and dependencies**:
-```bash
-cd cx-agent-backend
-uv venv
-uv sync --all-extras --frozen
There are also integrations with a range of third-party services, but these are all **optional**.
+
+### Development Environment and Operating System
+
+These deployment instructions include shell commands that are optimized for **macOS or Linux**. In general, deployment should also be possible from Windows Subsystem for Linux, but this hasn't been thoroughly tested. Proposals are welcome for documentation or code updates to help with this! 
+ +Your development environment will need: + - The [AWS CLI](https://aws.amazon.com/cli/) installed and [configured](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html) with your AWS access credentials + - [Terraform](https://developer.hashicorp.com/terraform/install) or [OpenTofu](https://opentofu.org/) installed + - (Recommended) [uv](https://docs.astral.sh/uv/getting-started/installation/) or (alternatively if you're comfortable managing virtual environments yourself) some other [Python](https://www.python.org/downloads/) environment + - Either [Docker Desktop](https://www.docker.com/products/docker-desktop/) or some other compatible container image build tool, like [Finch](https://runfinch.com/) + +> ℹ️ **Note:** We suggest using [mise-en-place](https://mise.jdx.dev/) to manage installation of Terraform and UV in case you need to work across projects using different versions of them (or many other tools). You can find our suggested versions in [mise.toml](./mise.toml). + + +## Deployment Steps + +In these instructions, sub-steps (2a, 2b, 2c, etc.) can generally be performed in parallel. + +### Step 1: Configuration + +To get started, copy the provided [infra/terraform.tfvars.example](infra/terraform.tfvars.example) skeleton file to [infra/terraform.tfvars](infra/terraform.tfvars). This will be the main place to configure your deployment. + +Review the *"core"* configurations and make changes as needed - but the default values should be good for most cases. + +For the *"external"* component configurations, see the sections below. + +Finally, note that in [infra/terraform.tf](infra/terraform.tf) we have configured Terraform's [backend state storage](https://developer.hashicorp.com/terraform/language/backend) to Amazon S3: + +```tf +terraform { + backend "s3" { + encrypt = true + key = "sample-agentic-ai-foundation.tfstate" + } +} ``` -2. 
**Run locally**:
+
+By default, you'll be prompted to provide an existing Amazon S3 Bucket name when deploying the Terraform solution later. You can [create a bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html) to use for this purpose - or, if you'd rather use Terraform's [local filesystem backend](https://developer.hashicorp.com/terraform/language/backend), simply delete or comment out the `backend {...}` block.
+
+
+### Step 2a: GenAI Model Gateway
+
+You'll need to deploy your AI model gateway, for which we recommend the LiteLLM-based [Guidance for a Multi-Provider Generative AI Gateway on AWS](https://aws.amazon.com/solutions/guidance/multi-provider-generative-ai-gateway-on-aws/).
+
+Once your gateway is deployed, log in to its admin UI and create an API key for the solution to invoke Foundation Models.
+
+In your [infra/terraform.tfvars](infra/terraform.tfvars) file, update:
+- `gateway_url` to your Gateway URL (probably something like `https://{...}.cloudfront.net`)
+- `gateway_api_key` to your API key (`sk-{...}`)
+
+
+### Step 2b: (Optional) Langfuse for tracing
+
+If you'd like to self-host Langfuse Open Source Edition, refer to [this sample](https://github.com/awslabs/amazon-bedrock-agent-samples/tree/main/examples/agent_observability/deploy-langfuse-on-ecs-fargate-with-typescript-cdk) for guidance on deploying it on AWS.
+
+Alternatively, you could sign up for Langfuse's own [cloud-based service](https://cloud.langfuse.com/auth/sign-up).
+
+Whether self-hosting or using Langfuse Cloud, you'll also need to create a project and an associated [API key pair](https://langfuse.com/faq/all/where-are-langfuse-api-keys) to store traces for the agent. 
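Before wiring the key pair into Terraform, you can optionally sanity-check it. The Langfuse SDK's `auth_check()` is the supported way to do this; the stdlib sketch below makes a similar authenticated call directly, *assuming* the public REST API's `/api/public/projects` route and HTTP Basic auth with the public key as username and the secret key as password. The host and keys are placeholders, and the request is only sent when a real `LANGFUSE_HOST` environment variable is set:

```python
import base64
import json
import os
import urllib.request


def basic_auth_header(public_key: str, secret_key: str) -> str:
    """Langfuse API keys authenticate via HTTP Basic auth: base64('pk:sk')."""
    token = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    return f"Basic {token}"


if __name__ == "__main__" and os.environ.get("LANGFUSE_HOST"):
    # Assumption: the authenticated projects route validates the key pair;
    # prefer the official SDK's auth_check() where available.
    req = urllib.request.Request(
        f"{os.environ['LANGFUSE_HOST'].rstrip('/')}/api/public/projects",
        headers={
            "Authorization": basic_auth_header(
                os.environ["LANGFUSE_PUBLIC_KEY"], os.environ["LANGFUSE_SECRET_KEY"]
            )
        },
    )
    with urllib.request.urlopen(req) as resp:
        print("Key pair OK:", json.loads(resp.read()))
```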
+
+In your [infra/terraform.tfvars](infra/terraform.tfvars) file, update:
+- `langfuse_host` to where your Langfuse is hosted (something like `https://{...}.cloudfront.net` if self-hosting Langfuse based on the above sample)
+- `langfuse_public_key` to your public Langfuse API key (`pk-{...}`)
+- `langfuse_secret_key` to your secret Langfuse API key (`sk-{...}`)
+
+
+### Step 2c: (Optional) Tavily for web search
+
+To enable your agent to search the web, you can sign up for [Tavily](https://www.tavily.com/) and add your `tavily_api_key` to [infra/terraform.tfvars](infra/terraform.tfvars).
+
+
+### Step 2d: (Optional) Zendesk ticketing
+
+If you have a Zendesk environment, you can enable your agent to log and manage tickets by setting up an [API token](https://support.zendesk.com/hc/en-us/articles/4408889192858-Managing-API-token-access-to-the-Zendesk-API). In [infra/terraform.tfvars](infra/terraform.tfvars), configure your `zendesk_domain`, `zendesk_email`, and `zendesk_api_token`.
+
+
+### Step 3: Deploy the AI Foundation
+
+Once your tfvars configuration is set up and any external components you want to connect are configured, you can deploy the core solution from your terminal via Terraform, as follows.
+
+First, open your terminal in the [infra/](./infra) folder:
+
+```sh
+cd infra
+```
+
+Next, initialize Terraform:
+
+```sh
+terraform init
+```
+
+> ℹ️ If you're prompted for an S3 Bucket here, refer back to **Step 1: Configuration** for guidance if needed.
+
+Once Terraform is initialized, you should be able to deploy the main solution infrastructure:
+
+```sh
+terraform apply
+```
+
+The output values shown by Terraform after deployment will include information like the unique IDs of the deployed user pool, Bedrock Guardrail, and other resources.
+
+
+### Step 4a: Ingest Knowledge Base Documentation
+
+If you have internal documents your agent should be able to search over to help generate answers for users, you can ingest them into the created Amazon Bedrock Knowledge Base. 
+ +First, copy your documents to the Amazon S3 data bucket deployed by the solution (using `terraform output` to look up attributes of the deployed infrastructure): + ```bash -uv run python -m cx_agent_backend +aws s3 cp your-documents/ s3://$(terraform output -raw s3_bucket_name)/ --recursive ``` -3. **Test the health endpoint**: + +Then once the documents are uploaded to S3, run the following command to sync the updates to your Knowledge Base: + ```bash -curl http://localhost:8080/ping +aws bedrock-agent start-ingestion-job \ + --knowledge-base-id $(terraform output -raw knowledge_base_id) \ + --data-source-id $(terraform output -raw data_source_id) ``` -4. **Test the agent endpoints**: + +This sync will run asynchronously and may take time to complete, especially for large corpora. You can check the progress of sync jobs via the [Amazon Bedrock Console](https://console.aws.amazon.com/bedrock/home?#/knowledge-bases) or the [GetIngestionJob API](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_GetIngestionJob.html). Once it completes successfully, your documents should be visible to your agent in search results. + + +### Step 4b: Create a Cognito User + +To be able to talk to your deployed agent, you'll need to create a user for yourself in the deployed Amazon Cognito User Pool. You can do this through the [Amazon Cognito Console](https://console.aws.amazon.com/cognito/v2/idp/user-pools), or through the CLI if you prefer. + +A basic example CLI script to create yourself a user and set a non-temporary password is provided below. You'll need to set `COGNITO_USERNAME` to your email address, and `COGNITO_PASSWORD` to a conformant password (8+ characters, containing upper- and lower-case characters and numbers and special characters). 
+ +> ⚠️ **Note:** These configurations are security-relevant, and the most appropriate settings will depend on your situation (applicable policies, whether you are generating credentials for yourself or somebody else, etc). +> +> In particular, the below commands are **not** recommended for setting up multiple users: Cognito can handle generating temporary credentials for you and sharing them privately to users' email addresses. For more detailed information, see the [Amazon Cognito developer guide](https://docs.aws.amazon.com/cognito/latest/developerguide/how-to-create-user-accounts.html). + ```bash -curl -X POST http://localhost:8080/api/v1/ \ - -H 'accept: application/json' \ - -H 'Content-Type: application/json' \ - -d '{"user_id": ""}' - -curl -X POST http://localhost:8080/invocations \ - -H "Content-Type: application/json" \ - -d '{"input": {"prompt": "Hello", "conversation_id": ""}}' - -curl -X POST http://localhost:8080/api/v1/invocations \ - -H "Content-Type: application/json" \ - -d '{ - "feedback": { - "run_id": "", - "session_id": "", - "score": 1.0, - "comment": "Great response!" - } - }' +# Edit these parameters: +COGNITO_USERNAME=TODO +COGNITO_PASSWORD=TODO + +# The temporary password hard-coded below will be overridden by the next cmd: +aws cognito-idp admin-create-user \ + --user-pool-id $(terraform output -raw user_pool_id) \ + --username $COGNITO_USERNAME \ + --temporary-password 'Day1Agentic!' + +# Set the permanent password: +aws cognito-idp admin-set-user-password \ + --user-pool-id $(terraform output -raw user_pool_id) \ + --username $COGNITO_USERNAME \ + --password $COGNITO_PASSWORD \ + --permanent +``` + + +## Deployment Validation + +Your Terraform apply in Deployment Step 3 should have completed successfully (usually with a message like "Apply complete!", depending on your CLI version). You can run `terraform apply` again to check this, or `terraform output` to see the deployed output parameters of your infrastructure. 
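As an additional check that authentication is wired up end to end, you can try fetching a Cognito access token for the user you created in Step 4b. When an app client has a client secret, the token request must include a `SECRET_HASH`: the base64-encoded HMAC-SHA256 of the username concatenated with the client ID, keyed by the client secret. A hedged sketch - the environment variable names are our own convention, and the `boto3` call is only attempted when real values are supplied:

```python
import base64
import hashlib
import hmac
import os


def secret_hash(username: str, client_id: str, client_secret: str) -> str:
    """Cognito SECRET_HASH: base64(HMAC-SHA256(client_secret, username + client_id))."""
    digest = hmac.new(
        client_secret.encode("utf-8"),
        (username + client_id).encode("utf-8"),
        hashlib.sha256,
    ).digest()
    return base64.b64encode(digest).decode("utf-8")


if __name__ == "__main__" and os.environ.get("COGNITO_CLIENT_ID"):
    # Assumes boto3 is installed and USER_PASSWORD_AUTH is enabled on the app client.
    import boto3

    client = boto3.client("cognito-idp")
    resp = client.initiate_auth(
        ClientId=os.environ["COGNITO_CLIENT_ID"],
        AuthFlow="USER_PASSWORD_AUTH",
        AuthParameters={
            "USERNAME": os.environ["COGNITO_USERNAME"],
            "PASSWORD": os.environ["COGNITO_PASSWORD"],
            "SECRET_HASH": secret_hash(
                os.environ["COGNITO_USERNAME"],
                os.environ["COGNITO_CLIENT_ID"],
                os.environ["COGNITO_CLIENT_SECRET"],
            ),
        },
    )
    print("Got access token:", resp["AuthenticationResult"]["AccessToken"][:16], "...")
```

A successful call confirms the user pool, app client, and password policy are all configured consistently.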
+
+
+## Running the Sample
+
+You can talk to your deployed agent either through the basic [chat_to_agentcore.ipynb](chat_to_agentcore.ipynb) Python notebook, or by running the provided Streamlit UI application for a more fully-featured testing experience.
+
+### Python notebook
+To test basic interaction with your agent through the Python notebook, you'll first need to install the required libraries in your local Python environment:
```bash
cd cx-agent-backend
uv venv
uv sync --all-extras --frozen
```
+
+Then, you can open [chat_to_agentcore.ipynb](chat_to_agentcore.ipynb) using your installed Python virtual environment (`cx-agent-backend/.venv`) and follow the provided steps.
+
+### Streamlit app
+
+To run the UI app locally, first install the required libraries and then start the Streamlit server:
+
```bash
cd cx-agent-frontend
uv venv
uv sync --frozen
uv run streamlit run src/app.py --server.port 8501 --server.address 127.0.0.1
```
-6. Access the web interface at `http://localhost:8501`

-### Bedrock AgentCore Deployment
+Once the server is running, you can access the web interface in your preferred browser at http://localhost:8501
+
+
+## Next Steps
+
+So far we've focused on the initial deployment and testing of the demo agent - but how would you use this sample to iteratively refine your own agents and evaluate how they're performing?
+
+
+### Local Development Workflow

-Refer to the **agentcore_runtime_deployment.ipynb** notebook to deploy your agent using [Bedrock AgentCore Runtime](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/agents-tools-runtime.html).
+You can find more guidance on how to configure and run your agent locally in [cx-agent-backend/README.md](/cx-agent-backend/README.md). The Streamlit UI app can run against both local and AgentCore-deployed agents. 
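When the backend is running locally, it listens on port 8080 per the AgentCore runtime contract, with a `/ping` health route and a `/invocations` endpoint, so you can smoke-test it without the UI. A minimal sketch using only the standard library; the payload shape mirrors the one used elsewhere in this sample, and the request is only sent when you opt in via an environment variable of our own naming (`RUN_LOCAL_SMOKE_TEST`):

```python
import json
import os
import urllib.request


def build_invocation(prompt: str, conversation_id: str) -> bytes:
    """JSON payload shape accepted by the agent's /invocations endpoint."""
    return json.dumps(
        {"input": {"prompt": prompt, "conversation_id": conversation_id}}
    ).encode()


def invoke_local(prompt: str, conversation_id: str = "local-test",
                 base: str = "http://localhost:8080"):
    """POST a prompt to a locally-running backend and return the parsed response."""
    req = urllib.request.Request(
        f"{base}/invocations",
        data=build_invocation(prompt, conversation_id),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())


if __name__ == "__main__" and os.environ.get("RUN_LOCAL_SMOKE_TEST"):
    # Requires the backend running locally (see cx-agent-backend/README.md).
    print(invoke_local("Hello, can you help me reset my router?"))
```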
-## Evaluation +Note that as configured by default, re-running `terraform apply` will replace your agent container image/tag in Amazon ECR, but will **not re-deploy** your AgentCore Runtime to consume the new container. You can trigger this re-deployment manually from the AWS Console for AgentCore Runtime via the "Update hosting" button. + +![](assets/agentcore-v2-deployed.png "Screenshot of Agent Runtime detail page in Bedrock AgentCore console, showing 'Update hosting' action button and a banner notice that version 2 of langgraph_cx_agent has been successfully deployed.") + + +### Evaluation The platform includes comprehensive evaluation capabilities to assess agent performance across multiple dimensions. ![alt text](assets/offline_eval.png) -### How Evaluation Works +#### How Evaluation Works The evaluation system runs test queries against your agent, collects execution traces, and measures performance: @@ -138,14 +304,14 @@ The evaluation system runs test queries against your agent, collects execution t 5. **Evaluate response quality** using Bedrock LLM to score faithfulness, correctness, and helpfulness 6. **Calculate performance metrics** and save comprehensive results to CSV files -### Evaluation Setup +#### Evaluation Setup The evaluation system consists of: - **offline_evaluation.py**: Main evaluation script that runs test queries and calculates metrics - **response_quality_evaluator.py**: Uses Bedrock LLM to evaluate response quality - **groundtruth.json**: Test queries with expected tool usage (create this file with your test cases) -### Prerequisites +#### Prerequisites 1. 
**Environment Variables**: Export Langfuse and AWS credentials: ```bash @@ -167,7 +333,7 @@ The evaluation system consists of: ] ``` -### Running Evaluation +#### Running Evaluation ```bash # Run offline evaluation @@ -177,7 +343,7 @@ python offline_evaluation.py python response_quality_evaluator.py ``` -### Metrics Collected +#### Metrics Collected - **Success Rate**: Percentage of successful agent responses - **Tool Accuracy**: How well the agent selects expected tools @@ -188,13 +354,13 @@ python response_quality_evaluator.py - **Helpfulness** (0.0-1.0): How useful and relevant the response is to answering the user's query - **Latency Metrics**: Total and per-tool response times -### Output Files +#### Output Files - **comprehensive_results.csv**: Complete evaluation results with all metrics - **trace_metrics.csv**: Raw trace data from Langfuse - **response_quality_scores.csv**: Detailed response quality evaluations -### Configuration +#### Configuration Set agent endpoint (local or AgentCore): ```bash @@ -205,6 +371,27 @@ export AGENT_ARN="http://localhost:8080" export AGENT_ARN="your-agentcore-endpoint" ``` +## Cleanup + +Once you're finished experimenting, destroy the cloud resources deployed for this solution to avoid incurring ongoing costs. + +Tear down the core solution infrastructure deployed by Terraform: + +```bash +terraform destroy +``` + +- You'll **likely** receive an error that your ECR repository can't be deleted because it still contains an image. You can manually delete the image(s) from the relevant repository in the [Amazon ECR Console](https://console.aws.amazon.com/ecr/repositories/), and then re-run `terraform destroy` again. +- You **may** see an error that your S3 bucket(s) for knowledge base (document) storage and access logs storage can't be deleted because they contain data. 
In this case, find and manually empty the relevant S3 buckets in the [Amazon S3 Console](https://console.aws.amazon.com/s3/home) before re-running `terraform destroy`. + +If you created an Amazon S3 bucket to store the Terraform backend state, then after the infrastructure is destroyed remember to also empty and delete that bucket. You can check your Terraform's backend configuration in the auto-generated `infra/backend.hcl` file. + +Once the core solution itself is deleted, review and clean up any of the external components you deployed, referring to their own documentation. Including: + +- The [GenAI Gateway Guidance](https://aws-solutions-library-samples.github.io/ai-ml/guidance-for-multi-provider-generative-ai-gateway-on-aws.html#uninstall-the-guidance) +- The [self-hosted Langfuse sample](https://github.com/aws-samples/amazon-bedrock-samples/tree/main/evaluation-observe/deploy-langfuse-on-ecs-fargate-with-typescript-cdk) - or if using Langfuse Cloud instead, review their [pricing guidance](https://langfuse.com/pricing) and consider cleaning up any unused resources. +- If you set up Tavily and/or Zendesk, refer to those providers' guidance on pricing and cleaning up or deleting your environment(s). + ## Security diff --git a/agentcore_runtime_deployment.ipynb b/agentcore_runtime_deployment.ipynb deleted file mode 100644 index 2e67379..0000000 --- a/agentcore_runtime_deployment.ipynb +++ /dev/null @@ -1,528 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# LangGraph Agent Deployment to Amazon Bedrock AgentCore\n", - "\n", - "This notebook demonstrates how to deploy a FastAPI LangGraph agent to Amazon Bedrock AgentCore Runtime and invoke it.\n", - "\n", - "It's tested to run in the Python virtual environment created in `cx-agent-backend`" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Prerequisites\n", - "\n", - "1. AWS CLI configured with appropriate permissions\n", - "2. 
Docker installed with ARM64 support\n", - "3. ECR repository created\n", - "4. Agent runtime IAM role created" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "!pip install bedrock-agentcore" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import boto3\n", - "import uuid\n", - "import os" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Configuration" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Configuration\n", - "AWS_REGION = os.environ.get(\"AWS_REGION\", os.environ.get(\"AWS_DEFAULT_REGION\", \"us-east-1\"))\n", - "AWS_ACCOUNT_ID = boto3.client('sts').get_caller_identity()['Account']\n", - "ECR_REPOSITORY = \"langgraph-cx-agent\"\n", - "IMAGE_TAG = \"latest\"\n", - "ECR_URI = f\"{AWS_ACCOUNT_ID}.dkr.ecr.{AWS_REGION}.amazonaws.com/{ECR_REPOSITORY}:{IMAGE_TAG}\"\n", - "AGENT_RUNTIME_NAME = \"langgraph_cx_agent\"\n", - "AGENT_RUNTIME_ROLE_ARN = f\"arn:aws:iam::{AWS_ACCOUNT_ID}:role/agentic-ai-bedrock-role\" # deployed in the prerequisite section\n", - "USER_POOL_ID = \"\" # TODO: Enter Cognito User Pool ID from the terraform deployment output\n", - "CLIENT_ID = \"\" # TODO: Enter Cognito App Client ID from the terraform deployment output\n", - "CLIENT_SECRET = \"\" # TODO: Enter Cognito App Client Secret from the AWS console\n", - "\n", - "print(f\"Account ID: {AWS_ACCOUNT_ID}\")\n", - "print(f\"Region: {AWS_REGION}\")\n", - "print(f\"ECR URI: {ECR_URI}\")\n", - "print(f\"User Pool ID: {USER_POOL_ID}\")\n", - "\n", - "if not (USER_POOL_ID and CLIENT_ID and CLIENT_SECRET):\n", - " raise ValueError(\n", - " \"Please set USER_POOL_ID, CLIENT_ID, and CLIENT_SECRET above\"\n", - " )\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from bedrock_agentcore.memory import 
MemoryClient\n", - "from botocore.exceptions import ClientError" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "client = MemoryClient(region_name=AWS_REGION)\n", - "memory_name = \"CxMemory\"\n", - "memory_id = None" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "try:\n", - " print(\"Creating Memory...\")\n", - " # Create the memory resource\n", - " memory = client.create_memory_and_wait(\n", - " name=memory_name, # This name is unique across all memories in this account\n", - " description=\"Customer Service Agent\", # Human-readable description\n", - " strategies=[], # No memory strategies for short-term memory\n", - " event_expiry_days=7, # Memories expire after 7 days\n", - " max_wait=300, # Maximum time to wait for memory creation (5 minutes)\n", - " poll_interval=10 # Check status every 10 seconds\n", - " )\n", - "\n", - " # Extract and print the memory ID\n", - " memory_id = memory['id']\n", - " print(f\"Memory created successfully with ID: {memory_id}\")\n", - "except ClientError as e:\n", - " if e.response['Error']['Code'] == 'ValidationException' and \"already exists\" in str(e):\n", - " # If memory already exists, retrieve its ID\n", - " memories = client.list_memories()\n", - " memory_id = next((m['id'] for m in memories if m['id'].startswith(memory_name)), None)\n", - " print(f\"Memory already exists. 
Using existing memory ID: {memory_id}\")\n", - "except Exception as e:\n", - " # Handle any errors during memory creation\n", - " print(f\"❌ ERROR: {e}\")\n", - " import traceback\n", - " traceback.print_exc()\n", - " # Cleanup on error - delete the memory if it was partially created\n", - " if memory_id:\n", - " try:\n", - " client.delete_memory_and_wait(memory_id=memory_id)\n", - " print(f\"Cleaned up memory: {memory_id}\")\n", - " except Exception as cleanup_error:\n", - " print(f\"Failed to clean up memory: {cleanup_error}\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Create ECR repository to store your container image\n", - "import boto3\n", - "\n", - "ecr_client = boto3.client('ecr', region_name=AWS_REGION)\n", - "\n", - "try:\n", - " response = ecr_client.create_repository(repositoryName=ECR_REPOSITORY)\n", - " print(f\"Repository created: {response['repository']['repositoryUri']}\")\n", - "except ecr_client.exceptions.RepositoryAlreadyExistsException:\n", - " print(\"Repository already exists\")\n", - "except Exception as e:\n", - " print(f\"Error: {e}\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Login to ECR\n", - "!aws ecr get-login-password --region {AWS_REGION} | docker login --username AWS --password-stdin {AWS_ACCOUNT_ID}.dkr.ecr.{AWS_REGION}.amazonaws.com\n", - "\n", - "# Alternatively with Finch:\n", - "#!aws ecr get-login-password --region {AWS_REGION} | finch login --username AWS --password-stdin {AWS_ACCOUNT_ID}.dkr.ecr.{AWS_REGION}.amazonaws.com" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Build and tag Docker image for ARM64\n", - "!cd cx-agent-backend && docker buildx build --platform linux/arm64 -t {ECR_URI} --push .\n", - "\n", - "# Alternatively with Finch:\n", - "#!cd cx-agent-backend && finch build --platform 
linux/arm64 -t {ECR_URI} . && finch push {ECR_URI}" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Step 2: Create Agent Runtime" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Create Bedrock AgentCore client\n", - "agentcore_client = boto3.client('bedrock-agentcore-control', region_name=AWS_REGION)\n", - "\n", - "# Create agent runtime\n", - "try:\n", - " response = agentcore_client.create_agent_runtime(\n", - " agentRuntimeName=AGENT_RUNTIME_NAME,\n", - " agentRuntimeArtifact={\n", - " 'containerConfiguration': {\n", - " 'containerUri': ECR_URI\n", - " }\n", - " },\n", - " networkConfiguration={\"networkMode\": \"PUBLIC\"},\n", - " roleArn=AGENT_RUNTIME_ROLE_ARN,\n", - " authorizerConfiguration={\n", - " 'customJWTAuthorizer': {\n", - " 'discoveryUrl': f'https://cognito-idp.{AWS_REGION}.amazonaws.com/{USER_POOL_ID}/.well-known/openid-configuration',\n", - " 'allowedClients': [\n", - " CLIENT_ID\n", - " ]\n", - " }\n", - " }\n", - " )\n", - " \n", - " agent_runtime_arn = response['agentRuntimeArn']\n", - " print(f\"Agent runtime created: {agent_runtime_arn}\")\n", - " \n", - "except Exception as e:\n", - " print(f\"Error creating agent runtime: {e}\")\n", - " # If already exists, get the ARN\n", - " try:\n", - " response = agentcore_client.get_agent_runtime(agentRuntimeName=AGENT_RUNTIME_NAME)\n", - " agent_runtime_arn = response['agentRuntimeArn']\n", - " print(f\"Using existing agent runtime: {agent_runtime_arn}\")\n", - " except Exception as e2:\n", - " print(f\"Error getting agent runtime: {e2}\")\n", - " raise" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Step 3: Retrieve Amazon Cognito Access Token for testing the agent" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import getpass\n", - "import os\n", - "\n", - "\n", - "def _set_if_undefined(var: str):\n", - " 
if not os.environ.get(var):\n", - " os.environ[var] = getpass.getpass(f\"Please provide your {var}\")\n", - "\n", - "\n", - "_set_if_undefined(\"EMAIL_USERNAME\")\n", - "_set_if_undefined(\"TEMPORARY_PASSWORD\")\n", - "_set_if_undefined(\"PERMANENT_PASSWORD\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Access values later in the notebook\n", - "username = os.environ['EMAIL_USERNAME']\n", - "temp_password = os.environ['TEMPORARY_PASSWORD']\n", - "permanent_password = os.environ['PERMANENT_PASSWORD']" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Create test user\n", - "cognito_client = boto3.client('cognito-idp', region_name=AWS_REGION)\n", - "\n", - "try:\n", - " response = cognito_client.admin_create_user(\n", - " UserPoolId=USER_POOL_ID,\n", - " Username=username,\n", - " TemporaryPassword=temp_password,\n", - " MessageAction='SUPPRESS'\n", - " )\n", - " print(f\"User created: {response['User']['Username']}\")\n", - "except Exception as e:\n", - " print(f\"Error: {e}\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Set permanent password\n", - "try:\n", - " response = cognito_client.admin_set_user_password(\n", - " UserPoolId=USER_POOL_ID,\n", - " Username=username,\n", - " Password=permanent_password,\n", - " Permanent=True\n", - " )\n", - " print(\"Password set successfully\")\n", - "except Exception as e:\n", - " print(f\"Error: {e}\")\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Calculate SECRET_HASH and store as variable\n", - "import hmac\n", - "import hashlib\n", - "import base64\n", - "import os\n", - "\n", - "def calculate_secret_hash(username, client_id, client_secret):\n", - " message = username + client_id\n", - " return base64.b64encode(\n", - " hmac.new(\n", - " 
client_secret.encode('utf-8'),\n", - " message.encode('utf-8'),\n", - " hashlib.sha256\n", - " ).digest()\n", - " ).decode('utf-8')\n", - "\n", - "\n", - "\n", - "SECRET_HASH = calculate_secret_hash(username, CLIENT_ID, CLIENT_SECRET)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Get access token\n", - "try:\n", - " response = cognito_client.initiate_auth(\n", - " ClientId=CLIENT_ID,\n", - " AuthFlow='USER_PASSWORD_AUTH',\n", - " AuthParameters={\n", - " 'USERNAME': username,\n", - " 'PASSWORD': permanent_password,\n", - " 'SECRET_HASH': SECRET_HASH\n", - " }\n", - " )\n", - " \n", - " access_token = response['AuthenticationResult']['AccessToken']\n", - " os.environ[\"COGNITO_TOKEN\"] = access_token\n", - " print(\"Access token fetched\")\n", - " \n", - "except Exception as e:\n", - " print(f\"Error: {e}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Step 4: Invoke Agent Runtime" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import requests\n", - "import urllib.parse\n", - "import json\n", - "\n", - "def invoke_agent(message, agent_arn, auth_token, session_id, region=AWS_REGION):\n", - " \"\"\"Invoke Bedrock AgentCore runtime with a message.\"\"\"\n", - " escaped_agent_arn = urllib.parse.quote(agent_arn, safe='')\n", - " url = f\"https://bedrock-agentcore.{region}.amazonaws.com/runtimes/{escaped_agent_arn}/invocations?qualifier=DEFAULT\"\n", - " \n", - " headers = {\n", - " \"Authorization\": f\"Bearer {auth_token}\",\n", - " \"Content-Type\": \"application/json\",\n", - " \"X-Amzn-Bedrock-AgentCore-Runtime-Session-Id\": session_id\n", - " }\n", - " \n", - " response = requests.post(url, headers=headers, data=json.dumps({\"input\": {\"prompt\": message, \"conversation_id\": \"12345\"}}), timeout=61)\n", - " \n", - " print(f\"Status Code: {response.status_code}\")\n", - " \n", - " if 
response.status_code == 200:\n", - " return response.json()\n", - " else:\n", - " print(f\"Error ({response.status_code}): {response.text}\")\n", - " return None" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Step 5: Test Different Scenarios" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "AUTH_TOKEN = os.environ[\"COGNITO_TOKEN\"] # (See above)\n", - "AGENT_ARN = agent_runtime_arn\n", - "SESSION_ID = str(uuid.uuid4())\n", - " \n", - "result = invoke_agent(\"Hello, can you help me resetting my router?\", AGENT_ARN, AUTH_TOKEN, SESSION_ID)\n", - "if result:\n", - " print(json.dumps(result, indent=2))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Test math functionality\n", - "math_test = \"What is 25 + 17?\"\n", - "print(f\"Testing math with: {math_test}\")\n", - "result = invoke_agent(math_test, AGENT_ARN, AUTH_TOKEN, SESSION_ID)\n", - "if result:\n", - " print(json.dumps(result, indent=2))\n", - "print(\"\\n\" + \"=\"*50 + \"\\n\")\n", - "\n", - "\n", - "result = invoke_agent(\"Add 10 to the result\", AGENT_ARN, AUTH_TOKEN, SESSION_ID)\n", - "if result:\n", - " print(json.dumps(result, indent=2))\n", - "print(\"\\n\" + \"=\"*50 + \"\\n\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Testing Memory Persistence" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = invoke_agent(\"What device did I want to reset ?\", AGENT_ARN, AUTH_TOKEN, SESSION_ID)\n", - "if result:\n", - " print(json.dumps(result, indent=2))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Step 6: Cleanup (Optional)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Uncomment to delete the agent runtime\n", - "# AGENT_RUNTIME_NAME 
= \"\"\n", - "# try:\n", - "# agentcore_client.delete_agent_runtime(agentRuntimeId=AGENT_RUNTIME_NAME)\n", - "# print(f\"Agent runtime {AGENT_RUNTIME_NAME} deleted\")\n", - "# except Exception as e:\n", - "# print(f\"Error deleting agent runtime: {e}\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Uncomment to delete memory\n", - "# client.delete_memory_and_wait(memory_id = \"memory_id\", max_wait = 300, poll_interval =10)" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": ".venv", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.12.8" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/chat_to_agentcore.ipynb b/chat_to_agentcore.ipynb new file mode 100644 index 0000000..5d29d79 --- /dev/null +++ b/chat_to_agentcore.ipynb @@ -0,0 +1,361 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "71db7e7e", + "metadata": {}, + "source": [ + "# Test your agent from a Python Notebook\n", + "\n", + "This interactive notebook provides an alternative way to test your deployed agent, besides the Streamlit UI app." + ] + }, + { + "cell_type": "markdown", + "id": "cbced266", + "metadata": {}, + "source": [ + "## Kernel selection and prerequisites\n", + "\n", + "You can use the `cx-agent-backend/.venv` as a kernel. If this is not set up already, run the following from your terminal:\n", + "\n", + "```bash\n", + "cd cx-agent-backend\n", + "uv venv\n", + "uv sync --all-extras --frozen\n", + "```\n", + "\n", + "This notebook assumes:\n", + "1. You've already deployed the main solution as described in [README.md](./README.md)\n", + "2. 
Your Python kernel is already configured with AWS credentials and target AWS Region (for example via environment variables, potentially set via a `.env` file as documented [here for VSCode](https://code.visualstudio.com/docs/python/environments#_environment-variables)).\n", + " - Note that the AWS SDK for Python, `boto3`, [expects](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html#using-environment-variables) an `AWS_DEFAULT_REGION` environment variable rather than `AWS_REGION`." + ] + }, + { + "cell_type": "markdown", + "id": "b51a5349", + "metadata": {}, + "source": [ + "## Dependencies and setup\n", + "\n", + "First we'll import the necessary libraries, and initialize clients for AWS Services, and define some utility functions that'll be used later:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "93b97307", + "metadata": {}, + "outputs": [], + "source": [ + "# Python Built-Ins:\n", + "import base64\n", + "import json\n", + "import getpass\n", + "import hashlib\n", + "import hmac\n", + "import os\n", + "import uuid\n", + "import secrets\n", + "import string\n", + "import urllib.parse\n", + "\n", + "# External Libraries:\n", + "import boto3 # AWS SDK for Python\n", + "import requests # For making raw HTTP(S) API calls\n", + "\n", + "# AWS Service Clients:\n", + "botosess = boto3.Session() # You could set `region_name` here explicitly if wanted\n", + "cognito_client = botosess.client(\"cognito-idp\") # Cognito (Identity Provider)\n", + "\n", + "\n", + "def _set_if_undefined(var: str, name: str | None = None) -> str:\n", + " \"\"\"Utility to prompt user once for a value, and cache it in environment variable\"\"\"\n", + " if not os.environ.get(var):\n", + " os.environ[var] = getpass.getpass(f\"Please provide your {name or var}:\")\n", + " return os.environ[var]\n", + "\n", + "\n", + "def calculate_secret_hash(username, client_id, client_secret):\n", + " \"\"\"Utility to hash a username + client ID + client 
secret for Cognito login\"\"\"\n", + " message = username + client_id\n", + " return base64.b64encode(\n", + " hmac.new(\n", + " client_secret.encode(\"utf-8\"),\n", + " message.encode(\"utf-8\"),\n", + " hashlib.sha256\n", + " ).digest()\n", + " ).decode(\"utf-8\")\n", + "\n", + "\n", + "def invoke_agent(\n", + " message,\n", + " agent_arn,\n", + " auth_token,\n", + " session_id,\n", + " qualifier=\"DEFAULT\",\n", + " region=botosess.region_name,\n", + "):\n", + " \"\"\"Invoke Bedrock AgentCore runtime with a message.\"\"\"\n", + " escaped_agent_arn = urllib.parse.quote(agent_arn, safe='')\n", + " response = requests.post(\n", + " f\"https://bedrock-agentcore.{region}.amazonaws.com/runtimes/{escaped_agent_arn}/invocations?qualifier={qualifier}\",\n", + " headers={\n", + " \"Authorization\": f\"Bearer {auth_token}\",\n", + " \"Content-Type\": \"application/json\",\n", + " \"X-Amzn-Bedrock-AgentCore-Runtime-Session-Id\": session_id\n", + " },\n", + " data=json.dumps({\"input\": {\"prompt\": message, \"conversation_id\": session_id}}),\n", + " timeout=61,\n", + " )\n", + " \n", + " print(f\"Status Code: {response.status_code}\")\n", + " \n", + " if response.status_code == 200:\n", + " return response.json()\n", + " else:\n", + " raise ValueError(f\"HTTP {response.status_code}: {response.text}\")" + ] + }, + { + "cell_type": "markdown", + "id": "b1f027bf", + "metadata": {}, + "source": [ + "## Fetch access token from Amazon Cognito\n", + "\n", + "To talk to the AgentCore-deployed agent, we'll need to log in to Amazon Cognito to fetch a session token.\n", + "\n", + "You'll need to fetch your Cognito user_pool_id and client_id in the cell below, which you can view by running the `terraform output` command in your terminal:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "09e8aa2d", + "metadata": {}, + "outputs": [], + "source": [ + "user_pool_id = TODO # E.g. run `terraform output -raw user_pool_id`\n", + "client_id = TODO # E.g. 
run `terraform output -raw client_id`\n", + "\n", + "# From these we should be able to look up the client secret automatically:\n", + "client_secret = cognito_client.describe_user_pool_client(\n", + " UserPoolId=user_pool_id,\n", + " ClientId=client_id\n", + ")[\"UserPoolClient\"][\"ClientSecret\"]" + ] + }, + { + "cell_type": "markdown", + "id": "66238615", + "metadata": {}, + "source": [ + "The next cell will prompt you for your Cognito username (email address) and password, or re-use the existing one if you run the cell again without restarting the notebook:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "00938dba", + "metadata": {}, + "outputs": [], + "source": [ + "username = _set_if_undefined(\"COGNITO_USERNAME\", \"Cognito user name (email address)\")\n", + "password = _set_if_undefined(\"COGNITO_PASSWORD\", \"Cognito password\")" + ] + }, + { + "cell_type": "markdown", + "id": "e00e6221", + "metadata": {}, + "source": [ + "The deployment steps in [README.md](./README.md) guide you through setting up your Cognito user from the AWS CLI, but you could instead un-comment and run the below to achieve the same effect from Python:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fde3f16b", + "metadata": {}, + "outputs": [], + "source": [ + "## Create a user (with temporary password)\n", + "# create_user_resp = cognito_client.admin_create_user(\n", + "# UserPoolId=user_pool_id,\n", + "# Username=username,\n", + "# # Temp password is randomized here because we'll never use it:\n", + "# TemporaryPassword=\"\".join((\n", + "# secrets.choice(\n", + "# string.ascii_uppercase + string.ascii_lowercase + string.digits +\n", + "# \"^$*.[]{}()?-'\\\"!@#%&/\\\\,><':;|_~`+=\"\n", + "# ) for i in range(20)\n", + "# )),\n", + "# MessageAction=\"SUPPRESS\"\n", + "# )\n", + "# print(f\"User created: {create_user_resp['User']['Username']}\")\n", + "\n", + "## Override the password to the given value (permanently)\n", + "# 
set_password_resp = cognito_client.admin_set_user_password(\n", + "# UserPoolId=user_pool_id,\n", + "# Username=username,\n", + "# Password=password,\n", + "# Permanent=True,\n", + "# )\n", + "# print(\"Password set successfully\")" + ] + }, + { + "cell_type": "markdown", + "id": "ac6d3aff", + "metadata": {}, + "source": [ + "With the configuration set up, we're ready to request an access token from Cognito:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "33858093", + "metadata": {}, + "outputs": [], + "source": [ + "auth_resp = cognito_client.initiate_auth(\n", + " ClientId=client_id,\n", + " AuthFlow=\"USER_PASSWORD_AUTH\",\n", + " AuthParameters={\n", + " \"USERNAME\": username,\n", + " \"PASSWORD\": password,\n", + " \"SECRET_HASH\": calculate_secret_hash(username, client_id, client_secret),\n", + " }\n", + ")\n", + "\n", + "access_token = auth_resp[\"AuthenticationResult\"][\"AccessToken\"]\n", + "print(\"Access token fetched\")" + ] + }, + { + "cell_type": "markdown", + "id": "05e0c2e6", + "metadata": {}, + "source": [ + "## Invoke the agent\n", + "\n", + "With the access token ready, we're almost ready to invoke our AgentCore Agent. First though, you'll need to:\n", + "1. Look up the deployed AgentRuntime ARN from the terraform, and\n", + "2. Choose a session ID (we'll randomize this automatically)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6754ec78", + "metadata": {}, + "outputs": [], + "source": [ + "agent_arn = TODO # E.g. 
run `terraform output -raw agent_runtime_arn`\n", + "\n", + "session_id = str(uuid.uuid4()) # Can auto-generate this" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9cf235a1", + "metadata": {}, + "outputs": [], + "source": [ + "result = invoke_agent(\"Hello, can you help me reset my router?\", agent_arn, access_token, session_id)\n", + "if result:\n", + " print(json.dumps(result, indent=2))" + ] + }, + { + "cell_type": "markdown", + "id": "583c6b54", + "metadata": {}, + "source": [ + "### Testing math functionality" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "60914206", + "metadata": {}, + "outputs": [], + "source": [ + "math_test = \"What is 25 + 17?\"\n", + "print(f\"Testing math with: {math_test}\")\n", + "result = invoke_agent(math_test, agent_arn, access_token, session_id)\n", + "if result:\n", + " print(json.dumps(result, indent=2))\n", + "print(\"\\n\" + \"=\"*50 + \"\\n\")" + ] + }, + { + "cell_type": "markdown", + "id": "ceac8013", + "metadata": {}, + "source": [ + "### Testing memory persistence" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "68de9444", + "metadata": {}, + "outputs": [], + "source": [ + "message = \"Add 10 to the result\"\n", + "print(message)\n", + "result = invoke_agent(message, agent_arn, access_token, session_id)\n", + "if result:\n", + " print(json.dumps(result, indent=2))\n", + "print(\"\\n\" + \"=\"*50 + \"\\n\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5caf85c9", + "metadata": {}, + "outputs": [], + "source": [ + "result = invoke_agent(\"What device did I want to reset?\", agent_arn, access_token, session_id)\n", + "if result:\n", + " print(json.dumps(result, indent=2))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "646cef68", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": ".venv", + "language": "python", + "name": "python3" + },
+ "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.8" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/infra/README.md b/infra/README.md deleted file mode 100644 index 4312238..0000000 --- a/infra/README.md +++ /dev/null @@ -1,75 +0,0 @@ - -***NOTE: Deploying the LLM Gateway and Langfuse platform would be needed to incorporate centralized model management and observability.*** - - -Before we get started, we would need to deploy the following: -- A Generative AI Gateway to be able to invoke multi-provider models -- An observability platform with Langfuse to collect and analyze detailed telemetry from the agent as it runs -- Optionally retrieve web search and support ticket API keys -- A knowledge base with the help of [Amazon Bedrock Knowledge Base](https://aws.amazon.com/bedrock/knowledge-bases/) -- A guardrail with the help of [Amazon Bedrock Guardrails]([https://aws.amazon.com/bedrock/guardrails/](url)) -- Set-up cognito authentication -- Create and store keys and secrets - - -### Multi Provider Generative AI Gateway Deployment - -Deploy a multi-provider gateway on AWS by referring to this [guidance](https://aws.amazon.com/solutions/guidance/multi-provider-generative-ai-gateway-on-aws/). - -### Observability Platform with Langfuse - -If you would like to self-host your own Langfuse platform, refer to this [guidance](https://github.com/awslabs/amazon-bedrock-agent-samples/tree/main/examples/agent_observability/deploy-langfuse-on-ecs-fargate-with-typescript-cdk). You could also use the [cloud](https://cloud.langfuse.com/auth/sign-up) version instead. 
- -### Infrastructure Deployment - -Deploy all infrastructure components using the unified Terraform stack: - -```bash -# Navigate to infrastructure directory -cd infra - -# Copy and customize the variables file -cp terraform.tfvars.example terraform.tfvars -# Edit terraform.tfvars and replace placeholder values with your actual values - -# Initialize and deploy all components -terraform init -terraform plan -terraform apply -``` - -This will deploy: -- Bedrock AgentCore IAM Role with required permissions -- Knowledge Base stack (S3 bucket, OpenSearch Serverless, Knowledge Base) -- Bedrock Guardrails for content filtering -- Cognito User Pool for authentication -- SSM Parameters for configuration -- Secrets Manager secrets for API keys - -### Configuration - -After deployment, the following outputs will be available: -- `bedrock_role_arn`: IAM role ARN for Bedrock agents -- `knowledge_base_id`: Knowledge base ID for document retrieval -- `data_source_id`: Data source ID for knowledge base -- `guardrail_id`: Guardrail ID for content filtering -- `user_pool_id`: Cognito user pool ID -- `client_id`: Cognito client ID -- `client_secret`: Cognito client secret (sensitive) - -All configuration values are automatically stored in AWS Systems Manager Parameter Store and Secrets Manager. 
- - -**Upload your documents to the S3 bucket**: -```bash -# Use bucket name from Terraform output -aws s3 cp your-documents/ s3://$(terraform output -raw s3_bucket_name)/ --recursive -``` - -**Trigger knowledge base ingestion**: -```bash -# Use IDs from Terraform outputs -aws bedrock-agent start-ingestion-job \ - --knowledge-base-id $(terraform output -raw knowledge_base_id) \ - --data-source-id $(terraform output -raw data_source_id) -``` diff --git a/infra/main.tf b/infra/main.tf index 5e3f2d9..726a88a 100644 --- a/infra/main.tf +++ b/infra/main.tf @@ -1,9 +1,26 @@ +# Agent Container Image +module "container_image" { + source = "./modules/container-image" + + force_image_rebuild = var.force_image_rebuild + image_build_tool = var.container_image_build_tool + repository_name = "langgraph-cx-agent" +} + +# Agent Memory +resource "aws_bedrockagentcore_memory" "agent_memory" { + name = "CxMemory" + event_expiry_duration = 30 +} + # Bedrock Agent Role module "bedrock_role" { - source = "./modules/agentcore-iam-role" - role_name = var.bedrock_role_name - knowledge_base_id = module.kb_stack.knowledge_base_id - guardrail_id = module.guardrail.guardrail_id + source = "./modules/agentcore-iam-role" + agent_memory_arn = aws_bedrockagentcore_memory.agent_memory.arn + container_repository_arn = module.container_image.ecr_repository_arn + role_name = var.bedrock_role_name + knowledge_base_id = module.kb_stack.knowledge_base_id + guardrail_id = module.guardrail.guardrail_id } # Knowledge Base Stack @@ -35,7 +52,7 @@ module "parameters" { guardrail_id = module.guardrail.guardrail_id user_pool_id = module.cognito.user_pool_id client_id = module.cognito.user_pool_client_id - ac_stm_memory_id = var.ac_stm_memory_id + ac_stm_memory_id = aws_bedrockagentcore_memory.agent_memory.id depends_on = [ module.kb_stack, @@ -64,3 +81,26 @@ module "secrets" { depends_on = [module.cognito] } +# Deploy the endpoint +resource "aws_bedrockagentcore_agent_runtime" "agent_runtime" { + 
agent_runtime_name = "langgraph_cx_agent" + description = "Example customer service agent for Agentic AI Foundation" + role_arn = module.bedrock_role.role_arn + authorizer_configuration { + custom_jwt_authorizer { + discovery_url = module.cognito.user_pool_discovery_url + allowed_clients = [module.cognito.user_pool_client_id] + } + } + agent_runtime_artifact { + container_configuration { + container_uri = module.container_image.ecr_image_uri + } + } + network_configuration { + network_mode = "PUBLIC" + } + protocol_configuration { + server_protocol = "HTTP" + } +} diff --git a/infra/modules/agentcore-iam-role/bedrock-agentcore-policy.tf b/infra/modules/agentcore-iam-role/bedrock-agentcore-policy.tf index 05b515d..4d63665 100644 --- a/infra/modules/agentcore-iam-role/bedrock-agentcore-policy.tf +++ b/infra/modules/agentcore-iam-role/bedrock-agentcore-policy.tf @@ -34,9 +34,13 @@ resource "aws_iam_policy" "ecr_permissions" { "ecr:BatchGetImage", "ecr:GetDownloadUrlForLayer" ] - Resource = [ - "arn:aws:ecr:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:repository/*" - ] + Resource = ( + var.container_repository_arn == "" ? 
+ [ + "arn:aws:ecr:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:repository/*" + ] : + [var.container_repository_arn] + ) }, { Sid = "ECRTokenAccess" @@ -144,6 +148,35 @@ resource "aws_iam_policy" "agentcore_permissions" { "arn:aws:bedrock-agentcore:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:workload-identity-directory/default", "arn:aws:bedrock-agentcore:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:workload-identity-directory/default/workload-identity/*" ] + }, + { + Sid = "AccessMemory" + Effect = "Allow" + Action = [ + "bedrock-agentcore:BatchCreateMemoryRecords", + "bedrock-agentcore:BatchDeleteMemoryRecords", + "bedrock-agentcore:BatchUpdateMemoryRecords", + "bedrock-agentcore:CreateEvent", + "bedrock-agentcore:DeleteEvent", + "bedrock-agentcore:DeleteMemoryRecord", + "bedrock-agentcore:GetEvent", + "bedrock-agentcore:GetMemory", + "bedrock-agentcore:GetMemoryRecord", + "bedrock-agentcore:ListActors", + "bedrock-agentcore:ListEvents", + "bedrock-agentcore:ListMemoryRecords", + "bedrock-agentcore:ListSessions", + "bedrock-agentcore:ListTagsForResource", + "bedrock-agentcore:RetrieveMemoryRecords", + "bedrock-agentcore:TagResource", + ] + Resource = ( + var.agent_memory_arn == "" ? 
+ [ + "arn:aws:bedrock-agentcore:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:memory/*" + ] : + [var.agent_memory_arn] + ) } ] }) diff --git a/infra/modules/agentcore-iam-role/variables.tf b/infra/modules/agentcore-iam-role/variables.tf index 02051d6..40e7501 100644 --- a/infra/modules/agentcore-iam-role/variables.tf +++ b/infra/modules/agentcore-iam-role/variables.tf @@ -3,6 +3,12 @@ variable "role_name" { type = string } +variable "container_repository_arn" { + description = "ARN of specific Amazon ECR repository to grant access (default: all)" + default = "" + type = string +} + variable "knowledge_base_id" { description = "Knowledge Base ID to restrict access to" type = string @@ -13,4 +19,10 @@ variable "guardrail_id" { description = "Guardrail ID to restrict access to" type = string default = "*" +} + +variable "agent_memory_arn" { + description = "ARN of specific AgentCore Memory to grant access (default: all)" + default = "" + type = string } \ No newline at end of file diff --git a/infra/modules/cognito/main.tf b/infra/modules/cognito/main.tf index a577ade..789116a 100644 --- a/infra/modules/cognito/main.tf +++ b/infra/modules/cognito/main.tf @@ -1,3 +1,5 @@ +data "aws_region" "current" {} + resource "aws_cognito_user_pool" "user_pool" { name = var.user_pool_name diff --git a/infra/modules/cognito/outputs.tf b/infra/modules/cognito/outputs.tf index 3c1205b..c563a7f 100644 --- a/infra/modules/cognito/outputs.tf +++ b/infra/modules/cognito/outputs.tf @@ -8,6 +8,10 @@ output "user_pool_arn" { value = aws_cognito_user_pool.user_pool.arn } +output "user_pool_discovery_url" { + value = "https://cognito-idp.${data.aws_region.current.name}.amazonaws.com/${aws_cognito_user_pool.user_pool.id}/.well-known/openid-configuration" +} + output "user_pool_client_id" { description = "ID of the Cognito User Pool Client" value = aws_cognito_user_pool_client.user_pool_client.id diff --git a/infra/modules/container-image/main.tf 
b/infra/modules/container-image/main.tf new file mode 100644 index 0000000..a552e98 --- /dev/null +++ b/infra/modules/container-image/main.tf @@ -0,0 +1,40 @@ +data "aws_caller_identity" "current" {} +data "aws_region" "current" {} + +locals { + image_src_path = "${path.root}/${var.relative_image_src_path}" + image_src_hash = sha512( + join( + "", + # TODO: Find a way to exclude .venv, dist, and potentially other subfolders: + [for f in fileset(".", "${local.image_src_path}/**") : filesha512(f)] + ) + ) + + image_build_extra_args = "--platform linux/arm64" + image_build_push_cmd = <<-EOT + aws ecr get-login-password | ${var.image_build_tool} login --username AWS \ + --password-stdin ${aws_ecr_repository.ecr_repository.repository_url} && + ${var.image_build_tool} build ${local.image_build_extra_args} \ + -t ${aws_ecr_repository.ecr_repository.repository_url}:${var.image_tag} \ + ${local.image_src_path} && + ${var.image_build_tool} push ${aws_ecr_repository.ecr_repository.repository_url}:${var.image_tag} + EOT +} + +resource "aws_ecr_repository" "ecr_repository" { + name = var.repository_name +} + +resource "terraform_data" "ecr_image" { + triggers_replace = [ + aws_ecr_repository.ecr_repository.id, + var.force_image_rebuild == true ? 
timestamp() : local.image_src_hash + ] + + input = "${aws_ecr_repository.ecr_repository.repository_url}:${var.image_tag}" + + provisioner "local-exec" { + command = local.image_build_push_cmd + } +} diff --git a/infra/modules/container-image/outputs.tf b/infra/modules/container-image/outputs.tf new file mode 100644 index 0000000..cc39cc7 --- /dev/null +++ b/infra/modules/container-image/outputs.tf @@ -0,0 +1,14 @@ +output "ecr_repository_arn" { + description = "ARN of the Amazon ECR repository for the agent container image" + value = aws_ecr_repository.ecr_repository.arn +} + +output "ecr_repository_uri" { + description = "URI of the Amazon ECR repository for the agent container image" + value = aws_ecr_repository.ecr_repository.repository_url +} + +output "ecr_image_uri" { + description = "URI (including tag) of the pushed agent container image in Amazon ECR" + value = terraform_data.ecr_image.output +} diff --git a/infra/modules/container-image/variables.tf b/infra/modules/container-image/variables.tf new file mode 100644 index 0000000..bccbaa7 --- /dev/null +++ b/infra/modules/container-image/variables.tf @@ -0,0 +1,28 @@ +variable "force_image_rebuild" { + description = "Set true to force rebuild & push of image to ECR even if source appears unchanged" + default = false + type = bool +} + +variable "image_build_tool" { + description = "Either 'docker' or a Docker-compatible alternative e.g. 
'finch'" + default = "docker" + type = string +} + +variable "relative_image_src_path" { + description = "Path to container image source folder, relative to Terraform root" + default = "../cx-agent-backend" + type = string +} + +variable "image_tag" { + description = "Tag to apply to the pushed container image in Amazon ECR" + default = "latest" + type = string +} + +variable "repository_name" { + description = "Name of the Amazon ECR repository to create and deploy the image to" + type = string +} diff --git a/infra/modules/opensearch-serverless/versions.tf b/infra/modules/opensearch-serverless/versions.tf index efa94bf..1b5c2e0 100644 --- a/infra/modules/opensearch-serverless/versions.tf +++ b/infra/modules/opensearch-serverless/versions.tf @@ -6,7 +6,7 @@ terraform { } aws = { source = "hashicorp/aws" - version = "~> 5.0" + version = ">= 5.0" } awscc = { source = "hashicorp/awscc" diff --git a/infra/outputs.tf b/infra/outputs.tf index 10b27bb..ce38c9b 100644 --- a/infra/outputs.tf +++ b/infra/outputs.tf @@ -1,3 +1,8 @@ +output "agent_runtime_arn" { + description = "ARN of deployed AgentCore Runtime" + value = aws_bedrockagentcore_agent_runtime.agent_runtime.agent_runtime_arn +} + output "bedrock_role_arn" { description = "ARN of the Bedrock agent role" value = module.bedrock_role.role_arn diff --git a/infra/terraform.tf b/infra/terraform.tf new file mode 100644 index 0000000..61510a1 --- /dev/null +++ b/infra/terraform.tf @@ -0,0 +1,15 @@ +terraform { + backend "s3" { + encrypt = true + key = "sample-agentic-ai-foundation.tfstate" + # TODO: Can we enable use_lockfile = true ? 
+ } + + required_providers { + aws = { + source = "hashicorp/aws" + # v6.18 added support for Bedrock AgentCore Memory + version = ">= 6.18" + } + } +} diff --git a/infra/terraform.tfvars.example b/infra/terraform.tfvars.example index 3bb246e..441a8fb 100644 --- a/infra/terraform.tfvars.example +++ b/infra/terraform.tfvars.example @@ -1,29 +1,37 @@ -# Bedrock Role Variables -bedrock_role_name = "agentic-ai-bedrock-role" +###### CORE solution configurations +#### This section includes configurations for the Agentic AI Foundation itself -# Cognito Variables -user_pool_name = "agentic-ai-user-pool" +## Container build +## Uncomment the below line if you use 'finch' instead of Docker: +# container_image_build_tool = "finch" +## Naming AWS resources to be deployed +# IAM execution role used by the agent +bedrock_role_name = "agentic-ai-bedrock-role" +# Cognito User Pool +user_pool_name = "agentic-ai-user-pool" # Knowledge Base Stack Variables kb_stack_name = "agentic-ai-kb" kb_bucket_name = "agentic-ai-kb-bucket" -# Langfuse Variables -langfuse_host = "https://cloud.langfuse.com" -langfuse_public_key = "your-langfuse-public-key" -langfuse_secret_key = "your-langfuse-secret-key" +###### EXTERNAL component configurations +#### This section configures separately-deployed components of the platform -# LLM Gateway Variables +## GenAI Model Gateway gateway_url = "https://your-gateway-url.example.com" gateway_api_key = "your-gateway-api-key" +## (Optional) Langfuse - tracing and observability +langfuse_host = "https://cloud.langfuse.com" +langfuse_public_key = "your-langfuse-public-key" +langfuse_secret_key = "your-langfuse-secret-key" + +## (Optional) Tavily - web search for agents # Tavily (Optional, for agent web search tool) # tavily_api_key = "your-tavily-api-key" +## (Optional) Zendesk - ticketing # Zendesk (Optional, for agent ticketing tool) # zendesk_domain = "your-subdomain" # zendesk_email = "your-email@example.com" # zendesk_api_token = "your-zendesk-api-token" - 
-# Memory Variables -# stm_memory_id = "your-stm-memory-id" \ No newline at end of file diff --git a/infra/variables.tf b/infra/variables.tf index 3b1286d..5b1235f 100644 --- a/infra/variables.tf +++ b/infra/variables.tf @@ -1,3 +1,16 @@ +# Container Image Variables +variable "force_image_rebuild" { + description = "Set true to force rebuild+push of container image even if source seems unchanged" + default = false + type = bool +} + +variable "container_image_build_tool" { + description = "Either 'docker' or a Docker-compatible alternative e.g. 'finch'" + default = "docker" + type = string +} + # Bedrock Role Variables variable "bedrock_role_name" { description = "Name of the Bedrock agent role" @@ -82,8 +95,3 @@ variable "tavily_api_key" { type = string sensitive = true } - -variable "ac_stm_memory_id" { - description = "ID of the AC STM resource" - type = string -} \ No newline at end of file
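Reviewer note: the two pieces of logic that both the deleted and the new notebook share — the Cognito `SECRET_HASH` (HMAC-SHA256 of username + client ID, keyed by the app client secret, base64-encoded) and the AgentCore Runtime invocation URL (the runtime ARN must be fully URL-encoded into the path) — are pure Python and can be sanity-checked without any AWS resources. A minimal standalone sketch; the ARN, region, and credential strings below are illustrative placeholders only:

```python
import base64
import hashlib
import hmac
import urllib.parse


def calculate_secret_hash(username: str, client_id: str, client_secret: str) -> str:
    """HMAC-SHA256 of (username + client_id), keyed by the app client secret."""
    message = (username + client_id).encode("utf-8")
    digest = hmac.new(client_secret.encode("utf-8"), message, hashlib.sha256).digest()
    return base64.b64encode(digest).decode("utf-8")


def invocation_url(agent_arn: str, region: str, qualifier: str = "DEFAULT") -> str:
    """Build the AgentCore Runtime '/invocations' endpoint for a runtime ARN.

    safe="" forces ':' and '/' in the ARN to be percent-encoded into the path.
    """
    escaped = urllib.parse.quote(agent_arn, safe="")
    return (
        f"https://bedrock-agentcore.{region}.amazonaws.com"
        f"/runtimes/{escaped}/invocations?qualifier={qualifier}"
    )


if __name__ == "__main__":
    # Placeholder values, not real resources:
    arn = "arn:aws:bedrock-agentcore:us-east-1:123456789012:runtime/example"
    print(calculate_secret_hash("user@example.com", "client-id", "client-secret"))
    print(invocation_url(arn, "us-east-1"))
```

Since the hash is deterministic for a given (username, client ID, secret) triple, it can also be precomputed once per user rather than per request.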