Dynamic Few-Shot Prompting with RAG and LLM using Amazon S3 Vectors #149

base: main
Conversation
523ca3f to 5393dd5 (Compare)
rstrahan left a comment:
Hi Daniel - Thanks so much. I added a bunch of comments - please respond to each one, and we can meet/chat if it helps. Tx.
> ### Step 3: Populate the Examples Dataset
>
> Use the [fewshot_dataset_import.ipynb](notebooks/fewshot_dataset_import.ipynb) notebook to import a dataset into S3 Vectors, or manually upload your example documents and metadata to the S3 bucket and vector index created by the stack.
Looks like this is fcc_invoices_dataset_import.ipynb now
Should you add more guidance for users on how to prepare and load their own dataset?
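For illustration, a minimal sketch of what importing a custom example into the S3 Vectors index could look like, assuming the boto3 `s3vectors` put_vectors call and the Titan Multimodal Embeddings G1 request shape; the bucket/index names, key, metadata layout, and helper name are placeholders, not the sample's actual schema.

```python
# Hypothetical sketch: embed one labeled example page and store it in the S3 Vectors index.
# Model ID, bucket/index names, key, and metadata layout are illustrative placeholders.
import base64
import json

import boto3

bedrock = boto3.client("bedrock-runtime")
s3vectors = boto3.client("s3vectors")

def import_example(image_path, expected_attributes, vector_bucket, index_name, key):
    # Embed the example page image (Titan Multimodal Embeddings G1 accepts a base64 image)
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-image-v1",
        body=json.dumps({"inputImage": image_b64}),
    )
    embedding = json.loads(response["body"].read())["embedding"]

    # Store the vector; the ground-truth attributes travel along as metadata
    s3vectors.put_vectors(
        vectorBucketName=vector_bucket,
        indexName=index_name,
        vectors=[{
            "key": key,
            "data": {"float32": embedding},
            "metadata": {"expected_attributes": json.dumps(expected_attributes)},
        }],
    )
```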
> ## Quick Start
>
> ### Step 1: Deploy the Dynamic-few shot Stack
State prerequisites, since the user must clone the repo and have the required tools (SAM CLI, AWS CLI) installed.
> ### Step 4: Configure IDP to Use Dynamic Few-Shot
>
> Add the Lambda ARN to your IDP extraction configuration:
Show how to configure the Lambda ARN in the UI (screenshot)
> custom_prompt_lambda_arn: "arn:aws:lambda:region:account:function:GENAIIDP-dynamic-few-shot"
>
> **Important**: Your extraction task prompt must include the `{FEW_SHOT_EXAMPLES}` placeholder where you want the dynamic examples to be inserted.
I think we need to be more prescriptive here. How do we make this easy / foolproof?
E.g., should all our extraction prompt examples contain the {FEW_SHOT_EXAMPLES} placeholder? Does it do no harm if there are no examples?
In the Lambda, I currently have:

`logger.warn("Missing {FEW_SHOT_EXAMPLES} placeholder in prompt template")`

What I could do is replace this with:

`raise ValueError("Missing {FEW_SHOT_EXAMPLES} placeholder in prompt template")`

This would stop the extraction process, and the error would propagate to the UI if {FEW_SHOT_EXAMPLES} is missing.

As for adding {FEW_SHOT_EXAMPLES} to all templates - yes, I think that could be done. It should not do any harm: if there are no examples available, the placeholder will just be empty.
No, I don't think we should enforce that the placeholder be there. I'm just suggesting that we should add it to the default extraction task_prompt in our configs.
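For reference, a minimal sketch of the placeholder handling being discussed in this thread; the helper name and the `strict` flag are illustrative, not the actual Lambda code.

```python
# Hypothetical sketch of the {FEW_SHOT_EXAMPLES} substitution discussed above.
import logging

logger = logging.getLogger(__name__)

PLACEHOLDER = "{FEW_SHOT_EXAMPLES}"

def build_prompt(task_prompt_template, examples_text, strict=False):
    """Insert retrieved examples into the extraction task prompt.

    If no examples were retrieved, the placeholder resolves to an empty
    string, so templates that always carry the placeholder are harmless.
    """
    if PLACEHOLDER not in task_prompt_template:
        if strict:
            # Option raised in the thread: fail fast so the error surfaces in the UI
            raise ValueError("Missing {FEW_SHOT_EXAMPLES} placeholder in prompt template")
        logger.warning("Missing {FEW_SHOT_EXAMPLES} placeholder in prompt template")
        return task_prompt_template
    return task_prompt_template.replace(PLACEHOLDER, examples_text or "")
```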
> reason: "Demo function - KMS CMK not required, but can be added by customer for production use cases"
> # checkov:skip=CKV_AWS_158: "Demo function - KMS CMK not required, but can be added by customer for production use cases"
> Properties:
>   VectorBucketName: !Ref VectorBucketName
See my comment above about the bucket name: we should use a customer-provided existing bucket (if provided) OR create a new bucket with a dynamically generated name.
When running `sam deploy --guided`, the customer can choose a name. This is then stored in `samconfig.toml` as a `parameter_overrides` key/value pair. The above name is only a suggestion presented to the user.
Right, but your template will always create a new bucket, and it will fail if the bucket name is already in use.
I'm suggesting that you allow the user to provide an existing bucket (with the index already in it) so that the few-shot example vector store can be shared by multiple IDP stacks, as a convenience. OR they can leave the bucket name blank to have the stack create one.
We should not require the user to enter a bucket name for a new bucket, IMHO; better to auto-generate the bucket name.
Understood, this is implemented now - the user can leave VectorBucketName, VectorIndexName and DatasetBucketName blank and the stack will auto-create them. If provided, the stack will re-use existing resources.
> extraction:
>   dynamic_few_shot_lambda_arn: "${DynamicFewShotFunction.Arn}"
> MonitoringLink:
Nice touch. A CloudWatch dashboard could be even better, to track the number of inferences, successful matches, throttles/retries, etc.
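As one possible way to feed such a dashboard, a minimal sketch of publishing custom CloudWatch metrics from the Lambda; the namespace and metric names are assumptions, not part of the PR.

```python
# Hypothetical sketch: publish per-invocation counters a dashboard could chart.
import boto3

cloudwatch = boto3.client("cloudwatch")

def publish_metrics(matches_found, retries):
    cloudwatch.put_metric_data(
        Namespace="GENAIIDP/DynamicFewShot",  # illustrative namespace
        MetricData=[
            {"MetricName": "Invocations", "Value": 1, "Unit": "Count"},
            {"MetricName": "SuccessfulMatches", "Value": matches_found, "Unit": "Count"},
            {"MetricName": "Retries", "Value": retries, "Unit": "Count"},
        ],
    )
```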
> def _s3vectors_find_similar_items_from_image(page_image):
>     """Search for similar items using image query"""
>     embedding = bedrock_client.generate_embedding(
This will incur token usage and costs. Can you minimally capture and log token use?
Even better, can you create a metering data structure and return it in the Lambda response, and enhance the extraction function to merge this metering into the overall metering? It could be a new 'context' (e.g. 'FewShotExamplesPlugin') - that way it will show up in the document Cost Estimates and Benchmark results for accurate cost transparency.
The invoke_model API (https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime/client/invoke_model.html) used for generating embeddings does not seem to return token usage.
An approach could be to run a second call to count_tokens (https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime/client/count_tokens.html) to retrieve the token usage. I will look into it.
- …n Nova Multimodal Embeddings
- …therwise bedrock client would always require PIL dependency

582b3cd to 2d630ad (Compare)
Issue #43
Description of changes:

- class `BedrockClient` to generate embeddings for Amazon Titan Multimodal Embeddings G1 and Amazon Nova Multimodal Embeddings

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.