Commit 6a71279

Merge branch 'fix/pattern-config' into 'develop'
Fix/pattern config

See merge request genaiic-reusable-assets/engagement-artifacts/genaiic-idp-accelerator!231
2 parents 474b5a6 + e8aa292 commit 6a71279

31 files changed, 152 additions and 1514 deletions

CHANGELOG.md

Lines changed: 7 additions & 2 deletions
@@ -32,7 +32,12 @@ SPDX-License-Identifier: MIT-0
 - Backward compatibility maintained - old parameter pattern still supported with deprecation warning
 - Updated all lambda functions and notebooks to use new simplified pattern
 - Removed fixed image target_height and target_width from default configurations, so images are processed in original resolution by default.
-
+- **Updated Default Configuration for Pattern1 and Pattern2**
+- Changed default configuration for new stacks from "default" to "lending-package-sample" for both Pattern1 and Pattern2
+- Maintains backward compatibility for stack updates by keeping the parameter value "default" mapped to the rvl-cdip-sample for pattern-2.
+- **Reduce assessment step costs**
+- Default model for granular assessment is now `us.amazon.nova-pro-v1:0`
+- Improved placement of <<CACHEPOINT>> tags in assessment prompt to improve utilization of prompt caching
 
 ### Fixed
 - **Fixed Image Resizing Behavior for High-Resolution Documents**
@@ -41,7 +46,7 @@ SPDX-License-Identifier: MIT-0
 - Fixed issue where PNG files were being unnecessarily converted to JPEG format and resized to lower resolution with lost quality
 - Fixed issue where PNG and JPG image files were not rendering inline in the Document Details page
 - Fixed issue where PDF files were being downloaded instead of displayed inline
-
+- Fixed pricing data for cacheWrite tokens for Amazon Nova models to resolve inaccurate cost estimation in UI.
 
 
 ## [0.3.7]
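
The backward-compatibility note in the changelog hunk above can be read as a small lookup rule. The following is only a sketch of that rule: the function and constant names are hypothetical, and the actual mapping presumably lives in the deployment template rather than in Python. The changelog states the legacy mapping only for pattern-2, so other patterns are passed through unchanged here.

```python
from typing import Optional

# Default for newly created stacks, per the changelog entry.
NEW_STACK_DEFAULT = "lending-package-sample"

# Legacy parameter values kept working for stack updates (hypothetical table).
# Only the pattern-2 mapping is stated in the changelog.
LEGACY_ALIASES = {
    ("pattern-2", "default"): "rvl-cdip-sample",
}


def resolve_config_name(pattern: str, requested: Optional[str]) -> str:
    """Return the configuration name to load for a pattern.

    An empty or missing value falls back to the new default; a legacy
    alias is translated; anything else is returned as given.
    """
    if not requested:
        return NEW_STACK_DEFAULT
    return LEGACY_ALIASES.get((pattern, requested), requested)
```

Under these assumptions, an existing pattern-2 stack that still passes `default` keeps resolving to `rvl-cdip-sample`, while a new stack with no explicit value gets `lending-package-sample`.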

README.md

Lines changed: 4 additions & 4 deletions
@@ -73,8 +73,8 @@ After deployment, you can quickly process a document and view results:
 - **Via S3**: Upload directly to the S3 input bucket (find the bucket URL in CloudFormation stack Outputs)
 
 2. **Use Sample Documents**:
-- For Pattern 1 (BDA): Use [samples/lending_package.pdf](./samples/lending_package.pdf)
-- For Patterns 2 and 3: Use [samples/rvl_cdip_package.pdf](./samples/rvl_cdip_package.pdf)
+- For Patterns 1 (BDA) and Pattern 2: Use [samples/lending_package.pdf](./samples/lending_package.pdf)
+- For Pattern 3 (UDOP): Use [samples/rvl_cdip_package.pdf](./samples/rvl_cdip_package.pdf)
 
 3. **Monitor Processing**:
 - **Via Web UI**: Track document status on the dashboard
@@ -105,8 +105,8 @@ To update an existing GenAIIDP stack to a new version:
 7. For detailed instructions, see the [Deployment Guide](./docs/deployment.md#updating-an-existing-stack)
 
 For testing, use these sample files:
-- Pattern-1 BDA default project: `samples/lending_package.pdf`
-- Patterns 2 and 3 default configurations: `samples/rvl_cdip_package.pdf`
+- For Patterns 1 (BDA) and Pattern 2: Use [samples/lending_package.pdf](./samples/lending_package.pdf)
+- For Pattern 3 (UDOP): Use [samples/rvl_cdip_package.pdf](./samples/rvl_cdip_package.pdf)
 
 For detailed deployment and testing instructions, see the [Deployment Guide](./docs/deployment.md).
 

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-0.3.8-wip6
+0.3.8-alpha

config_library/pattern-1/README.md

Lines changed: 1 addition & 1 deletion
@@ -27,4 +27,4 @@ See the main [README.md](../README.md) for more detailed instructions on creatin
 
 ## Available Configurations
 
-Currently, only the default configuration is available for Pattern 1. Contributions are welcome!
+Currently, only the default lending-package-sample configuration is available for Pattern 1. Contributions are welcome!

config_library/pattern-1/default/README.md renamed to config_library/pattern-1/lending-package-sample/README.md

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
 SPDX-License-Identifier: MIT-0
 
-# Default Configuration
+# Default Configuration (lending-package-sample)
 
 This directory contains the default configuration for the GenAI IDP Accelerator. This configuration serves as the baseline for all document processing tasks and can be used as a starting point for creating custom configurations.
 

config_library/pattern-1/default/config.yaml renamed to config_library/pattern-1/lending-package-sample/config.yaml

Lines changed: 2 additions & 2 deletions
@@ -121,7 +121,7 @@ pricing:
 - name: cacheReadInputTokens
 price: '1.5E-8'
 - name: cacheWriteInputTokens
-price: '0'
+price: '6.0E-8'
 - name: bedrock/us.amazon.nova-pro-v1:0
 units:
 - name: inputTokens
@@ -131,7 +131,7 @@ pricing:
 - name: cacheReadInputTokens
 price: '2.0E-7'
 - name: cacheWriteInputTokens
-price: '0'
+price: '8.0E-7'
 - name: bedrock/us.amazon.nova-premier-v1:0
 units:
 - name: inputTokens
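
To see why a `cacheWriteInputTokens` price of `'0'` skews the estimate, here is a minimal sketch of the cost arithmetic. It assumes the `price` strings are per-token prices and that cost is simply tokens multiplied by unit price; the function name and the token counts are made up for illustration. The two prices are the nova-pro values from the hunk above.

```python
from decimal import Decimal

# Unit prices copied from the nova-pro entries in the hunk above
# (assumed to be per-token prices, stored as strings in the YAML).
NOVA_PRO_PRICES = {
    "cacheReadInputTokens": Decimal("2.0E-7"),
    "cacheWriteInputTokens": Decimal("8.0E-7"),
}


def estimate_cost(usage: dict, prices: dict) -> Decimal:
    """Sum token count times unit price over every unit that has a price."""
    return sum(
        (Decimal(count) * prices[unit] for unit, count in usage.items() if unit in prices),
        Decimal("0"),
    )


# Illustrative usage numbers, not measured data.
usage = {"cacheReadInputTokens": 50_000, "cacheWriteInputTokens": 10_000}

fixed = estimate_cost(usage, NOVA_PRO_PRICES)
# Same calculation with the old zero price for cache writes.
old = estimate_cost(usage, {**NOVA_PRO_PRICES, "cacheWriteInputTokens": Decimal("0")})
print(fixed, old)
```

With these illustrative counts, the old configuration would omit the entire cache-write share of the total, which is the kind of underestimate the changelog entry describes.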

config_library/pattern-1/default/samples/lending_package.pdf renamed to config_library/pattern-1/lending-package-sample/samples/lending_package.pdf

File renamed without changes.

config_library/pattern-2/bank-statement-sample/config.yaml

Lines changed: 3 additions & 3 deletions
@@ -381,7 +381,7 @@ assessment:
 max_tokens: '10000'
 top_k: '5'
 temperature: '0.0'
-model: us.anthropic.claude-3-7-sonnet-20250219-v1:0
+model: us.amazon.nova-pro-v1:0
 system_prompt: >-
 You are a document analysis assessment expert. Your task is to evaluate the confidence of extraction results by analyzing the source document evidence. Respond only with JSON containing confidence scores for each extracted attribute.
 task_prompt: >-
@@ -579,7 +579,7 @@ pricing:
 - name: cacheReadInputTokens
 price: '1.5E-8'
 - name: cacheWriteInputTokens
-price: '0'
+price: '6.0E-8'
 - name: bedrock/us.amazon.nova-pro-v1:0
 units:
 - name: inputTokens
@@ -589,7 +589,7 @@ pricing:
 - name: cacheReadInputTokens
 price: '2.0E-7'
 - name: cacheWriteInputTokens
-price: '0'
+price: '8.0E-7'
 - name: bedrock/us.amazon.nova-premier-v1:0
 units:
 - name: inputTokens

config_library/pattern-2/lending-package-sample/config.yaml

Lines changed: 16 additions & 17 deletions
@@ -1,7 +1,7 @@
 # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
 # SPDX-License-Identifier: MIT-0
 
-notes: Default settings
+notes: Default settings for lending-package-sample configuration
 ocr:
 backend: "textract" # Default to Textract for backward compatibility
 model_id: "us.anthropic.claude-3-7-sonnet-20250219-v1:0"
@@ -12,8 +12,8 @@ ocr:
 - name: TABLES
 - name: SIGNATURES
 image:
-target_width: '951'
-target_height: '1268'
+target_width: ''
+target_height: ''
 classes:
 - name: Payslip
 description: >-
@@ -983,11 +983,11 @@ classification:
 DOCUMENT_IMAGE: Visual representation of the document page that provides layout, formatting, and visual structure information
 CLASS_NAMES_AND_DESCRIPTIONS: List of valid document types with their descriptions that the document must be classified into
 </variables>
-classificationMethod: textbasedHolisticClassification
+classificationMethod: multimodalPageLevelClassification
 extraction:
 image:
-target_width: '951'
-target_height: '1268'
+target_width: ''
+target_height: ''
 top_p: '0.1'
 max_tokens: '10000'
 top_k: '5'
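
The two `image:` hunks above replace fixed dimensions with empty strings, which the changelog describes as processing images at their original resolution. A minimal sketch of how a consumer might interpret these values, assuming empty or missing entries mean "do not resize" (the function name is hypothetical and is not taken from the accelerator's code):

```python
from typing import Optional, Tuple


def parse_target_size(image_cfg: dict) -> Optional[Tuple[int, int]]:
    """Return (width, height) as integers, or None to keep the original size.

    The config stores dimensions as strings, so '' and a missing key are
    both treated as "no resizing requested" (an assumption of this sketch).
    """
    width = str(image_cfg.get("target_width") or "").strip()
    height = str(image_cfg.get("target_height") or "").strip()
    if not width or not height:
        return None
    return int(width), int(height)


# New default from this commit: no fixed size.
print(parse_target_size({"target_width": "", "target_height": ""}))
# Previous default.
print(parse_target_size({"target_width": "951", "target_height": "1268"}))
```
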
@@ -1139,7 +1139,7 @@ summarization:
 Do not include any text, explanations, or notes outside of this JSON
 structure. The JSON must be properly formatted and parseable.
 temperature: '0.0'
-model: us.anthropic.claude-3-7-sonnet-20250219-v1:0
+model: us.amazon.nova-premier-v1:0
 system_prompt: >-
 You are a document summarization expert who can analyze and summarize documents from various domains including medical, financial, legal, and general business documents. Your task is to create a summary that captures the key information, main points, and important details from the document. Your output must be in valid JSON format. \nSummarization Style: Balanced\\nCreate a balanced summary that provides a moderate level of detail. Include the main points and key supporting information, while maintaining the document's overall structure. Aim for a comprehensive yet concise summary.\n Your output MUST be in valid JSON format with markdown content. You MUST strictly adhere to the output format specified in the instructions.
 assessment:
@@ -1156,7 +1156,7 @@ assessment:
 max_tokens: '10000'
 top_k: '5'
 temperature: '0.0'
-model: us.anthropic.claude-3-7-sonnet-20250219-v1:0
+model: us.amazon.nova-pro-v1:0
 system_prompt: >-
 You are a document analysis assessment expert. Your task is to evaluate the confidence of extraction results by analyzing the source document evidence. Respond only with JSON containing confidence scores for each extracted attribute.
 task_prompt: >-
@@ -1253,12 +1253,6 @@ assessment:
 
 </final-instructions>
 
-<attributes-definitions>
-
-{ATTRIBUTE_NAMES_AND_DESCRIPTIONS}
-
-</attributes-definitions>
-
 <<CACHEPOINT>>
 
 <document-image>
@@ -1267,7 +1261,6 @@ assessment:
 
 </document-image>
 
-
 <ocr-text-confidence-results>
 
 {OCR_TEXT_CONFIDENCE}
@@ -1276,6 +1269,12 @@ assessment:
 
 <<CACHEPOINT>>
 
+<attributes-definitions>
+
+{ATTRIBUTE_NAMES_AND_DESCRIPTIONS}
+
+</attributes-definitions>
+
 <extraction-results>
 
 {EXTRACTION_RESULTS}
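
The three hunks above move `<attributes-definitions>` from before the first `<<CACHEPOINT>>` to after the second one. A plausible reading, not confirmed by the commit itself, is that attribute definitions vary between granular assessment calls, so keeping them out of the cached prefix lets that prefix be reused. The sketch below assumes the marker simply divides the prompt into segments and that only text before a marker is eligible for caching. The template is heavily abbreviated and the helper name is hypothetical.

```python
CACHEPOINT = "<<CACHEPOINT>>"

# Abbreviated stand-in for the reordered assessment task prompt;
# only placeholders visible in the hunks above are used.
TEMPLATE = """\
<final-instructions>(static instructions)</final-instructions>
<<CACHEPOINT>>
<ocr-text-confidence-results>{OCR_TEXT_CONFIDENCE}</ocr-text-confidence-results>
<<CACHEPOINT>>
<attributes-definitions>{ATTRIBUTE_NAMES_AND_DESCRIPTIONS}</attributes-definitions>
<extraction-results>{EXTRACTION_RESULTS}</extraction-results>
"""


def split_on_cachepoints(template: str) -> list:
    """Split a prompt template into segments at each cache point marker."""
    return [segment.strip() for segment in template.split(CACHEPOINT)]


segments = split_on_cachepoints(TEMPLATE)
cacheable_prefix = segments[:-1]  # text that precedes some cache point
uncached_tail = segments[-1]      # text after the last cache point

# After the reorder, the per-call attribute definitions sit in the tail.
print("{ATTRIBUTE_NAMES_AND_DESCRIPTIONS}" in uncached_tail)
```
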
@@ -1353,7 +1352,7 @@ pricing:
 - name: cacheReadInputTokens
 price: '1.5E-8'
 - name: cacheWriteInputTokens
-price: '0'
+price: '6.0E-8'
 - name: bedrock/us.amazon.nova-pro-v1:0
 units:
 - name: inputTokens
@@ -1363,7 +1362,7 @@ pricing:
 - name: cacheReadInputTokens
 price: '2.0E-7'
 - name: cacheWriteInputTokens
-price: '0'
+price: '8.0E-7'
 - name: bedrock/us.amazon.nova-premier-v1:0
 units:
 - name: inputTokens

config_library/pattern-2/few_shot_example_with_multimodal_page_classification/README.md renamed to config_library/pattern-2/rvl-cdip-package-sample-with-few-shot-examples/README.md

File renamed without changes.
