CHANGELOG.md
Lines changed: 77 additions & 0 deletions
@@ -5,6 +5,83 @@ SPDX-License-Identifier: MIT-0
 
 ## [Unreleased]
 
+### Added
+
+
+
+## [0.3.12]
+
+### Added
+
+- **Custom Prompt Generator Lambda Support for Patterns 2 & 3**
+  - Added `custom_prompt_lambda_arn` configuration field to enable injection of custom business logic into extraction processing
+  - **Key Features**: Lambda interface with all template placeholders (DOCUMENT_TEXT, DOCUMENT_CLASS, ATTRIBUTE_NAMES_AND_DESCRIPTIONS, DOCUMENT_IMAGE), URI-based image handling for JSON serialization, comprehensive error handling with fail-fast behavior, scoped IAM permissions requiring GENAIIDP-* function naming
+  - **Use Cases**: Document type-specific processing rules, integration with external systems for customer configurations, conditional processing based on document content, regulatory compliance and industry-specific requirements
+  - **Demo Resources**: Interactive notebook demonstration (`step3_extraction_with_custom_lambda.ipynb`), SAM deployment template for demo Lambda function, comprehensive documentation and examples in `notebooks/examples/demo-lambda/`
+  - **Benefits**: Custom business logic without core code changes, backward compatible (existing deployments unchanged), robust JSON serialization handling all object types, complete observability with detailed logging
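
For readers wondering what such a Lambda might look like, here is a minimal sketch. The changelog does not specify the event or response schema, so the field names `placeholders`, `default_task_prompt`, and `task_prompt` are hypothetical, as is the `bank-statement` rule. Only the placeholder name `DOCUMENT_CLASS` comes from the entry above.

```python
def lambda_handler(event, context):
    """Hypothetical custom prompt generator.

    Assumed (not confirmed by the changelog): the event carries a
    'placeholders' dict and a 'default_task_prompt' string, and the
    caller expects a JSON-serializable dict with a 'task_prompt' key.
    """
    placeholders = event.get("placeholders", {})
    task_prompt = event.get("default_task_prompt", "")

    # Fail fast on malformed input instead of returning a partial prompt.
    if not task_prompt:
        raise ValueError("default_task_prompt is required")

    # Example of a document type-specific rule (illustrative only).
    if placeholders.get("DOCUMENT_CLASS") == "bank-statement":
        task_prompt += "\nReport all monetary amounts with their currency code."

    return {"task_prompt": task_prompt}
```

Per the IAM note above, a deployed function would need a name matching `GENAIIDP-*` for the scoped permissions to apply.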
+
+- **Refactored Document Classification Service for Enhanced Boundary Detection**
+  - Consolidated `multimodalPageLevelClassification` and the experimental `multimodalPageBoundaryClassification` (from v0.3.11) into a single enhanced `multimodalPageLevelClassification` method
+  - Implemented BIO-like sequence segmentation with document boundary indicators: "start" (new document) and "continue" (same document)
+  - Automatically segments multi-document packets, even when they contain multiple documents of the same type
+  - Added comprehensive classification guide with method comparisons and best practices
+  - **Benefits**: Simplified codebase with single multimodal classification method, improved handling of complex document packets, maintains backward compatibility
+  - **No Breaking Changes**: Existing configurations work unchanged, no configuration updates required
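
The "start"/"continue" segmentation can be illustrated with a short sketch. This is not the service's actual code: the `page_id` key and the choice to treat a class change as an implicit boundary are assumptions made for the example. The `class` and `document_boundary` keys mirror the prompt's JSON output format.

```python
def segment_pages(pages):
    """Group per-page classification results into documents.

    Each page is a dict with 'page_id' (hypothetical key), 'class', and
    'document_boundary' ("start" or "continue").
    """
    segments = []
    for page in pages:
        boundary = page["document_boundary"]
        if boundary not in ("start", "continue"):
            raise ValueError(f"unexpected boundary value: {boundary!r}")

        # Open a new segment on an explicit "start", on the very first page,
        # or (assumption) when the class differs from the current segment.
        starts_new = (
            boundary == "start"
            or not segments
            or segments[-1]["class"] != page["class"]
        )
        if starts_new:
            segments.append({"class": page["class"], "page_ids": [page["page_id"]]})
        else:
            segments[-1]["page_ids"].append(page["page_id"])
    return segments
```

With this rule, two consecutive invoices are split correctly because the second one opens with "start", even though both pages share the same class.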
+
+- **Enhanced A2I Template and Workflow Management**
+  - Enhanced A2I template with improved user interface and clearer instructions for reviewers
+  - Added comprehensive instructions for reviewers in A2I template to guide the review process
+  - Implemented capture of failed review tasks with proper error handling and logging
+  - Added workflow orchestration control to stop processing when a reviewer rejects an A2I task
+  - Removed automatic A2I task creation when Pattern-1 Bedrock Data Automation (BDA) fails to classify a document to an appropriate Blueprint
+
+- **Dynamic Cost Calculation for Metering Data**
+  - Added automated unit cost and estimated cost calculation to metering table with new `unit_cost` and `estimated_cost` columns
+  - Pricing is loaded dynamically from the configuration
+  - Enhanced cost analysis capabilities with comprehensive Athena queries for cost tracking, trend analysis, and efficiency metrics
+  - Automatic cost calculation as `estimated_cost = value × unit_cost` for all metering records
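
The cost formula can be sketched as follows. The record keys `service_api` and `unit` and the shape of the `pricing` lookup are hypothetical. Only `value`, `unit_cost`, and `estimated_cost` appear in the entry above. `Decimal` is used here to avoid float rounding on very small per-unit prices, which is a choice of this sketch rather than something the changelog states.

```python
from decimal import Decimal


def add_estimated_cost(records, pricing):
    """Return copies of metering records with unit_cost and estimated_cost.

    `pricing` maps a (service_api, unit) pair to a Decimal unit price.
    Both the key shape and the zero default for unknown units are
    assumptions of this sketch.
    """
    enriched = []
    for record in records:
        key = (record["service_api"], record["unit"])
        unit_cost = pricing.get(key, Decimal("0"))
        # estimated_cost = value x unit_cost, as stated in the changelog.
        estimated_cost = Decimal(str(record["value"])) * unit_cost
        enriched.append(
            {**record, "unit_cost": unit_cost, "estimated_cost": estimated_cost}
        )
    return enriched
```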
+
+- **Configuration-Based Summarization Control**
+  - Summarization can now be enabled/disabled via configuration file `summarization.enabled` property instead of CloudFormation stack parameter
+  - **Key Benefits**: Runtime control without stack redeployment, zero LLM costs when disabled, simplified state machine architecture, backward compatible defaults
+  - **Implementation**: Always calls SummarizationStep but service skips processing when `enabled: false`
+  - **Cost Optimization**: When disabled, no LLM API calls or S3 operations are performed
+  - **Configuration Example**: Set `summarization.enabled: false` to disable, `enabled: true` to enable (default)
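
The "always call the step, skip inside the service" pattern might look roughly like this. The function name `run_summarization`, the `text` and `summary` keys, and the injected `summarize` callable are all hypothetical. Only the `summarization.enabled` property and its default of `true` come from the entry above.

```python
def run_summarization(document, config, summarize):
    """Summarize a document unless the configuration disables it.

    `summarize` stands in for the LLM call; it is injected so the
    sketch stays self-contained.
    """
    # Missing section or missing key falls back to enabled (the default).
    enabled = config.get("summarization", {}).get("enabled", True)
    if not enabled:
        # Return the input untouched: no LLM call, no storage writes.
        return document

    result = dict(document)
    result["summary"] = summarize(document["text"])
    return result
```

The same early-return shape would apply to the assessment step with `assessment.enabled`.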
+
+- **Configuration-Based Assessment Control**
+  - Assessment can now be enabled/disabled via configuration file `assessment.enabled` property instead of CloudFormation stack parameter
+  - **Key Benefits**: Runtime control without stack redeployment, zero LLM costs when disabled, simplified state machine architecture, backward compatible defaults
+  - **Implementation**: Always calls AssessmentStep but service skips processing when `enabled: false`
+  - **Cost Optimization**: When disabled, no LLM API calls or S3 operations are performed
+  - **Configuration Example**: Set `assessment.enabled: false` to disable, `enabled: true` to enable (default)
+
+- **New guides for setting up development environments**
+  - EC2-based Linux development environment
+  - macOS development environment
+
+### Removed
+- **CloudFormation Parameters**: Removed `IsSummarizationEnabled` and `IsAssessmentEnabled` parameters from all pattern templates
+- **Related Conditions**: Removed parameter conditions and state machine definition substitutions for both features
+- **Conditional Logic**: Eliminated complex conditional logic from state machine definitions for summarization and assessment steps
+
+### ⚠️ Breaking Changes
+- **Configuration Migration Required**: When updating a stack that previously had `IsSummarizationEnabled` or `IsAssessmentEnabled` set to `false`, these features will now default to `enabled: true` after the update. To maintain the disabled behavior:
+  1. Update your configuration file to set `summarization.enabled: false` and/or `assessment.enabled: false` as needed
+  2. Save the configuration changes immediately after the stack update
+  3. This ensures continued cost optimization by preventing unexpected LLM API calls
+- **Action Required**: Review your current CloudFormation parameter settings before updating and update your configuration accordingly to preserve existing behavior
+
+### Changed
+- **Updated Python Lambda Runtime to 3.13**
+
+### Fixed
+- **Fixed B615 "Unsafe Hugging Face Hub download without revision pinning" security finding in Pattern-3 fine-tuning module** - Added revision pinning to prevent supply chain attacks and ensure reproducible deployments
+- **Fixed CloudWatch Log Group Missing Retention regression**
+- **Security: Cross-Site Scripting (XSS) Vulnerability in FileViewer Component** - Fixed high-risk XSS vulnerability in `src/ui/src/components/document-viewer/FileViewer.jsx` where `innerHTML` was used with user-controlled data
+- **Added permissions boundary support to new Lambda function roles introduced in previous releases**
+- **Fixed OutOfMemory Errors in Pattern-2 OCR Lambda for Large High-Resolution Documents**
+  - **Root Cause**: Processing large PDFs with high-resolution images (7469×9623 pixels) caused memory spikes when 20 concurrent workers each held ~101MB images simultaneously, exceeding the 4GB Lambda memory limit
+  - **Optimal Solution**: Refactored image extraction to render directly at target dimensions using PyMuPDF matrix transformations, completely eliminating oversized image creation
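
To make the "render directly at target dimensions" idea concrete, here is a sketch of the scale computation only. The helper name `fit_scale` and the aspect-preserving fit are assumptions, not the actual fix. The PyMuPDF calls appear only in comments, since the library is not imported here.

```python
def fit_scale(page_width, page_height, target_width, target_height):
    """Uniform scale that fits a page inside the target box.

    Using one factor for both axes preserves the aspect ratio; taking
    the smaller ratio guarantees neither dimension exceeds its target.
    """
    if min(page_width, page_height, target_width, target_height) <= 0:
        raise ValueError("all dimensions must be positive")
    return min(target_width / page_width, target_height / page_height)


# With PyMuPDF, the factor would feed a matrix so the pixmap is created at
# the final size instead of being rendered large and resized afterwards:
#
#   scale = fit_scale(page.rect.width, page.rect.height, 1000, 1000)
#   pix = page.get_pixmap(matrix=fitz.Matrix(scale, scale))
```

Because the oversized intermediate image is never allocated, peak memory per worker depends on the target size rather than the source resolution.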
config_library/pattern-2/bank-statement-sample/config.yaml
Lines changed: 2 additions & 0 deletions
@@ -307,6 +307,7 @@ extraction:
   system_prompt: >-
     You are a document assistant. Respond only with JSON. Never make up data, only provide data found in the document being provided.
 summarization:
+  enabled: true
   top_p: '0.1'
   max_tokens: '4096'
   top_k: '5'
@@ -368,6 +369,7 @@ summarization:
   system_prompt: >-
     You are a document summarization expert who can analyze and summarize documents from various domains including medical, financial, legal, and general business documents. Your task is to create a summary that captures the key information, main points, and important details from the document. Your output must be in valid JSON format. \nSummarization Style: Balanced\\nCreate a balanced summary that provides a moderate level of detail. Include the main points and key supporting information, while maintaining the document's overall structure. Aim for a comprehensive yet concise summary.\n Your output MUST be in valid JSON format with markdown content. You MUST strictly adhere to the output format specified in the instructions.
     You are a multimodal document classification expert that analyzes business documents using both visual layout and textual content. Your task is to classify single-page documents into predefined categories based on their structural patterns, visual features, and text content. Your output must be valid JSON according to the requested format.
+
+    <variables>
+    <document-ocr-data>: OCR-extracted text content from the document page that provides textual information for classification
+    <document-image>: Visual representation of the document page that provides layout, formatting, and visual structure information
+    <document-types>: List of valid document types with their descriptions that the document must be classified into
+    </variables>
   task_prompt: >-
     <task-description>
-    Analyze the provided document using both its visual layout and textual content to determine its document type. You must classify it into exactly one of the predefined categories.
+    Analyze the provided document using both its visual layout and textual content to determine its document type and whether this page begins a new document or continues the previous one.
     </task-description>
 
     <document-types>
@@ -934,24 +951,16 @@ classification:
     1. Examine the visual layout: headers, logos, formatting, structure, and visual organization
     2. Analyze the textual content: key phrases, terminology, purpose, and information type
     3. Identify distinctive features that match the document type descriptions
-    4. Consider both visual and textual evidence together to determine the best match
-    5. CRITICAL: Only use document types explicitly listed in the <document-types> section
+    4. Decide if this page starts a new document (output "start") or continues the previous document (output "continue")
+    5. Consider both visual and textual evidence together to determine the best match
+    6. CRITICAL: Only use document types explicitly listed in the <document-types> section
     </classification-instructions>
 
-    <reasoning-guidelines>
-    When determining the document type:
-    - First identify the document's primary purpose and function
-    - Note specific visual elements (letterhead, forms, tables, signatures)
-    - Consider the document's intended audience and use case
-    - Provide specific evidence from both visual and textual analysis
-    </reasoning-guidelines>
-
     <output-format>
-    Return your classification as valid JSON following this exact structure:
     {
     "classification_reason": "Detailed reasoning including specific visual and textual evidence that led to this classification",
-    "class": "exact_document_type_from_list"
+    "class": "exact_document_type_from_list",
+    "document_boundary": "start or continue"
     }
     </output-format>
 
@@ -968,22 +977,10 @@ classification:
     <final-instructions>
     Analyze the document above by:
     1. Applying the <classification-instructions> to examine both visual and textual features
-    2. Following the <reasoning-guidelines> to build your classification rationale
-    3. Selecting ONLY from document types in <document-types>
-    4. Providing clear reasoning with specific evidence before the classification
-    5. Outputting in the exact JSON format specified in <output-format>
+    2. Selecting ONLY from document types in <document-types>
+    3. Providing clear reasoning with specific evidence
+    4. Outputting in the exact JSON format specified in <output-format>
     </final-instructions>
-  temperature: '0.0'
-  model: us.amazon.nova-pro-v1:0
-  system_prompt: >-
-    You are a multimodal document classification expert that analyzes business documents using both visual layout and textual content. Your task is to classify single-page documents into predefined categories based on their structural patterns, visual features, and text content. Your output must be valid JSON according to the requested format.
-
-    <variables>
-    DOCUMENT_TEXT: OCR-extracted text content from the document page that provides textual information for classification
-    DOCUMENT_IMAGE: Visual representation of the document page that provides layout, formatting, and visual structure information
-    CLASS_NAMES_AND_DESCRIPTIONS: List of valid document types with their descriptions that the document must be classified into
     You are a document assistant. Respond only with JSON. Never make up data, only provide data found in the document being provided.
 summarization:
+  enabled: true
   top_p: '0.1'
   max_tokens: '4096'
   top_k: '5'
@@ -1143,6 +1141,7 @@ summarization:
   system_prompt: >-
     You are a document summarization expert who can analyze and summarize documents from various domains including medical, financial, legal, and general business documents. Your task is to create a summary that captures the key information, main points, and important details from the document. Your output must be in valid JSON format. \nSummarization Style: Balanced\\nCreate a balanced summary that provides a moderate level of detail. Include the main points and key supporting information, while maintaining the document's overall structure. Aim for a comprehensive yet concise summary.\n Your output MUST be in valid JSON format with markdown content. You MUST strictly adhere to the output format specified in the instructions.