Commit e2f0f7e

Merge branch 'aws-solutions-library-samples:main' into main
2 parents b55fc6c + 71c9013 commit e2f0f7e

143 files changed, +14524 -3810 lines changed

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -4,6 +4,8 @@ build.toml
 model.tar.gz
 .checksum
 .checksums/
+.build_checksum
+.lib_checksum
 .vscode/
 .DS_Store
 dist/
@@ -20,3 +22,4 @@ rvl_cdip_*
 notebooks/examples/data
 .idea/
 .dsr/
+*tmp-dev-assets*

.gitlab-ci.yml

Lines changed: 2 additions & 0 deletions
@@ -28,6 +28,8 @@ developer_tests:
 - apt-get update -y
 - apt-get install make -y
 - pip install ruff
+# Install dependencies needed by publish.py for test imports
+- pip install typer rich boto3
 # Install test dependencies
 - cd lib/idp_common_pkg && pip install -e ".[test]" && cd ../..
 

CHANGELOG.md

Lines changed: 68 additions & 0 deletions
@@ -5,7 +5,75 @@ SPDX-License-Identifier: MIT-0
 
 ## [Unreleased]
 
+## [0.3.14]
+
 ### Added
+- Support for 1m token context for Claude Sonnet 4
+- Video demo of "Chat with Document" in [./docs/web-ui.md](./docs/web-ui.md)
+- **Human-in-the-Loop (HITL) Support Extended to Pattern-2**
+- Added HITL review capabilities for Pattern-2 (Textract + Bedrock processing) using Amazon SageMaker Augmented AI (A2I)
+- Enables human validation and correction when extraction confidence falls below configurable threshold
+- Includes same features as Pattern-1 HITL: automatic triggering, review portal integration, and seamless result updates
+- Documentation and video demo in [./docs/human-review.md](./docs/human-review.md)
+
+### Removed
+- Windows development environment guide and setup script removed as it proved insufficiently robust
+
+### Fixed
+- Fix 1-click Launch URL output from the GovCloud template generation script
+- Add Agent Analytics to architecture diagram
+- Fix various UX and error reporting issues with the new Python publish script
+- Simplify UDOP model path construction and avoid invalid default for regions other than us-east-1 and us-west-2
+- Permission regression from previous release affecting "Chat with Document"
+
+
+## [0.3.13]
+
+### Added
+
+- **External MCP Agent Integration for Custom Tool Extension**
+- Added External MCP (Model Context Protocol) Agent support that enables integration with custom MCP servers to extend IDP capabilities
+- **Cross-Account Integration**: Host MCP servers in separate AWS accounts or external infrastructure with secure OAuth authentication using AWS Cognito
+- **Dynamic Tool Discovery**: Automatically discovers and integrates available tools from MCP servers through the IDP web interface
+- **Secure Authentication Flow**: Uses AWS Cognito User Pools for OAuth bearer token authentication with proper token validation
+- **Configuration Management**: JSON array configuration in AWS Secrets Manager supporting multiple MCP server connections with optional custom agent names and descriptions
+- **Real-time Integration**: Tools become immediately available through the IDP web interface after configuration
+
+- **AWS GovCloud Support with Automated Template Generation**
+- Added GovCloud compatibility through `scripts/generate_govcloud_template.py` script
+- **ARN Partition Compatibility**: All templates updated to use `arn:${AWS::Partition}:` for both commercial and GovCloud regions
+- **Headless Operation**: Automatically removes UI-related resources (CloudFront, AppSync, Cognito, WAF) for GovCloud deployment
+- **Core Functionality Preserved**: All 3 processing patterns and complete 6-step pipeline (OCR, Classification, Extraction, Assessment, Summarization, Evaluation) remain fully functional
+- **Automated Workflow**: Single script orchestrates build + GovCloud template generation + S3 upload with deployment URLs
+- **Enterprise Ready**: Enables headless document processing for government and enterprise environments requiring GovCloud compliance
+- **Documentation**: New `docs/govcloud-deployment.md` with deployment guide, architecture differences, and access methods
+
+- **Pattern-2 and Pattern-3 Assessment now generate geometry (bounding boxes) for visualization in UI 'Visual Editor' (parity with Pattern-1)**
+- Added comprehensive spatial localization capabilities to both regular and granular assessment services
+- **Automatic Processing**: When LLM provides bbox coordinates, automatically converts to UI-compatible (Visual Edit) geometry format without any configuration
+- **Universal Support**: Works with all attribute types - simple attributes, nested group attributes (e.g., CompanyAddress.State), and list attributes
+- **Enhanced Prompts**: Updated assessment task prompts with spatial-localization-guidelines requesting bbox coordinates in normalized 0-1000 scale
+- **Demo Notebooks**: Assessment notebooks now showcase automatic bounding box processing
+
+- **New Python-Based Publishing System**
+- Replaced `publish.sh` bash script with new `publish.py` Python script
+- Rich console interface with progress bars, spinners, and colored output using Rich library
+- Multi-threaded artifact building and uploading for significantly improved performance
+- Native support for Linux, macOS, and Windows environments
+
+- **Windows Development Environment Setup Guide and Helper Script**
+- New `scripts/dev_setup.bat` (570 lines) for complete Windows development environment configuration
+
+- **OCR Service Default Image Sizing for Resource Optimization**
+- Implemented automatic default image size limits (951×1268) when no image sizing configuration is provided
+- **Key Benefits**: Reduction in vision model token consumption, prevents OutOfMemory errors during concurrent processing, improves processing speed and reduces bandwidth usage
+
+### Changed
+
+- **Reverted to python3.12 runtime to resolve build package dependency problems**
+
+### Fixed
+- **Improved Visual Edit bounding box position when using image zoom or pan**
 
 
 

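The 0.3.13 changelog entry above mentions that assessment prompts request bbox coordinates on a normalized 0-1000 scale and that these are converted to a UI geometry format. The diff does not show that conversion, so the following is only a minimal sketch of the idea. It assumes the bbox arrives as `[x1, y1, x2, y2]` and that the target geometry uses fractional `left`/`top`/`width`/`height` fields; both the input order and the output field names are hypothetical, not taken from the repository.

```python
from typing import Dict, Sequence

SCALE = 1000.0  # the changelog states a normalized 0-1000 coordinate scale


def bbox_to_geometry(bbox: Sequence[float]) -> Dict[str, float]:
    """Convert [x1, y1, x2, y2] on a 0-1000 scale to fractional left/top/width/height.

    The input order and the output field names are assumptions for illustration.
    """
    if len(bbox) != 4:
        raise ValueError("expected four coordinates: x1, y1, x2, y2")
    # Clamp each value into the valid range before scaling.
    x1, y1, x2, y2 = (min(max(float(v), 0.0), SCALE) for v in bbox)
    # Order the corners so width and height are never negative.
    left, right = sorted((x1, x2))
    top, bottom = sorted((y1, y2))
    return {
        "left": left / SCALE,
        "top": top / SCALE,
        "width": (right - left) / SCALE,
        "height": (bottom - top) / SCALE,
    }
```

For instance, `bbox_to_geometry([100, 200, 600, 700])` gives a box starting a tenth of the way across and a fifth of the way down the page, covering half the page in each dimension. Clamping and corner ordering are defensive choices for model-generated coordinates, not documented behavior.
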
Makefile

Lines changed: 29 additions & 6 deletions
@@ -14,7 +14,7 @@ test:
 	$(MAKE) -C lib/idp_common_pkg test
 
 # Run both linting and formatting in one command
-lint: ruff-lint format
+lint: ruff-lint format check-arn-partitions
 
 # Run linting checks and fix issues automatically
 ruff-lint:
@@ -29,16 +29,39 @@ format:
 lint-cicd:
 	@echo "Running code quality checks..."
 	@if ! ruff check; then \
-		echo "$(RED)ERROR: Ruff linting failed!$(NC)"; \
-		echo "$(YELLOW)Please run 'make ruff-lint' locally to fix these issues.$(NC)"; \
+		echo -e "$(RED)ERROR: Ruff linting failed!$(NC)"; \
+		echo -e "$(YELLOW)Please run 'make ruff-lint' locally to fix these issues.$(NC)"; \
 		exit 1; \
 	fi
 	@if ! ruff format --check; then \
-		echo "$(RED)ERROR: Code formatting check failed!$(NC)"; \
-		echo "$(YELLOW)Please run 'make format' locally to fix these issues.$(NC)"; \
+		echo -e "$(RED)ERROR: Code formatting check failed!$(NC)"; \
+		echo -e "$(YELLOW)Please run 'make format' locally to fix these issues.$(NC)"; \
+		exit 1; \
+	fi
+	@echo -e "$(GREEN)All code quality checks passed!$(NC)"
+
+# Check CloudFormation templates for hardcoded AWS partition ARNs
+check-arn-partitions:
+	@echo "Checking CloudFormation templates for hardcoded ARN partitions..."
+	@FOUND_ISSUES=0; \
+	for template in template.yaml patterns/*/template.yaml patterns/*/sagemaker_classifier_endpoint.yaml options/*/template.yaml; do \
+		if [ -f "$$template" ]; then \
+			echo "Checking $$template..."; \
+			MATCHES=$$(grep -n "arn:aws:" "$$template" | grep -v "arn:\$${AWS::Partition}:" || true); \
+			if [ -n "$$MATCHES" ]; then \
+				echo -e "$(RED)ERROR: Found hardcoded 'arn:aws:' references in $$template:$(NC)"; \
+				echo "$$MATCHES" | sed 's/^/ /'; \
+				echo -e "$(YELLOW) These should use 'arn:\$${AWS::Partition}:' instead for GovCloud compatibility$(NC)"; \
+				FOUND_ISSUES=1; \
+			fi; \
+		fi; \
+	done; \
+	if [ $$FOUND_ISSUES -eq 0 ]; then \
+		echo -e "$(GREEN)✅ No hardcoded ARN partition references found!$(NC)"; \
+	else \
+		echo -e "$(RED)❌ Found hardcoded ARN partition references that need to be fixed$(NC)"; \
 		exit 1; \
 	fi
-	@echo "$(GREEN)All code quality checks passed!$(NC)"
 
 # A convenience Makefile target that runs
 commit: lint test
README.md
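The new `check-arn-partitions` target above pipes `grep -n "arn:aws:"` into a second `grep -v` that drops lines already using `arn:${AWS::Partition}:`. As a rough illustration of the same filter outside of make, here is a Python sketch. The function names `find_hardcoded_arns` and `check_templates` are hypothetical and do not come from the repository; only the two substrings being matched are taken from the Makefile.

```python
from pathlib import Path
from typing import Iterable, List, Tuple

HARDCODED_ARN = "arn:aws:"
PARTITION_ARN = "arn:${AWS::Partition}:"


def find_hardcoded_arns(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs that contain 'arn:aws:' without the partition form."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Mirrors the grep pipeline: a line that also contains the
        # partition-aware form is skipped entirely, as `grep -v` would do.
        if HARDCODED_ARN in line and PARTITION_ARN not in line:
            hits.append((lineno, line.strip()))
    return hits


def check_templates(paths: Iterable[Path]) -> int:
    """Print offending lines for each existing file; return 1 if any were found, else 0."""
    found = 0
    for path in paths:
        if not path.is_file():
            continue  # same as the `[ -f "$$template" ]` guard in the Makefile
        for lineno, line in find_hardcoded_arns(path.read_text(encoding="utf-8")):
            print(f"{path}:{lineno}: {line}")
            found = 1
    return found
```

A caller could pass the same globs the Makefile loops over, for example `check_templates(Path(".").glob("patterns/*/template.yaml"))`, and use the return value as a process exit code. Like the grep version, this is a per-line substring check, so a line mixing both forms is not reported.
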

Lines changed: 4 additions & 0 deletions
@@ -39,6 +39,8 @@ White-glove customization, deployment, and integration support for production us
 - **Cost Optimization**: Pay-per-use pricing model with built-in controls
 - **Comprehensive Monitoring**: Rich CloudWatch dashboard with detailed metrics and logs
 - **Web User Interface**: Modern UI for inspecting document workflow status and results
+- **Human-in-the-Loop (HITL)**: Amazon A2I integration for human review workflows (Pattern 1 & Pattern 2)
+- **Note**: When deploying multiple patterns with HITL, reuse existing private workteam ARN due to AWS account limits
 - **AI-Powered Evaluation**: Framework to assess accuracy against baseline data
 - **Extraction Confidence Assessment**: LLM-powered assessment of extraction confidence with multimodal document analysis
 - **Document Knowledge Base Query**: Ask questions about your processed documents
@@ -124,9 +126,11 @@ For detailed deployment and testing instructions, see the [Deployment Guide](./d
 - [Deployment](./docs/deployment.md) - Build, publish, deploy, and test instructions
 - [Web UI](./docs/web-ui.md) - Web interface features and usage
 - [Agent Analysis](./docs/agent-analysis.md) - Natural language analytics and data visualization feature
+- [Custom MCP Agent](./docs/custom-MCP-agent.md) - Integrating external MCP servers for custom tools and capabilities
 - [Configuration](./docs/configuration.md) - Configuration and customization options
 - [Classification](./docs/classification.md) - Customizing document classification
 - [Extraction](./docs/extraction.md) - Customizing information extraction
+- [Human-in-the-Loop Review](./docs/human-review.md) - Human review workflows with Amazon A2I
 - [Assessment](./docs/assessment.md) - Extraction confidence evaluation using LLMs
 - [Evaluation Framework](./docs/evaluation.md) - Accuracy assessment system with analytics database and reporting
 - [Knowledge Base](./docs/knowledge-base.md) - Document knowledge base query feature

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-0.3.12
+0.3.14

config_library/pattern-1/lending-package-sample/config.yaml

Lines changed: 10 additions & 0 deletions
@@ -185,6 +185,16 @@ pricing:
         price: '3.0E-7'
       - name: cacheWriteInputTokens
         price: '3.75E-6'
+  - name: bedrock/us.anthropic.claude-sonnet-4-20250514-v1:0:1m
+    units:
+      - name: inputTokens
+        price: '6.0E-6'
+      - name: outputTokens
+        price: '2.25E-5'
+      - name: cacheReadInputTokens
+        price: '6.0E-7'
+      - name: cacheWriteInputTokens
+        price: '7.5E-6'
   - name: bedrock/us.anthropic.claude-opus-4-20250514-v1:0
     units:
       - name: inputTokens

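The pricing entry added above lists one unit price per metered token type for the `:1m` Sonnet 4 model ID. As a sketch of how such entries could be turned into a cost estimate, the example below multiplies token counts by those prices. It assumes each `price` is US dollars per single token, which the diff does not state, and the `estimate_cost` helper is hypothetical rather than part of the repository.

```python
from decimal import Decimal
from typing import Dict, Mapping

# Unit prices copied from the added config entry for the ":1m" Sonnet 4 model ID.
# Assumption: each price is in US dollars per single token.
SONNET_4_1M_PRICES: Dict[str, Decimal] = {
    "inputTokens": Decimal("6.0E-6"),
    "outputTokens": Decimal("2.25E-5"),
    "cacheReadInputTokens": Decimal("6.0E-7"),
    "cacheWriteInputTokens": Decimal("7.5E-6"),
}


def estimate_cost(usage: Mapping[str, int], prices: Mapping[str, Decimal]) -> Decimal:
    """Sum count * unit price over the metered units; an unknown unit raises KeyError."""
    total = Decimal("0")
    for unit, count in usage.items():
        total += Decimal(count) * prices[unit]
    return total
```

Under that per-token assumption, `estimate_cost({"inputTokens": 500000, "outputTokens": 2000}, SONNET_4_1M_PRICES)` comes to 3.045. `Decimal` is used instead of `float` so that very small unit prices add up without binary rounding error.
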