| 1 | +Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. |
| 2 | +SPDX-License-Identifier: MIT-0 |
| 3 | + |
| 4 | +# Default-Lending Configuration |
| 5 | + |
| 6 | +This directory contains the default-lending configuration for the GenAI IDP Accelerator. This configuration is specifically designed for processing lending and financial document packages commonly used in loan applications, underwriting, and financial verification processes. |
| 7 | + |
| 8 | +## Pattern Association |
| 9 | + |
| 10 | +**Pattern**: Pattern-2 - Uses Textract or Amazon Bedrock models for both page classification/grouping and information extraction |
| 11 | + |
| 12 | +## Validation Level |
| 13 | + |
| 14 | +**Level**: 2 - Comprehensive Testing |
| 15 | + |
| 16 | +- **Testing Evidence**: This configuration has been tested with sample lending documents, including payslips, driver's licenses, bank statements, checks, W2 forms, and insurance applications. In that testing it classified these standard lending documents and extracted detailed financial information from them reliably. |
| 17 | +- **Known Limitations**: Performance may vary with non-standard document formats, heavily redacted financial documents, or documents with poor image quality that affect OCR accuracy. |
| 18 | + |
| 19 | +## Overview |
| 20 | + |
| 21 | +The default-lending configuration is designed to handle comprehensive lending document packages typically encountered in: |
| 22 | + |
| 23 | +- **Loan Applications**: Personal and commercial lending |
| 24 | +- **Mortgage Processing**: Home loan documentation |
| 25 | +- **Credit Assessment**: Income and asset verification |
| 26 | +- **Underwriting**: Risk assessment documentation |
| 27 | +- **Compliance Verification**: Financial record validation |
| 28 | + |
| 29 | +It includes specialized settings for document classification, detailed financial information extraction, and document summarization, using Amazon Bedrock models with parameters chosen for financial document processing. |
| 30 | + |
| 31 | +## Key Components |
| 32 | + |
| 33 | +### Document Classes |
| 34 | + |
| 35 | +The configuration defines 6 specialized lending document classes, each with comprehensive attributes for detailed financial data extraction: |
| 36 | + |
| 37 | +- **Payslip**: Employee wage statements with detailed earnings, deductions, taxes, and year-to-date totals (21 simple attributes, 3 group attributes, 3 list attributes) |
| 38 | +- **US-drivers-licenses**: Government-issued identification documents with personal information and driving privileges (7 simple attributes, 3 group attributes, 2 list attributes) |
| 39 | +- **Bank-checks**: Written financial instruments with payment details and account information (11 simple attributes) |
| 40 | +- **Bank-Statement**: Periodic financial reports with account activity and transaction details (8 simple attributes, 2 list attributes) |
| 41 | +- **W2**: Annual tax documents with comprehensive wage and tax withholding information (2 simple attributes, 5 group attributes, 2 list attributes) |
| 42 | +- **Homeowners-Insurance-Application**: Insurance coverage applications with detailed applicant and property information (10 simple attributes, 3 group attributes) |
| 43 | + |
| 44 | +### Classification Settings |
| 45 | + |
| 46 | +- **Model**: Amazon Nova Pro |
| 47 | +- **Method**: Text-based holistic classification |
| 48 | +- **Temperature**: 0 (deterministic outputs) |
| 49 | +- **Top-k**: 5 |
| 50 | +- **OCR Backend**: Amazon Textract with LAYOUT, TABLES, and SIGNATURES features |
| 51 | +- **OCR Model**: Anthropic Claude 3.7 Sonnet for enhanced text extraction and layout understanding |
| 52 | + |
| 53 | +The classification component analyzes document content and structure to accurately identify lending document types and establish proper page boundaries within multi-document packages. |
| 54 | + |
| 55 | +### Extraction Settings |
| 56 | + |
| 57 | +- **Model**: Amazon Nova Pro |
| 58 | +- **Temperature**: 0 (deterministic outputs) |
| 59 | +- **Top-k**: 5 |
| 60 | +- **Max Tokens**: 10,000 (increased for detailed financial data) |
| 61 | + |
| 62 | +The extraction component performs comprehensive attribute extraction tailored to each lending document type, capturing critical financial information including: |
| 63 | +- Detailed income and deduction breakdowns |
| 64 | +- Personal identification information |
| 65 | +- Account numbers and financial institution details |
| 66 | +- Tax withholding and year-to-date totals |
| 67 | +- Insurance coverage details and applicant information |
| 68 | + |
| 69 | +### Assessment Settings |
| 70 | + |
| 71 | +- **Model**: Anthropic Claude 3.7 Sonnet |
| 72 | +- **Granular Assessment**: Enabled with parallel processing |
| 73 | +- **Default Confidence Threshold**: 0.9 |
| 74 | +- **Max Workers**: 20 for improved performance |
| 75 | + |
| 76 | +Enhanced confidence assessment ensures high accuracy for financial data extraction, critical for lending decisions. |
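
As a rough illustration of how a confidence threshold like the 0.9 default above might be applied downstream, the sketch below flags attributes whose score falls under the threshold so they can be routed to manual review. The function name, the flat `{attribute: score}` input shape, and the sample attribute names are assumptions made for this example; they are not the accelerator's actual output format or API.

```python
from typing import Dict, List


def fields_below_threshold(confidences: Dict[str, float], threshold: float = 0.9) -> List[str]:
    """Return the names of attributes scoring below the threshold, sorted by name.

    A score exactly equal to the threshold is treated as passing here;
    that tie-breaking rule is an assumption of this sketch.
    """
    return sorted(name for name, score in confidences.items() if score < threshold)


# Hypothetical per-attribute confidence scores for a single payslip.
sample = {"gross_pay": 0.97, "net_pay": 0.88, "employer_name": 0.93}

print(fields_below_threshold(sample))  # ['net_pay']
```

Raising the threshold in this sketch simply widens the set of attributes sent for review, which is the trade-off to weigh when tuning it.
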
| 77 | + |
| 78 | +### Summarization Settings |
| 79 | + |
| 80 | +- **Model**: Anthropic Claude 3.7 Sonnet |
| 81 | +- **Temperature**: 0 (deterministic outputs) |
| 82 | +- **Top-k**: 5 |
| 83 | + |
| 84 | +The summarization component creates structured summaries of lending documents with proper citations, essential for loan documentation and compliance. |
| 85 | + |
| 86 | +## Sample Documents |
| 87 | + |
| 88 | +This configuration is optimized for processing lending document packages that typically include: |
| 89 | + |
| 90 | +- **Income Verification**: Payslips, W2 forms, tax returns |
| 91 | +- **Identity Verification**: Driver's licenses, state IDs |
| 92 | +- **Asset Verification**: Bank statements, investment accounts |
| 93 | +- **Payment History**: Bank checks, payment records |
| 94 | +- **Insurance Documentation**: Homeowner's insurance applications and policies |
| 95 | + |
| 96 | +## How to Use |
| 97 | + |
| 98 | +To use this default-lending configuration: |
| 99 | + |
| 100 | +1. **Direct Deployment**: Deploy the GenAI IDP Accelerator with this configuration for lending document processing workflows: |
| 101 | + ```bash |
| 102 | + # Deploy with lending configuration |
| 103 | + ./deploy.sh --config config_library/pattern-2/default-lending/config.yaml |
| 104 | + ``` |
| 105 | + |
| 106 | +2. **Loan Processing Integration**: Integrate with existing loan origination systems for automated document processing and data extraction. |
| 107 | + |
| 108 | +3. **Compliance Workflows**: Use for regulatory compliance documentation and audit trail generation. |
| 109 | + |
| 110 | +4. **Custom Lending Workflows**: Adapt for specific lending scenarios: |
| 111 | + ```bash |
| 112 | + cp -r config_library/pattern-2/default-lending config_library/pattern-2/mortgage-processing |
| 113 | + ``` |
| 114 | + |
| 115 | +## Common Customization Scenarios |
| 116 | + |
| 117 | +### Adding New Financial Document Classes |
| 118 | + |
| 119 | +To add additional lending document types (e.g., tax returns, employment verification letters): |
| 120 | + |
| 121 | +1. Add a new entry to the `classes` array: |
| 122 | + ```yaml |
| 123 | + - name: tax_return |
| 124 | + description: Individual or business tax return documents containing income and deduction information |
| 125 | + attributes: |
| 126 | + - name: tax_year |
| 127 | + description: The tax year for which the return was filed. Look for 'Tax Year' or year designation at the top of the form. |
| 128 | + - name: filing_status |
| 129 | + description: The taxpayer's filing status such as Single, Married Filing Jointly, etc. |
| 130 | + ``` |
| 131 | + |
| 132 | +2. Test with representative tax return documents. |
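
Before testing a new class against real documents, it can help to sanity-check the entry itself. The sketch below is one possible pre-flight check, assuming the class entry has already been loaded into a Python dict and that `name`, `description`, and `attributes` (the keys shown in the example above) are the required ones. The `validate_class` helper is hypothetical and not part of the accelerator.

```python
from typing import Any, Dict, List


def validate_class(entry: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems found in one class entry."""
    problems: List[str] = []

    def is_blank(value: Any) -> bool:
        # Anything that is not a non-empty string counts as missing.
        return not isinstance(value, str) or not value.strip()

    for key in ("name", "description"):
        if is_blank(entry.get(key)):
            problems.append(f"class is missing '{key}'")

    attributes = entry.get("attributes") or []
    if not attributes:
        problems.append("class has no attributes")

    for index, attribute in enumerate(attributes):
        for key in ("name", "description"):
            if is_blank(attribute.get(key)):
                problems.append(f"attribute {index} is missing '{key}'")

    return problems


# The tax_return entry from the example above, expressed as a dict.
tax_return = {
    "name": "tax_return",
    "description": "Individual or business tax return documents containing income and deduction information",
    "attributes": [
        {"name": "tax_year", "description": "The tax year for which the return was filed."},
        {"name": "filing_status", "description": "The taxpayer's filing status."},
    ],
}

print(validate_class(tax_return))  # []
```

An empty list means the entry has the basic shape this sketch expects; it says nothing about extraction quality, which still needs testing on representative documents.
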
| 133 | + |
| 134 | +### Customizing Extraction Prompts for Compliance |
| 135 | + |
| 136 | +For enhanced compliance and audit requirements: |
| 137 | + |
| 138 | +1. Modify the extraction `task_prompt` to include compliance-specific instructions: |
| 139 | + ```yaml |
| 140 | + task_prompt: | |
| 141 | + Extract financial information with particular attention to: |
| 142 | + - Verification of income sources and amounts |
| 143 | + - Identification of any discrepancies or missing information |
| 144 | + - Compliance with lending regulatory requirements |
| 145 | + ``` |
| 146 | + |
| 147 | +### Adjusting Confidence Thresholds for Financial Data |
| 148 | + |
| 149 | +For critical lending decisions, you may want higher confidence thresholds: |
| 150 | + |
| 151 | +1. Update the `default_confidence_threshold` in the assessment section: |
| 152 | + ```yaml |
| 153 | + assessment: |
| 154 | + default_confidence_threshold: '0.95' # Higher threshold for financial data |
| 155 | + ``` |
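
Note that the threshold in the snippet above is written as a quoted string (`'0.95'`). Any tooling that reads the value would need to convert it to a number and, ideally, reject values outside the 0 to 1 range. A minimal sketch of such a check follows; `parse_threshold` is a hypothetical helper for illustration, not a function provided by the accelerator.

```python
def parse_threshold(raw: object) -> float:
    """Convert a threshold given as a string or number to a float in [0, 1].

    Raises ValueError if the value is not numeric or falls outside that range.
    """
    value = float(raw)  # float() raises ValueError for non-numeric strings
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"confidence threshold must be between 0 and 1, got {raw!r}")
    return value


print(parse_threshold("0.95"))  # 0.95
```
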
| 156 | + |
| 157 | +### Regional Customization |
| 158 | + |
| 159 | +For different geographic regions with varying document formats: |
| 160 | + |
| 161 | +1. Create region-specific configurations: |
| 162 | + ```bash |
| 163 | + cp -r default-lending default-lending-ca # Canadian lending documents |
| 164 | + cp -r default-lending default-lending-uk # UK lending documents |
| 165 | + ``` |
| 166 | + |
| 167 | +2. Modify document classes and attributes for regional requirements. |
| 168 | + |
| 169 | +## Performance Considerations |
| 170 | + |
| 171 | +The default-lending configuration is optimized for: |
| 172 | + |
| 173 | +- **High Accuracy**: Temperature 0 and elevated confidence thresholds for reliable financial data extraction |
| 174 | +- **Comprehensive Coverage**: Detailed attribute definitions covering all critical lending information |
| 175 | +- **Compliance**: Structured outputs suitable for regulatory documentation and audit trails |
| 176 | +- **Scalability**: Granular assessment with parallel processing for high-volume lending workflows |
| 177 | + |
| 178 | +### Financial Data Specific Optimizations |
| 179 | + |
| 180 | +- **OCR Enhancement**: Uses SIGNATURES feature to detect signed documents |
| 181 | +- **Table Processing**: TABLES feature for structured financial data in statements |
| 182 | +- **Layout Preservation**: LAYOUT feature maintains document structure for complex forms |
| 183 | +- **Extended Token Limits**: 10,000 tokens for comprehensive financial document processing |
| 184 | + |
| 185 | +## Security and Compliance Considerations |
| 186 | + |
| 187 | +When processing lending documents: |
| 188 | + |
| 189 | +- **Data Privacy**: Ensure compliance with financial privacy regulations (GLBA, CCPA, etc.) |
| 190 | +- **Encryption**: Use encrypted storage and transmission for all financial documents |
| 191 | +- **Access Controls**: Implement proper authentication and authorization |
| 192 | +- **Audit Logging**: Maintain comprehensive logs for regulatory compliance |
| 193 | +- **Data Retention**: Follow applicable data retention policies for financial records |
| 194 | + |
| 195 | +## Integration Guidelines |
| 196 | + |
| 197 | +### Loan Origination Systems (LOS) |
| 198 | + |
| 199 | +This configuration can be integrated with LOS platforms to support: |
| 200 | +- Automated document classification upon upload |
| 201 | +- Real-time data extraction for loan application prefill |
| 202 | +- Exception handling for documents requiring manual review |
| 203 | + |
| 204 | +### Credit Decisioning |
| 205 | + |
| 206 | +Extracted data can feed directly into credit decisioning engines: |
| 207 | +- Income verification from payslips and W2s |
| 208 | +- Asset verification from bank statements |
| 209 | +- Identity verification from driver's licenses |
| 210 | + |
| 211 | +## Contributors |
| 212 | + |
| 213 | +- GenAI IDP Accelerator Team |
| 214 | +- Lending Solutions Architecture Team |