
Commit 95dd13d

author
Bob Strahan
committed
fix: preserve original image resolution when empty dimensions are specified
1 parent c4a16e2 commit 95dd13d

File tree

5 files changed

+206
-44
lines changed


CHANGELOG.md

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -28,6 +28,7 @@ SPDX-License-Identifier: MIT-0
2828
- Updated all lambda functions and notebooks to use new simplified pattern
2929

3030
### Fixed
31+
- Fixed issue where empty image dimension settings incorrectly resized images to the default 951x1268 pixels instead of preserving the original resolution
3132
- Fixed issue where PNG files were being unnecessarily converted to JPEG format and resized to lower resolution with lost quality
3233
- Fixed issue where PNG and JPG image files were not rendering inline in the Document Details page
3334
- Fixed issue where PDF files were being downloaded instead of displayed inline

docs/assessment.md

Lines changed: 41 additions & 14 deletions
@@ -536,49 +536,76 @@ StateTaxes[0]:
536536
537537
The assessment service supports configurable image dimensions for optimal confidence evaluation:
538538
539-
### Default Configuration
539+
### New Default Behavior (Preserves Original Resolution)
540+
541+
**Important Change**: Empty strings or unspecified image dimensions now preserve the original document resolution for maximum assessment accuracy:
540542
541543
```yaml
542544
assessment:
543545
model: "anthropic.claude-3-5-sonnet-20241022-v2:0"
544-
# Image processing settings
546+
# Image processing settings - preserves original resolution
545547
image:
546-
target_width: 951 # Default width in pixels
547-
target_height: 1268 # Default height in pixels
548+
target_width: "" # Empty string = no resizing (recommended)
549+
target_height: "" # Empty string = no resizing (recommended)
548550
```
549551

550552
### Custom Image Dimensions
551553

552-
Configure image dimensions based on assessment requirements:
554+
Configure specific dimensions when performance optimization is needed:
553555

554556
```yaml
555-
# For detailed visual assessment
557+
# For detailed visual assessment with controlled dimensions
556558
assessment:
557559
image:
558-
target_width: 1200
559-
target_height: 1600
560+
target_width: "1200" # Resize to 1200 pixels wide
561+
target_height: "1600" # Resize to 1600 pixels tall
560562

561563
# For standard confidence evaluation
562564
assessment:
563565
image:
564-
target_width: 800
565-
target_height: 1000
566+
target_width: "800" # Smaller for faster processing
567+
target_height: "1000" # Maintains good quality
566568
```
567569
568570
### Image Resizing Features for Assessment
569571
570-
- **Aspect Ratio Preservation**: Images maintain proportions for accurate visual analysis
572+
- **Original Resolution Preservation**: Empty strings preserve full document resolution for maximum assessment accuracy
573+
- **Aspect Ratio Preservation**: Images maintain proportions for accurate visual analysis when dimensions are specified
571574
- **Smart Scaling**: Only downsizes when necessary to preserve visual detail
572575
- **High-Quality Resampling**: Better image quality for confidence assessment
573-
- **Performance Optimization**: Optimized images reduce assessment processing time
576+
- **Performance Optimization**: Configurable dimensions allow balancing accuracy vs. speed
574577
575578
### Configuration Benefits for Assessment
576579
577-
- **Enhanced Visual Analysis**: Appropriate resolution improves confidence evaluation accuracy
580+
- **Maximum Assessment Accuracy**: Empty strings preserve full document resolution for best confidence evaluation
581+
- **Enhanced Visual Analysis**: Original resolution improves confidence evaluation accuracy
578582
- **Better OCR Verification**: Higher quality images help verify extraction results against visual content
579583
- **Improved Confidence Scoring**: Better image quality leads to more accurate confidence assessments
580584
- **Service-Specific Tuning**: Optimize image dimensions for different assessment complexity levels
581-
- **Resource Optimization**: Balance assessment quality and processing costs
585+
- **Resource Optimization**: Choose between accuracy (original resolution) and performance (smaller dimensions)
586+
587+
### Migration from Previous Versions
588+
589+
**Previous Behavior**: Empty strings defaulted to 951x1268 pixel resizing
590+
**New Behavior**: Empty strings preserve original image resolution
591+
592+
If you were relying on the previous default resizing behavior, explicitly set dimensions:
593+
594+
```yaml
595+
# To maintain previous default behavior
596+
assessment:
597+
image:
598+
target_width: "951"
599+
target_height: "1268"
600+
```
601+
602+
### Best Practices for Assessment
603+
604+
1. **Use Empty Strings for High Accuracy**: For critical confidence assessment, use empty strings to preserve original resolution
605+
2. **Consider Assessment Complexity**: Complex documents with fine details benefit from higher resolution
606+
3. **Test Assessment Quality**: Evaluate confidence assessment accuracy with your specific document types
607+
4. **Monitor Resource Usage**: Higher resolution images consume more memory and processing time
608+
5. **Balance Accuracy vs Performance**: Choose appropriate settings based on your assessment requirements and processing volume
582609
583610
## Granular Assessment
584611

docs/classification.md

Lines changed: 41 additions & 15 deletions
@@ -341,49 +341,75 @@ For comprehensive details on configuring few-shot examples, including multimodal
341341

342342
The classification service supports configurable image dimensions for optimal performance and quality:
343343

344-
### Default Configuration
344+
### New Default Behavior (Preserves Original Resolution)
345+
346+
**Important Change**: Empty strings or unspecified image dimensions now preserve the original document resolution for maximum classification accuracy:
345347

346348
```yaml
347349
classification:
348350
model: us.amazon.nova-pro-v1:0
349-
# Image processing settings
351+
# Image processing settings - preserves original resolution
350352
image:
351-
target_width: 951 # Default width in pixels
352-
target_height: 1268 # Default height in pixels
353+
target_width: "" # Empty string = no resizing (recommended)
354+
target_height: "" # Empty string = no resizing (recommended)
353355
```
354356

355357
### Custom Image Dimensions
356358

357-
Configure image dimensions based on your specific requirements:
359+
Configure specific dimensions when performance optimization is needed:
358360

359361
```yaml
360-
# For high-accuracy classification
362+
# For high-accuracy classification with controlled dimensions
361363
classification:
362364
image:
363-
target_width: 1200
364-
target_height: 1600
365+
target_width: "1200" # Resize to 1200 pixels wide
366+
target_height: "1600" # Resize to 1600 pixels tall
365367
366368
# For fast processing with lower resolution
367369
classification:
368370
image:
369-
target_width: 600
370-
target_height: 800
371+
target_width: "600" # Smaller for faster processing
372+
target_height: "800" # Maintains reasonable quality
371373
```
372374

373375
### Image Resizing Features
374376

375-
- **Aspect Ratio Preservation**: Images are resized proportionally without distortion
377+
- **Original Resolution Preservation**: Empty strings preserve full document resolution for maximum accuracy
378+
- **Aspect Ratio Preservation**: Images are resized proportionally without distortion when dimensions are specified
376379
- **Smart Scaling**: Only downsizes images when necessary (scale factor < 1.0)
377380
- **High-Quality Resampling**: Better visual quality after resizing
378-
- **Performance Optimization**: Smaller, optimized images process faster with lower memory usage
381+
- **Performance Optimization**: Configurable dimensions allow balancing accuracy vs. speed
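The resizing rules listed above (no resize when dimensions are empty, one scale factor for both axes, downsizing only when the scale factor is below 1.0) can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the function name `compute_resize` and the use of `None` to represent an empty dimension are assumptions made for this example.

```python
from typing import Optional, Tuple


def compute_resize(
    orig_w: int,
    orig_h: int,
    target_w: Optional[int],
    target_h: Optional[int],
) -> Tuple[int, int]:
    """Return the output size, or the original size when no resize applies.

    Hypothetical helper: None stands for an empty or unspecified dimension.
    """
    # No target configured: preserve the original resolution.
    if target_w is None or target_h is None:
        return orig_w, orig_h

    # A single scale factor for both axes preserves the aspect ratio.
    scale = min(target_w / orig_w, target_h / orig_h)

    # Only downsize: a factor of 1.0 or more would mean upscaling.
    if scale >= 1.0:
        return orig_w, orig_h

    return max(1, round(orig_w * scale)), max(1, round(orig_h * scale))
```

With a 2550x3300 source and a 1200x1600 target, the width is the limiting axis, so the result fits inside the target box rather than matching both target values exactly. A source already smaller than the target is returned unchanged.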
379382

380383
### Configuration Benefits
381384

385+
- **Maximum Classification Accuracy**: Empty strings preserve full document resolution for best results
382386
- **Service-Specific Tuning**: Each service can use optimal image dimensions
383387
- **Runtime Configuration**: No code changes needed to adjust image processing
384-
- **Backward Compatibility**: Default values maintain existing behavior
385-
- **Memory Optimization**: Configurable dimensions allow memory optimization
386-
- **Better Resource Utilization**: Service-specific sizing reduces unnecessary processing
388+
- **Backward Compatibility**: Existing numeric values continue to work as before
389+
- **Memory Optimization**: Configurable dimensions allow resource optimization
390+
- **Better Resource Utilization**: Choose between accuracy (original resolution) and performance (smaller dimensions)
391+
392+
### Migration from Previous Versions
393+
394+
**Previous Behavior**: Empty strings defaulted to 951x1268 pixel resizing
395+
**New Behavior**: Empty strings preserve original image resolution
396+
397+
If you were relying on the previous default resizing behavior, explicitly set dimensions:
398+
399+
```yaml
400+
# To maintain previous default behavior
401+
classification:
402+
image:
403+
target_width: "951"
404+
target_height: "1268"
405+
```
406+
407+
### Best Practices for Classification
408+
409+
1. **Use Empty Strings for High Accuracy**: For critical document classification, use empty strings to preserve original resolution
410+
2. **Consider Document Types**: Complex layouts benefit from higher resolution, simple text documents may work well with smaller dimensions
411+
3. **Test Performance Impact**: Higher resolution images provide better accuracy but consume more resources
412+
4. **Monitor Processing Time**: Balance classification accuracy with processing speed based on your requirements
387413

388414
## JSON and YAML Output Support
389415

docs/configuration.md

Lines changed: 81 additions & 0 deletions
@@ -241,6 +241,87 @@ The solution includes built-in cost tracking capabilities:
241241

242242
For detailed cost analysis and optimization strategies, see [cost-calculator.md](cost-calculator.md).
243243

244+
## Image Processing Configuration
245+
246+
The solution supports configurable image dimensions across all processing services (OCR, classification, extraction, and assessment) to optimize performance and accuracy for different document types.
247+
248+
### New Default Behavior (Preserves Original Resolution)
249+
250+
**Important Change**: As of the latest version, empty strings or unspecified image dimensions now preserve the original document resolution instead of resizing to default dimensions.
251+
252+
```yaml
253+
# Preserves original image resolution (recommended for high-accuracy processing)
254+
classification:
255+
image:
256+
target_width: "" # Empty string = no resizing
257+
target_height: "" # Empty string = no resizing
258+
259+
extraction:
260+
image:
261+
target_width: "" # Preserves original resolution
262+
target_height: "" # Preserves original resolution
263+
264+
assessment:
265+
image:
266+
target_width: "" # No resizing applied
267+
target_height: "" # No resizing applied
268+
```
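One way to read the configuration values shown above is to normalize each dimension to either an integer or "no value". The sketch below assumes that empty strings, whitespace-only strings, and missing keys all mean "do not resize", and that numeric values may arrive either as strings or as integers. The names `parse_dimension` and `resolve_target_size` are hypothetical, and treating a half-specified pair as "no resize" is an assumption of this example rather than documented behavior.

```python
from typing import Any, Mapping, Optional, Tuple


def parse_dimension(value: Any) -> Optional[int]:
    """Convert a configured dimension to an int, or None when it is empty."""
    if value is None:
        return None
    if isinstance(value, str):
        value = value.strip()
        if not value:
            # Empty string: preserve the original resolution.
            return None
    number = int(value)  # Raises ValueError for non-numeric strings.
    if number <= 0:
        raise ValueError(f"image dimension must be positive, got {number}")
    return number


def resolve_target_size(image_config: Mapping[str, Any]) -> Optional[Tuple[int, int]]:
    """Return (width, height) to resize to, or None to keep the original size."""
    width = parse_dimension(image_config.get("target_width"))
    height = parse_dimension(image_config.get("target_height"))
    # Assumption for this sketch: both dimensions are required to trigger a resize.
    if width is None or height is None:
        return None
    return width, height
```

Returning `None` instead of a fallback size keeps the "no resize" decision explicit, which is the behavior change this section describes: an empty value no longer falls back to 951x1268.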
269+
270+
### Custom Image Dimensions
271+
272+
You can still specify exact dimensions when needed for performance optimization:
273+
274+
```yaml
275+
# Custom dimensions for specific requirements
276+
classification:
277+
image:
278+
target_width: "1200" # Resize to 1200 pixels wide
279+
target_height: "1600" # Resize to 1600 pixels tall
280+
281+
# Performance-optimized dimensions
282+
extraction:
283+
image:
284+
target_width: "800" # Smaller for faster processing
285+
target_height: "1000" # Maintains good quality
286+
```
287+
288+
### Image Resizing Features
289+
290+
- **Aspect Ratio Preservation**: Images are resized proportionally without distortion
291+
- **Smart Scaling**: Only downsizes images when necessary (scale factor < 1.0)
292+
- **High-Quality Resampling**: Better visual quality after resizing
293+
- **Original Format Preservation**: Maintains PNG, JPEG, and other formats when possible
294+
295+
### Configuration Benefits
296+
297+
- **High-Resolution Processing**: Empty strings preserve full document resolution for maximum OCR accuracy
298+
- **Service-Specific Tuning**: Each service can use optimal image dimensions
299+
- **Runtime Configuration**: No code changes needed to adjust image processing
300+
- **Backward Compatibility**: Existing numeric values continue to work as before
301+
- **Memory Optimization**: Configurable dimensions allow resource optimization
302+
303+
### Best Practices
304+
305+
1. **Use Empty Strings for High Accuracy**: For critical documents requiring maximum OCR accuracy, use empty strings to preserve original resolution
306+
2. **Specify Dimensions for Performance**: For high-volume processing, consider smaller dimensions to improve speed
307+
3. **Test Different Settings**: Evaluate the trade-off between accuracy and performance for your specific document types
308+
4. **Monitor Resource Usage**: Higher resolution images consume more memory and processing time
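To get a rough sense of the memory trade-off mentioned in the last item, the uncompressed size of a decoded image can be estimated as width x height x channels bytes. The sketch below compares the previous 951x1268 default with a US Letter page scanned at 300 DPI (8.5 in x 11 in, so 2550x3300 pixels). The page size is an illustrative assumption, and the figure covers only the decoded pixel buffer, not file size, model input cost, or processing time.

```python
def decoded_bytes(width: int, height: int, channels: int = 3) -> int:
    """Uncompressed size in bytes of an 8-bit image (3 channels = RGB)."""
    return width * height * channels


previous_default = decoded_bytes(951, 1268)    # former default dimensions
letter_at_300dpi = decoded_bytes(2550, 3300)   # assumed example scan size

ratio = letter_at_300dpi / previous_default
print(f"previous default: {previous_default / 1_000_000:.1f} MB")
print(f"letter at 300 DPI: {letter_at_300dpi / 1_000_000:.1f} MB")
print(f"ratio: {ratio:.1f}x")
```

Under these assumptions the full-resolution page needs roughly seven times the pixel memory of the old default, which is the kind of difference worth measuring on real documents before choosing a setting.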
309+
310+
### Migration from Previous Versions
311+
312+
**Previous Behavior**: Empty strings defaulted to 951x1268 pixel resizing
313+
**New Behavior**: Empty strings preserve original image resolution
314+
315+
If you were relying on the previous default resizing behavior, explicitly set dimensions:
316+
317+
```yaml
318+
# To maintain previous default behavior
319+
classification:
320+
image:
321+
target_width: "951"
322+
target_height: "1268"
323+
```
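Before upgrading, it may help to find which services in an existing configuration would change behavior, meaning those whose dimensions are empty or unspecified. The sketch below works on a configuration already loaded into a dictionary. The function name `services_affected` and the list of service keys are assumptions based on the examples in this section; OCR is omitted because its configuration key is not shown here.

```python
from typing import Any, Dict, List

# Service keys taken from the YAML examples in this section (assumed complete).
SERVICES = ("classification", "extraction", "assessment")


def services_affected(config: Dict[str, Any]) -> List[str]:
    """List services whose image dimensions are empty or unspecified."""
    affected = []
    for service in SERVICES:
        image = (config.get(service) or {}).get("image") or {}
        for key in ("target_width", "target_height"):
            value = image.get(key)
            if value is None or str(value).strip() == "":
                # Previously resized to 951x1268; now keeps original resolution.
                affected.append(service)
                break
    return affected


example_config = {
    "classification": {"image": {"target_width": "", "target_height": ""}},
    "extraction": {"image": {"target_width": "951", "target_height": "1268"}},
}
print(services_affected(example_config))
```

In this example `classification` is reported because its dimensions are empty, and `assessment` is reported because it has no image settings at all. `extraction` is not, since it pins the previous default explicitly.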
324+
244325
## Additional Configuration Resources
245326
246327
The solution provides additional configuration options through:

docs/extraction.md

Lines changed: 42 additions & 15 deletions
@@ -403,50 +403,77 @@ Examples are class-specific - only examples from the same document class being p
403403

404404
The extraction service supports configurable image dimensions for optimal performance and quality:
405405

406-
### Default Configuration
406+
### New Default Behavior (Preserves Original Resolution)
407+
408+
**Important Change**: Empty strings or unspecified image dimensions now preserve the original document resolution for maximum extraction accuracy:
407409

408410
```yaml
409411
extraction:
410412
model: us.amazon.nova-pro-v1:0
411-
# Image processing settings
413+
# Image processing settings - preserves original resolution
412414
image:
413-
target_width: 951 # Default width in pixels
414-
target_height: 1268 # Default height in pixels
415+
target_width: "" # Empty string = no resizing (recommended)
416+
target_height: "" # Empty string = no resizing (recommended)
415417
```
416418

417419
### Custom Image Dimensions
418420

419-
Configure image dimensions based on your extraction requirements:
421+
Configure specific dimensions when performance optimization is needed:
420422

421423
```yaml
422-
# For high-accuracy extraction with detailed visual analysis
424+
# For high-accuracy extraction with controlled dimensions
423425
extraction:
424426
image:
425-
target_width: 1200
426-
target_height: 1600
427+
target_width: "1200" # Resize to 1200 pixels wide
428+
target_height: "1600" # Resize to 1600 pixels tall
427429
428430
# For fast processing with standard resolution
429431
extraction:
430432
image:
431-
target_width: 800
432-
target_height: 1000
433+
target_width: "800" # Smaller for faster processing
434+
target_height: "1000" # Maintains good quality
433435
```
434436

435437
### Image Resizing Features
436438

437-
- **Aspect Ratio Preservation**: Images are resized proportionally without distortion
439+
- **Original Resolution Preservation**: Empty strings preserve full document resolution for maximum extraction accuracy
440+
- **Aspect Ratio Preservation**: Images are resized proportionally without distortion when dimensions are specified
438441
- **Smart Scaling**: Only downsizes images when necessary (scale factor < 1.0)
439442
- **High-Quality Resampling**: Better visual quality after resizing for improved field detection
440-
- **Performance Optimization**: Optimized images reduce processing time and memory usage
443+
- **Performance Optimization**: Configurable dimensions allow balancing accuracy vs. speed
441444

442445
### Configuration Benefits for Extraction
443446

444-
- **Enhanced Field Detection**: Appropriate image resolution improves accuracy for table and form extraction
445-
- **Visual Element Processing**: Better handling of signatures, stamps, checkboxes, and visual indicators
447+
- **Maximum Extraction Accuracy**: Empty strings preserve full document resolution for best field detection
448+
- **Enhanced Field Detection**: Original resolution improves accuracy for table and form extraction
449+
- **Visual Element Processing**: Better handling of signatures, stamps, checkboxes, and visual indicators at full resolution
446450
- **OCR Error Correction**: Higher quality images help verify and correct text extraction results
447451
- **Service-Specific Tuning**: Optimize image dimensions for different document types and extraction complexity
448452
- **Runtime Configuration**: Adjust image processing without code changes
449-
- **Resource Optimization**: Balance quality and performance based on extraction requirements
453+
- **Resource Optimization**: Choose between accuracy (original resolution) and performance (smaller dimensions)
454+
455+
### Migration from Previous Versions
456+
457+
**Previous Behavior**: Empty strings defaulted to 951x1268 pixel resizing
458+
**New Behavior**: Empty strings preserve original image resolution
459+
460+
If you were relying on the previous default resizing behavior, explicitly set dimensions:
461+
462+
```yaml
463+
# To maintain previous default behavior
464+
extraction:
465+
image:
466+
target_width: "951"
467+
target_height: "1268"
468+
```
469+
470+
### Best Practices for Extraction
471+
472+
1. **Use Empty Strings for High Accuracy**: For critical data extraction, use empty strings to preserve original resolution
473+
2. **Consider Document Complexity**: Forms and tables benefit significantly from higher resolution
474+
3. **Test with Representative Documents**: Evaluate extraction accuracy with your specific document types
475+
4. **Monitor Resource Usage**: Higher resolution images consume more memory and processing time
476+
5. **Balance Accuracy vs Performance**: Choose appropriate settings based on your accuracy requirements and processing volume
450477

451478
## JSON and YAML Output Support
452479
