Commit c5b0a5b

Author: Bob Strahan
fix: preserve original image resolution and avoid unnecessary resizing
1 parent de686d0 commit c5b0a5b

4 files changed: +355 -141 lines changed

OCR_CODE_IMPROVEMENTS_SUMMARY.md

Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
# OCR Service Code Improvements Summary

## Current State: Code is Clean and Functional

The OCR service code in `lib/idp_common_pkg/idp_common/ocr/service.py` is clean and works correctly after the fix. The main improvements are:

### 1. **Clear Decision Flow**
```python
# If we have the original file content, use it directly to avoid PyMuPDF processing
if original_file_content:
    ...  # Use original content path
else:
    ...  # Fallback to PyMuPDF processing
```

### 2. **Explicit Resize Logic**
The code now clearly checks if resizing is needed:
- Empty resize config → No resize
- Image already fits → No resize
- Image exceeds bounds → Apply resize

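The three cases above can be sketched as a single pure function. This is a minimal illustration, not the code in `service.py`: the name `compute_resize_target`, its parameters, and the use of `round` for the scaled dimensions are all assumptions made for the example.

```python
from typing import Optional, Tuple


def compute_resize_target(
    width: int,
    height: int,
    target_width: Optional[int],
    target_height: Optional[int],
) -> Optional[Tuple[int, int]]:
    """Return the new (width, height), or None when no resize is needed.

    Hypothetical helper: the real service may structure this differently.
    """
    # Empty resize config: keep the original image.
    if not target_width or not target_height:
        return None
    # Image already fits within the bounds: keep the original image.
    if width <= target_width and height <= target_height:
        return None
    # Image exceeds the bounds: scale down, preserving the aspect ratio.
    scale = min(target_width / width, target_height / height)
    return max(1, round(width * scale)), max(1, round(height * scale))
```

Returning `None` for the two "no resize" cases lets the caller pass the original bytes through untouched instead of re-encoding an image at the same size.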
### 3. **Better Logging**
Clear, informative logging at each decision point helps with debugging and understanding the flow.

## Potential Future Refactoring

While the code is functional, the `_process_image_file_direct` method could be refactored for better maintainability:

### 1. **Extract Helper Methods**
- `_extract_image_from_original_content()` - Handle original content extraction
- `_check_if_resize_needed()` - Centralize resize decision logic
- `_apply_resize_if_needed()` - Handle resize and format changes
- `_get_content_type_for_extension()` - Map file extensions to content types

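As an illustration of how small one of these helpers could be, here is a sketch of the extension-to-content-type mapping. The set of extensions, the fallback value, and the standalone-function form are assumptions for the example; the proposed method does not exist yet, so its real signature is unknown.

```python
import os

# Hypothetical mapping; extend it to whatever formats the service accepts.
_CONTENT_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
    ".bmp": "image/bmp",
    ".tif": "image/tiff",
    ".tiff": "image/tiff",
}


def get_content_type_for_extension(
    filename: str, default: str = "application/octet-stream"
) -> str:
    """Map a file name to a MIME type based on its extension (case-insensitive)."""
    _, ext = os.path.splitext(filename)
    return _CONTENT_TYPES.get(ext.lower(), default)
```

Keeping the mapping in one table means a new format is a one-line change rather than another branch inside `_process_image_file_direct`.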
### 2. **Define Constants**
Replace magic numbers with named constants:
```python
ZOOM_FACTOR_HIGH_RES = 4.159  # For ~1900x2500 images
ZOOM_FACTOR_VERY_SMALL = 4.0  # For very small images
SMALL_IMAGE_THRESHOLD = 1000
```

### 3. **Reduce Code Duplication**
The resize logic appears in multiple places and could be consolidated.

## Benefits of Current Implementation

1. **Performance**: Avoids unnecessary image processing
2. **Quality**: Preserves original image quality when possible
3. **Correctness**: Properly handles all resize scenarios
4. **Maintainability**: Clear logic flow makes it easy to understand

## Test Coverage

The implementation includes comprehensive tests that verify:
- Empty resize config preserves dimensions
- Valid resize config resizes correctly
- Images that already fit are not resized

All tests are passing, confirming the fix works as intended.

## Conclusion

The code is now clean, functional, and maintainable. While there's room for further refactoring to reduce the method length and eliminate some duplication, the current implementation correctly solves the original problem and is production-ready.

OCR_IMAGE_RESIZE_FIX_SUMMARY.md

Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
# OCR Image Resize Fix Summary

## Problem
The OCR service was incorrectly downsizing high-resolution images (PNG/JPG) when processing them, even when the resize configuration had empty values or when the image already fit within the specified dimensions.

## Root Cause
1. PyMuPDF was loading images at a lower resolution by default (converting pixels to points at 72 DPI)
2. The code was trying to compensate with zoom factors, but this was causing unintended resizing
3. The original file content wasn't being preserved when no resizing was needed

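The arithmetic behind the first two points can be made concrete. The sketch below assumes the image declares 300 DPI, which the summary does not state, and the function names are hypothetical; it only shows how a pixel-to-point conversion shrinks the page and how large a zoom factor would have to be to undo it.

```python
POINTS_PER_INCH = 72  # 1 point = 1/72 inch


def pixels_to_points(pixels: int, dpi: float) -> float:
    """Size in points of `pixels` pixels at the given resolution."""
    return pixels * POINTS_PER_INCH / dpi


def zoom_to_restore_pixels(dpi: float) -> float:
    """Zoom factor needed to render a point-sized page back at its pixel size."""
    return dpi / POINTS_PER_INCH


# Assumed example: a 1913x2475 image tagged as 300 DPI.
width_pt = pixels_to_points(1913, dpi=300)
height_pt = pixels_to_points(2475, dpi=300)
zoom = zoom_to_restore_pixels(300)
```

Under that assumption the page is only about 459 x 594 points, so rendering it at one pixel per point loses most of the resolution, and any compensating zoom factor has to match the image's actual DPI exactly to land back on the original dimensions.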
## Solution
Modified the OCR service to:
1. Pass the original file content directly when processing image files (not PDFs)
2. Use the original image data without PyMuPDF processing when:
   - No resize config is provided
   - Resize config has empty values
   - Image already fits within the specified dimensions
3. Only apply resizing when actually needed (image exceeds target dimensions)

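A rough sketch of that pass-through behavior follows. It is not the code from `service.py`: the function names, the `target_width` / `target_height` config keys, and the injected `resize` callable (standing in for whatever image library does the scaling) are assumptions made for illustration.

```python
from typing import Callable, Mapping, Optional, Tuple


def _parse_dimension(value: object) -> Optional[int]:
    """Treat None, empty strings, and non-positive values as 'not configured'."""
    if value is None:
        return None
    text = str(value).strip()
    if not text:
        return None
    number = int(text)
    return number if number > 0 else None


def select_image_bytes(
    original: bytes,
    size: Tuple[int, int],
    resize_config: Optional[Mapping[str, object]],
    resize: Callable[[bytes, int, int], bytes],
) -> bytes:
    """Return the original bytes unless the image exceeds the configured bounds."""
    config = resize_config or {}
    max_width = _parse_dimension(config.get("target_width"))
    max_height = _parse_dimension(config.get("target_height"))
    # No resize config, or config with empty values: use the original data.
    if max_width is None or max_height is None:
        return original
    width, height = size
    # Image already fits: use the original data.
    if width <= max_width and height <= max_height:
        return original
    # Only now is the image actually resized.
    return resize(original, max_width, max_height)
```

Because the original bytes are returned unchanged on every "no resize" path, the image format and quality are preserved without a decode and re-encode round trip.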
## Changes Made

### 1. Updated `_process_single_page` method
- Added `original_file_content` parameter
- Pass original content for image files to avoid PyMuPDF processing

### 2. Updated `process_document` method
- Pass original file content when processing image files

### 3. Updated `_process_image_file_direct` method
- Added logic to use original file content directly when available
- Check if resizing is actually needed before applying it
- Preserve original image format and quality when no resize is needed

### 4. Removed problematic zoom factor logic
- Eliminated the complex zoom factor calculations that were causing issues
- Simplified the fallback logic for when original content isn't available

## Test Results

### Test 1: Empty resize config
- **Input**: 1913x2475 PNG image with empty resize config
- **Expected**: 1913x2475 (no resize)
- **Result**: ✓ PASS - Image dimensions preserved correctly

### Test 2: Valid resize config
- **Input**: 1913x2475 PNG image with target 951x1268
- **Expected**: 951x1230 (maintaining aspect ratio)
- **Result**: ✓ PASS - Image resized correctly to fit target bounds

### Test 3: Image already fits
- **Input**: 800x1000 PNG image with target 951x1268
- **Expected**: 800x1000 (no resize needed)
- **Result**: ✓ PASS - Image not resized since it already fits

## Benefits
1. **Performance**: Avoids unnecessary image processing when resize isn't needed
2. **Quality**: Preserves original image quality and format
3. **Efficiency**: Reduces processing time and resource usage
4. **Correctness**: Properly handles all resize configuration scenarios
