Commit 172f57b

Update OCR README with memory-optimized image extraction documentation

Author: Bob Strahan
1 parent f7a4815 commit 172f57b

1 file changed: +23 -4 lines changed

lib/idp_common_pkg/idp_common/ocr/README.md

Lines changed: 23 additions & 4 deletions
@@ -122,19 +122,38 @@ ocr:
 task_prompt: "Extract all text from this image..."
 ```
 
+### Memory-Optimized Image Extraction
+
+The OCR service uses advanced memory optimization to prevent OutOfMemory errors when processing large high-resolution documents:
+
+**Direct Size Extraction**: When resize configuration is provided (`target_width` and `target_height`), images are extracted directly at the target dimensions using PyMuPDF matrix transformations. This completely eliminates memory spikes from creating oversized images.
+
+**Example for Large Document:**
+- **Original approach**: Extract 7469×9623 (101MB) → Resize to 951×1268 (5MB) → Memory spike
+- **Optimized approach**: Extract directly at 951×1268 (5MB) → No memory spike
+
+**Preserved Logic**: The optimization maintains all existing resize behavior:
+- ✅ Never upscales images (only applies scaling when scale_factor < 1.0)
+- ✅ Preserves aspect ratio using `min(width_ratio, height_ratio)`
+- ✅ Handles edge cases (no config, images already smaller than targets)
+- ✅ Full backward compatibility
+
 ### DPI Configuration
 
-The DPI (dots per inch) setting controls the resolution when extracting images from PDF pages:
+The DPI (dots per inch) setting controls the base resolution when extracting images from PDF pages:
 - **Default**: 150 DPI (good balance of quality and file size)
-- **Range**: 72-300 DPI
+- **Range**: 72-300 DPI
 - **Location**: `ocr.image.dpi` in the configuration
 - **Behavior**:
   - Only applies to PDF files (image files maintain their original resolution)
-  - Higher DPI = better quality but larger file sizes
+  - Combined with resize configuration for optimal memory usage
+  - Higher DPI = better quality but larger file sizes (use with resize config for large documents)
   - 150 DPI is recommended for most OCR use cases
-  - 300 DPI for documents with small text or fine details
+  - 300 DPI for documents with small text or fine details (ensure resize config is set)
   - 100 DPI for simple documents to reduce processing time
 
+**Memory Considerations**: For large documents with high DPI settings, always configure `target_width` and `target_height` to prevent memory issues. The service will intelligently extract at the optimal size.
+
 
 ## Migration Guide
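
Note on the documented behavior: the README text in this commit describes direct-size extraction but does not show how the scale is derived. The sketch below is one possible reading of it, not the code in `idp_common`. The helper names `compute_zoom` and `render_page_png` are hypothetical, and the assumption that the targets are compared against the page size at the configured DPI is mine. It uses only `fitz.open`, `fitz.Matrix`, `Page.get_pixmap` and `Pixmap.tobytes` from PyMuPDF.

```python
from typing import Optional

POINTS_PER_INCH = 72.0  # PDF page dimensions are expressed in points


def compute_zoom(
    page_width_pt: float,
    page_height_pt: float,
    dpi: int = 150,
    target_width: Optional[int] = None,
    target_height: Optional[int] = None,
) -> float:
    """Return the zoom factor to render a page at (hypothetical helper).

    Assumption: the resize targets are compared with the pixel size the
    page would have at the configured DPI.
    """
    base_zoom = dpi / POINTS_PER_INCH
    # No resize config: render at the plain DPI resolution.
    if not target_width or not target_height:
        return base_zoom

    base_width_px = page_width_pt * base_zoom
    base_height_px = page_height_pt * base_zoom

    # Preserve aspect ratio by taking the smaller of the two ratios.
    scale_factor = min(target_width / base_width_px,
                       target_height / base_height_px)

    # Never upscale: only apply the factor when it shrinks the image.
    if scale_factor < 1.0:
        return base_zoom * scale_factor
    return base_zoom


def render_page_png(
    pdf_path: str,
    page_index: int,
    dpi: int = 150,
    target_width: Optional[int] = None,
    target_height: Optional[int] = None,
) -> bytes:
    """Render one page directly at the reduced size (hypothetical helper)."""
    import fitz  # PyMuPDF, imported lazily so compute_zoom works without it

    with fitz.open(pdf_path) as doc:
        page = doc[page_index]
        zoom = compute_zoom(page.rect.width, page.rect.height,
                            dpi, target_width, target_height)
        # The matrix scales the page at render time, so the full-size
        # pixmap is never allocated.
        pix = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom))
        return pix.tobytes("png")
```

Under these assumptions, a US Letter page (612×792 pt) at 300 DPI would normally render at 2550×3300 pixels. With `target_width=951` and `target_height=1268` the zoom drops to roughly 1.55, and the page is rendered at about 951×1231 pixels in a single step.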
