CHANGELOG.md: 10 additions & 0 deletions
@@ -14,6 +14,11 @@ SPDX-License-Identifier: MIT-0
 - Fixed view toggle behavior - switching between views no longer closes the viewer window
 - Reordered view buttons to: Markdown View, Text Confidence View, Text View for better user experience
 
+- **Enhanced OCR DPI Configuration for PDF files**
+- DPI for PDF image conversion is now configurable in the configuration editor under OCR image processing settings
+- Default DPI improved from 96 to 150 DPI for better default quality and OCR accuracy
+- Configurable through Web UI without requiring code changes or redeployment
+
 ### Changed
 - **Converted text confidence data format from JSON to markdown table for improved readability and reduced token usage**
 - Removed unnecessary "page_count" field
@@ -26,8 +31,13 @@ SPDX-License-Identifier: MIT-0
 - Aligned with classification service pattern for better consistency across IDP services
 - Backward compatibility maintained - old parameter pattern still supported with deprecation warning
 - Updated all lambda functions and notebooks to use new simplified pattern
+- Removed fixed image target_height and target_width from default configurations, so images are processed in original resolution by default.
+
 
 ### Fixed
+- **Fixed Image Resizing Behavior for High-Resolution Documents**
+- Fixed issue where empty strings in image configuration were incorrectly resizing images to default 951x1268 pixels instead of preserving original resolution
+- Empty strings (`""`) in `target_width` and `target_height` configuration now preserve original document resolution for maximum processing accuracy
 - Fixed issue where PNG files were being unnecessarily converted to JPEG format and resized to lower resolution with lost quality
 - Fixed issue where PNG and JPG image files were not rendering inline in the Document Details page
 - Fixed issue where PDF files were being downloaded instead of displayed inline
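The resizing behavior described in the "Fixed" entries can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper name `resolve_target_size` is hypothetical, and treating a missing key or a single unset dimension the same way as an empty string is an assumption.

```python
from typing import Mapping, Optional, Tuple


def resolve_target_size(image_config: Mapping[str, object]) -> Optional[Tuple[int, int]]:
    """Return (width, height) to resize to, or None to keep the original resolution.

    Hypothetical helper. Empty strings mean "do not resize", as the changelog
    entry states. Handling missing keys and half-set sizes this way is an assumption.
    """
    raw_width = str(image_config.get("target_width", "") or "").strip()
    raw_height = str(image_config.get("target_height", "") or "").strip()
    if not raw_width or not raw_height:
        return None  # preserve the original document resolution
    return int(raw_width), int(raw_height)


# New defaults: both values are empty strings, so no resizing is requested.
print(resolve_target_size({"target_width": "", "target_height": ""}))  # None
# Previous defaults: explicit dimensions are still honored when configured.
print(resolve_target_size({"target_width": "951", "target_height": "1268"}))  # (951, 1268)
```

Returning `None` rather than a fallback size keeps the "no resize" case explicit, so a caller cannot silently fall back to 951x1268.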
config_library/pattern-3/default/config.yaml: 7 additions & 6 deletions
@@ -12,8 +12,9 @@ ocr:
     - name: TABLES
     - name: SIGNATURES
   image:
-    target_width: '951'
-    target_height: '1268'
+    dpi: '150'
+    target_width: ''
+    target_height: ''
 classes:
   - name: letter
     description: A formal written correspondence with sender/recipient addresses, date, salutation, body, and closing signature
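To make the effect of the `dpi: '150'` setting concrete, the sketch below computes the pixel size of a PDF page rendered at a given DPI. PDF page dimensions are expressed in points (72 per inch), so pixels = points × dpi / 72. The function name `rendered_size` and the US Letter example page are illustrative assumptions, not part of this change.

```python
from typing import Tuple


def rendered_size(width_pt: float, height_pt: float, dpi: int) -> Tuple[int, int]:
    """Pixel size of a PDF page rendered at the given DPI (1 point = 1/72 inch)."""
    return round(width_pt * dpi / 72), round(height_pt * dpi / 72)


# The config stores the value as a string, so it is converted before use.
dpi = int("150")

# US Letter is 612 x 792 points (8.5 x 11 inches).
print(rendered_size(612, 792, 96))   # (816, 1056) at the previous default
print(rendered_size(612, 792, dpi))  # (1275, 1650) at the new default
```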
@@ -310,8 +311,8 @@ classification:
   model: Custom fine tuned UDOP model
 extraction:
   image:
-    target_width: '951'
-    target_height: '1268'
+    target_width: ''
+    target_height: ''
   top_p: '0.1'
   max_tokens: '10000'
   top_k: '5'
@@ -468,8 +469,8 @@ summarization:
You are a document summarization expert who can analyze and summarize documents from various domains including medical, financial, legal, and general business documents. Your task is to create a summary that captures the key information, main points, and important details from the document. Your output must be in valid JSON format. \nSummarization Style: Balanced\\nCreate a balanced summary that provides a moderate level of detail. Include the main points and key supporting information, while maintaining the document's overall structure. Aim for a comprehensive yet concise summary.\n Your output MUST be in valid JSON format with markdown content. You MUST strictly adhere to the output format specified in the instructions.