config_library/pattern-2/lending-package-sample/config.yaml
Lines changed: 69 additions & 43 deletions
@@ -1150,134 +1150,160 @@ assessment:
  max_workers: "20"
  simple_batch_size: "3"
  list_batch_size: "1"
+ bounding_boxes:
+   enabled: false
  default_confidence_threshold: '0.8'
  top_p: '0.1'
  max_tokens: '10000'
  top_k: '5'
  temperature: '0.0'
  model: us.amazon.nova-lite-v1:0
  system_prompt: >-
- You are a document analysis assessment expert. Your task is to evaluate the confidence of extraction results by analyzing the source document evidence. Respond only with JSON containing confidence scores for each extracted attribute.
+ You are a document analysis assessment expert. Your role is to evaluate the confidence and accuracy of data extraction results by analyzing them against source documents.
+
+ Provide accurate confidence scores and clear reasoning for each assessment.
+ When bounding boxes are requested, provide precise coordinate locations where information appears in the document.
  task_prompt: >-
  <background>
-
- You are an expert document analysis assessment system. Your task is to evaluate the confidence of extraction results for a document of class {DOCUMENT_CLASS}.
-
+ You are an expert document analysis assessment system. Your task is to evaluate the confidence of extraction results for a document of class {DOCUMENT_CLASS} and provide precise spatial localization for each field.
  </background>
 
-
  <task>
-
- Analyze the extraction results against the source document and provide confidence assessments for each extracted attribute. Consider factors such as:
-
- 1. Text clarity and OCR quality in the source regions
- 2. Alignment between extracted values and document content
- 3. Presence of clear evidence supporting the extraction
- 4. Potential ambiguity or uncertainty in the source material
+ Analyze the extraction results against the source document and provide confidence assessments AND bounding box coordinates for each extracted attribute. Consider factors such as:
+ 1. Text clarity and OCR quality in the source regions
+ 2. Alignment between extracted values and document content
+ 3. Presence of clear evidence supporting the extraction
+ 4. Potential ambiguity or uncertainty in the source material
  5. Completeness and accuracy of the extracted information
-
+ 6. Precise spatial location of each field in the document
  </task>
 
-
  <assessment-guidelines>
-
- For each attribute, provide:
- A confidence score between 0.0 and 1.0 where:
+ For each attribute, provide:
+ - A confidence score between 0.0 and 1.0 where:
  - 1.0 = Very high confidence, clear and unambiguous evidence
  - 0.8-0.9 = High confidence, strong evidence with minor uncertainty
  - 0.6-0.7 = Medium confidence, reasonable evidence but some ambiguity
  - 0.4-0.5 = Low confidence, weak or unclear evidence
  - 0.0-0.3 = Very low confidence, little to no supporting evidence
-
- Guidelines:
- - Base assessments on actual document content and OCR quality
- - Consider both text-based evidence and visual/layout clues
- - Account for OCR confidence scores when provided
- - Be objective and specific in reasoning
+ - A clear explanation of the confidence reasoning
+ - Precise spatial coordinates where the field appears in the document
+
+ Guidelines:
+ - Base assessments on actual document content and OCR quality
+ - Consider both text-based evidence and visual/layout clues
+ - Account for OCR confidence scores when provided
+ - Be objective and specific in reasoning
  - If an extraction appears incorrect, score accordingly with explanation
-
+ - Provide tight, accurate bounding boxes around the actual text
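The confidence scale in the assessment guidelines above can be read as a simple lookup. The following is a minimal sketch, not part of the commit: `confidence_band` and `needs_review` are hypothetical helper names, the rubric's unlisted values (for example 0.95 or 0.75) are assumed to fall into the band whose lower bound they meet, and `default_confidence_threshold` is assumed to mean "flag anything scoring below this value". The config stores the threshold as a quoted string (`'0.8'`), so the sketch converts it before comparing.

```python
def confidence_band(score: float) -> str:
    """Map a 0.0-1.0 confidence score to the rubric's verbal band.

    Assumption: values between the listed ranges belong to the band
    whose lower bound they meet (0.95 -> "high", 0.75 -> "medium").
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence must be within [0.0, 1.0], got {score}")
    if score >= 1.0:
        return "very high"
    if score >= 0.8:
        return "high"
    if score >= 0.6:
        return "medium"
    if score >= 0.4:
        return "low"
    return "very low"


def needs_review(score: float, threshold: str = "0.8") -> bool:
    """Return True when a score falls below the configured threshold.

    The threshold arrives as a string because the YAML quotes it.
    Assumption: scores below the threshold are the ones to flag.
    """
    return score < float(threshold)


if __name__ == "__main__":
    for s in (1.0, 0.85, 0.65, 0.45, 0.2):
        print(s, confidence_band(s), "review" if needs_review(s) else "ok")
```

With the sample value from the prompt's JSON template, `confidence_band(0.85)` gives `"high"` and `needs_review(0.85)` is `False` under the `'0.8'` default.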
+ - page: Page number where the field appears (starting from 1)
+
+ Coordinate system:
+ - Use normalized scale 0-1000 for both x and y axes
+ - x1, y1 = top-left corner of bounding box
+ - x2, y2 = bottom-right corner of bounding box
+ - Ensure x2 > x1 and y2 > y1
+ - Make bounding boxes tight around the actual text content
+ - If a field spans multiple lines, create a bounding box that encompasses all relevant text
+ </spatial-localization-guidelines>
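The coordinate rules above (normalized 0-1000 scale, top-left and bottom-right corners, `x2 > x1` and `y2 > y1`, pages starting from 1) lend themselves to a small validation and conversion step on the consumer side. The sketch below is illustrative only and is not taken from the commit: `validate_bbox` and `to_pixels` are hypothetical names, and the page image dimensions are assumed inputs that the caller has to supply.

```python
from typing import Sequence, Tuple

SCALE = 1000  # normalized coordinate range stated in the prompt


def validate_bbox(bbox: Sequence[float], page: int) -> None:
    """Raise ValueError if a box breaks the prompt's coordinate rules."""
    if len(bbox) != 4:
        raise ValueError("bbox must be [x1, y1, x2, y2]")
    x1, y1, x2, y2 = bbox
    if any(not 0 <= v <= SCALE for v in (x1, y1, x2, y2)):
        raise ValueError(f"coordinates must lie within 0-{SCALE}")
    if not (x2 > x1 and y2 > y1):
        raise ValueError("expected x2 > x1 and y2 > y1")
    if page < 1:
        raise ValueError("page numbers start from 1")


def to_pixels(
    bbox: Sequence[float], width: int, height: int
) -> Tuple[int, int, int, int]:
    """Convert a normalized box to pixel coordinates.

    width and height are the page image size in pixels
    (assumed to be known to the caller).
    """
    x1, y1, x2, y2 = bbox
    return (
        round(x1 * width / SCALE),
        round(y1 * height / SCALE),
        round(x2 * width / SCALE),
        round(y2 * height / SCALE),
    )


if __name__ == "__main__":
    box = [100, 200, 500, 250]
    validate_bbox(box, page=1)
    print(to_pixels(box, width=1700, height=2200))
```

Validating before converting keeps malformed model output (swapped corners, out-of-range values, page 0) from silently producing an empty or off-page rectangle.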
 
- Analyze the extraction results against the source document and provide confidence assessments. Return a JSON object with the following structure based on the attribute type:
+ <final-instructions>
+ Analyze the extraction results against the source document and provide confidence assessments with spatial localization. Return a JSON object with the following structure based on the attribute type:
 
- For SIMPLE attributes:
+ For SIMPLE attributes:
  {
  "simple_attribute_name": {
  "confidence": 0.85,
+ "confidence_reason": "Clear text with high OCR confidence, easily identifiable location",