Commit fefb392

Author: Bob Strahan
Update assessment prompt to add bbox for all examples (confidence_reason removed to save tokens)
1 parent 94b4599 commit fefb392

5 files changed: +261 -203 lines changed

config_library/pattern-2/bank-statement-sample/config.yaml

Lines changed: 66 additions & 49 deletions
@@ -368,6 +368,7 @@ summarization:
  model: us.anthropic.claude-3-7-sonnet-20250219-v1:0
  system_prompt: >-
  You are a document summarization expert who can analyze and summarize documents from various domains including medical, financial, legal, and general business documents. Your task is to create a summary that captures the key information, main points, and important details from the document. Your output must be in valid JSON format. \nSummarization Style: Balanced\\nCreate a balanced summary that provides a moderate level of detail. Include the main points and key supporting information, while maintaining the document's overall structure. Aim for a comprehensive yet concise summary.\n Your output MUST be in valid JSON format with markdown content. You MUST strictly adhere to the output format specified in the instructions.
+
  assessment:
  enabled: true
  image:
@@ -383,130 +384,146 @@ assessment:
  max_tokens: '10000'
  top_k: '5'
  temperature: '0.0'
- model: us.amazon.nova-pro-v1:0
+ model: us.amazon.nova-lite-v1:0
  system_prompt: >-
- You are a document analysis assessment expert. Your task is to evaluate the confidence of extraction results by analyzing the source document evidence. Respond only with JSON containing confidence scores for each extracted attribute.
+ You are a document analysis assessment expert. Your role is to evaluate the confidence and accuracy of data extraction results by analyzing them against source documents.
+
+ Provide accurate confidence scores for each assessment.
+ When bounding boxes are requested, provide precise coordinate locations where information appears in the document.
  task_prompt: >-
  <background>
-
- You are an expert document analysis assessment system. Your task is to evaluate the confidence of extraction results for a document of class {DOCUMENT_CLASS}.
-
+ You are an expert document analysis assessment system. Your task is to evaluate the confidence of extraction results for a document of class {DOCUMENT_CLASS} and provide precise spatial localization for each field.
  </background>

-
  <task>
-
- Analyze the extraction results against the source document and provide confidence assessments for each extracted attribute. Consider factors such as:
-
- 1. Text clarity and OCR quality in the source regions
- 2. Alignment between extracted values and document content
- 3. Presence of clear evidence supporting the extraction
- 4. Potential ambiguity or uncertainty in the source material
+ Analyze the extraction results against the source document and provide confidence assessments AND bounding box coordinates for each extracted attribute. Consider factors such as:
+ 1. Text clarity and OCR quality in the source regions
+ 2. Alignment between extracted values and document content
+ 3. Presence of clear evidence supporting the extraction
+ 4. Potential ambiguity or uncertainty in the source material
  5. Completeness and accuracy of the extracted information
-
+ 6. Precise spatial location of each field in the document
  </task>

-
  <assessment-guidelines>
-
- For each attribute, provide:
- A confidence score between 0.0 and 1.0 where:
+ For each attribute, provide:
+ - A confidence score between 0.0 and 1.0 where:
  - 1.0 = Very high confidence, clear and unambiguous evidence
  - 0.8-0.9 = High confidence, strong evidence with minor uncertainty
  - 0.6-0.7 = Medium confidence, reasonable evidence but some ambiguity
  - 0.4-0.5 = Low confidence, weak or unclear evidence
  - 0.0-0.3 = Very low confidence, little to no supporting evidence
-
- Guidelines:
- - Base assessments on actual document content and OCR quality
- - Consider both text-based evidence and visual/layout clues
- - Account for OCR confidence scores when provided
- - Be objective and specific in reasoning
+ - A clear explanation of the confidence reasoning
+ - Precise spatial coordinates where the field appears in the document
+
+ Guidelines:
+ - Base assessments on actual document content and OCR quality
+ - Consider both text-based evidence and visual/layout clues
+ - Account for OCR confidence scores when provided
+ - Be objective and specific in reasoning
  - If an extraction appears incorrect, score accordingly with explanation
-
+ - Provide tight, accurate bounding boxes around the actual text
  </assessment-guidelines>

- <final-instructions>
+ <spatial-localization-guidelines>
+ For each field, provide bounding box coordinates:
+ - bbox: [x1, y1, x2, y2] coordinates in normalized 0-1000 scale
+ - page: Page number where the field appears (starting from 1)
+
+ Coordinate system:
+ - Use normalized scale 0-1000 for both x and y axes
+ - x1, y1 = top-left corner of bounding box
+ - x2, y2 = bottom-right corner of bounding box
+ - Ensure x2 > x1 and y2 > y1
+ - Make bounding boxes tight around the actual text content
+ - If a field spans multiple lines, create a bounding box that encompasses all relevant text
+ </spatial-localization-guidelines>

- Analyze the extraction results against the source document and provide confidence assessments. Return a JSON object with the following structure based on the attribute type:
+ <final-instructions>
+ Analyze the extraction results against the source document and provide confidence assessments with spatial localization. Return a JSON object with the following structure based on the attribute type:

- For SIMPLE attributes:
+ For SIMPLE attributes:
  {
  "simple_attribute_name": {
  "confidence": 0.85,
+ "bbox": [100, 200, 300, 250],
+ "page": 1
  }
  }

- For GROUP attributes (nested object structure):
+ For GROUP attributes (nested object structure):
  {
  "group_attribute_name": {
  "sub_attribute_1": {
  "confidence": 0.90,
+ "bbox": [150, 300, 250, 320],
+ "page": 1
  },
  "sub_attribute_2": {
  "confidence": 0.75,
+ "bbox": [150, 325, 280, 345],
+ "page": 1
  }
  }
  }

- For LIST attributes (array of assessed items):
+ For LIST attributes (array of assessed items):
  {
  "list_attribute_name": [
  {
  "item_attribute_1": {
  "confidence": 0.95,
+ "bbox": [100, 400, 200, 420],
+ "page": 1
  },
  "item_attribute_2": {
  "confidence": 0.88,
+ "bbox": [250, 400, 350, 420],
+ "page": 1
  }
  },
  {
  "item_attribute_1": {
  "confidence": 0.92,
+ "bbox": [100, 425, 200, 445],
+ "page": 1
  },
  "item_attribute_2": {
  "confidence": 0.70,
+ "bbox": [250, 425, 350, 445],
+ "page": 1
  }
  }
  ]
  }

- IMPORTANT:
- - For LIST attributes like "Transactions", assess EACH individual item in the list separately
- - Each transaction should be assessed as a separate object in the array
- - Do NOT provide aggregate assessments for list items - assess each one individually
- - Include assessments for ALL attributes present in the extraction results
+ IMPORTANT:
+ - For LIST attributes like "Transactions", assess EACH individual item in the list separately with individual bounding boxes
+ - Each transaction should be assessed as a separate object in the array with its own spatial coordinates
+ - Do NOT provide aggregate assessments for list items - assess each one individually with precise locations
+ - Include assessments AND bounding boxes for ALL attributes present in the extraction results
  - Match the exact structure of the extracted data
-
+ - Provide page numbers for all bounding boxes (starting from 1)
  </final-instructions>

- <attributes-definitions>
-
- {ATTRIBUTE_NAMES_AND_DESCRIPTIONS}
-
- </attributes-definitions>
-
  <<CACHEPOINT>>

  <document-image>
-
  {DOCUMENT_IMAGE}
-
  </document-image>

-
  <ocr-text-confidence-results>
-
  {OCR_TEXT_CONFIDENCE}
-
  </ocr-text-confidence-results>

  <<CACHEPOINT>>

- <extraction-results>
+ <attributes-definitions>
+ {ATTRIBUTE_NAMES_AND_DESCRIPTIONS}
+ </attributes-definitions>

+ <extraction-results>
  {EXTRACTION_RESULTS}
-
  </extraction-results>

  evaluation:
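
Note on the bbox convention in the prompt above: it asks for `[x1, y1, x2, y2]` on a normalized 0-1000 scale, with `x2 > x1` and `y2 > y1`. Code that consumes the assessment output would presumably need to check those constraints and map the boxes back to page pixels. The following is a minimal sketch under that assumption. The names `validate_bbox` and `bbox_to_pixels` and the page dimensions are hypothetical, chosen for illustration, and are not part of this commit.

```python
from typing import Sequence, Tuple

SCALE = 1000  # normalized coordinate range stated in the prompt


def validate_bbox(bbox: Sequence[float]) -> None:
    """Raise ValueError if bbox breaks the constraints stated in the prompt."""
    if len(bbox) != 4:
        raise ValueError("bbox must have exactly four values: [x1, y1, x2, y2]")
    x1, y1, x2, y2 = bbox
    if not all(0 <= v <= SCALE for v in bbox):
        raise ValueError("bbox values must lie within 0-1000")
    if not (x2 > x1 and y2 > y1):
        raise ValueError("bbox must satisfy x2 > x1 and y2 > y1")


def bbox_to_pixels(
    bbox: Sequence[float], page_width: int, page_height: int
) -> Tuple[int, int, int, int]:
    """Scale a normalized bbox to pixel coordinates for a page of the given size."""
    validate_bbox(bbox)
    x1, y1, x2, y2 = bbox
    return (
        round(x1 * page_width / SCALE),
        round(y1 * page_height / SCALE),
        round(x2 * page_width / SCALE),
        round(y2 * page_height / SCALE),
    )
```

For example, the SIMPLE attribute box `[100, 200, 300, 250]` on a hypothetical 2000 x 3000 pixel page image maps to `(200, 600, 600, 750)`.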

config_library/pattern-2/lending-package-sample/config.yaml

Lines changed: 1 addition & 8 deletions
@@ -1159,7 +1159,7 @@ assessment:
  system_prompt: >-
  You are a document analysis assessment expert. Your role is to evaluate the confidence and accuracy of data extraction results by analyzing them against source documents.

- Provide accurate confidence scores and clear reasoning for each assessment.
+ Provide accurate confidence scores for each assessment.
  When bounding boxes are requested, provide precise coordinate locations where information appears in the document.
  task_prompt: >-
  <background>
@@ -1217,7 +1217,6 @@ assessment:
  {
  "simple_attribute_name": {
  "confidence": 0.85,
- "confidence_reason": "Clear text with high OCR confidence, easily identifiable location",
  "bbox": [100, 200, 300, 250],
  "page": 1
  }
@@ -1228,13 +1227,11 @@ assessment:
  "group_attribute_name": {
  "sub_attribute_1": {
  "confidence": 0.90,
- "confidence_reason": "Very clear text, unambiguous location",
  "bbox": [150, 300, 250, 320],
  "page": 1
  },
  "sub_attribute_2": {
  "confidence": 0.75,
- "confidence_reason": "Good quality but slight formatting ambiguity",
  "bbox": [150, 325, 280, 345],
  "page": 1
  }
@@ -1247,27 +1244,23 @@ assessment:
  {
  "item_attribute_1": {
  "confidence": 0.95,
- "confidence_reason": "Excellent clarity and precise alignment",
  "bbox": [100, 400, 200, 420],
  "page": 1
  },
  "item_attribute_2": {
  "confidence": 0.88,
- "confidence_reason": "Good quality with minor OCR uncertainty",
  "bbox": [250, 400, 350, 420],
  "page": 1
  }
  },
  {
  "item_attribute_1": {
  "confidence": 0.92,
- "confidence_reason": "Clear text, well-positioned",
  "bbox": [100, 425, 200, 445],
  "page": 1
  },
  "item_attribute_2": {
  "confidence": 0.70,
- "confidence_reason": "Readable but some formatting irregularities",
  "bbox": [250, 425, 350, 445],
  "page": 1
  }
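
Note on the response shape after this change: each assessed leaf in the example JSON carries `confidence`, `bbox`, and `page`, nested as a plain object (SIMPLE), an object of objects (GROUP), or an array of objects (LIST). A downstream check that walks all three shapes might look like the sketch below. It assumes a leaf is any object with a `confidence` key. `find_problems` is a hypothetical helper written for illustration, not code from this repository.

```python
from typing import Any, List


def find_problems(node: Any, path: str = "") -> List[str]:
    """Recursively collect problems in an assessment result.

    Lists are treated as LIST attributes, dicts without a "confidence" key
    as GROUP containers, and dicts with one as assessed leaves.
    """
    problems: List[str] = []
    if isinstance(node, list):
        for i, item in enumerate(node):
            problems += find_problems(item, f"{path}[{i}]")
    elif isinstance(node, dict):
        if "confidence" in node:
            conf = node["confidence"]
            if not isinstance(conf, (int, float)) or not 0.0 <= conf <= 1.0:
                problems.append(f"{path}: confidence must be between 0.0 and 1.0")
            bbox = node.get("bbox")
            if not (isinstance(bbox, list) and len(bbox) == 4):
                problems.append(f"{path}: missing or malformed bbox")
            page = node.get("page")
            if not (isinstance(page, int) and page >= 1):
                problems.append(f"{path}: page must be an integer >= 1")
        else:
            for key, child in node.items():
                child_path = f"{path}.{key}" if path else key
                problems += find_problems(child, child_path)
    return problems
```

An empty list means every leaf has all three fields in range. Each entry otherwise names the offending attribute by its path, such as `Transactions[1].Amount`.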
