Commit e439791

author
Bob Strahan
committed
Add granular assessment notebook with bounding box extraction
1 parent 4a692d7 commit e439791

9 files changed

+1328
-110
lines changed

config_library/pattern-2/lending-package-sample/config.yaml

Lines changed: 69 additions & 43 deletions
Original file line number | Diff line number | Diff line change
@@ -1150,134 +1150,160 @@ assessment:
11501150
max_workers: "20"
11511151
simple_batch_size: "3"
11521152
list_batch_size: "1"
1153+
bounding_boxes:
1154+
enabled: false
11531155
default_confidence_threshold: '0.8'
11541156
top_p: '0.1'
11551157
max_tokens: '10000'
11561158
top_k: '5'
11571159
temperature: '0.0'
11581160
model: us.amazon.nova-lite-v1:0
11591161
system_prompt: >-
1160-
You are a document analysis assessment expert. Your task is to evaluate the confidence of extraction results by analyzing the source document evidence. Respond only with JSON containing confidence scores for each extracted attribute.
1162+
You are a document analysis assessment expert. Your role is to evaluate the confidence and accuracy of data extraction results by analyzing them against source documents.
1163+
1164+
Provide accurate confidence scores and clear reasoning for each assessment.
1165+
When bounding boxes are requested, provide precise coordinate locations where information appears in the document.
11611166
task_prompt: >-
11621167
<background>
1163-
1164-
You are an expert document analysis assessment system. Your task is to evaluate the confidence of extraction results for a document of class {DOCUMENT_CLASS}.
1165-
1168+
You are an expert document analysis assessment system. Your task is to evaluate the confidence of extraction results for a document of class {DOCUMENT_CLASS} and provide precise spatial localization for each field.
11661169
</background>
11671170
1168-
11691171
<task>
1170-
1171-
Analyze the extraction results against the source document and provide confidence assessments for each extracted attribute. Consider factors such as:
1172-
1173-
1. Text clarity and OCR quality in the source regions
1174-
2. Alignment between extracted values and document content
1175-
3. Presence of clear evidence supporting the extraction
1176-
4. Potential ambiguity or uncertainty in the source material
1172+
Analyze the extraction results against the source document and provide confidence assessments AND bounding box coordinates for each extracted attribute. Consider factors such as:
1173+
1. Text clarity and OCR quality in the source regions
1174+
2. Alignment between extracted values and document content
1175+
3. Presence of clear evidence supporting the extraction
1176+
4. Potential ambiguity or uncertainty in the source material
11771177
5. Completeness and accuracy of the extracted information
1178-
1178+
6. Precise spatial location of each field in the document
11791179
</task>
11801180
1181-
11821181
<assessment-guidelines>
1183-
1184-
For each attribute, provide:
1185-
A confidence score between 0.0 and 1.0 where:
1182+
For each attribute, provide:
1183+
- A confidence score between 0.0 and 1.0 where:
11861184
- 1.0 = Very high confidence, clear and unambiguous evidence
11871185
- 0.8-0.9 = High confidence, strong evidence with minor uncertainty
11881186
- 0.6-0.7 = Medium confidence, reasonable evidence but some ambiguity
11891187
- 0.4-0.5 = Low confidence, weak or unclear evidence
11901188
- 0.0-0.3 = Very low confidence, little to no supporting evidence
1191-
1192-
Guidelines:
1193-
- Base assessments on actual document content and OCR quality
1194-
- Consider both text-based evidence and visual/layout clues
1195-
- Account for OCR confidence scores when provided
1196-
- Be objective and specific in reasoning
1189+
- A clear explanation of the confidence reasoning
1190+
- Precise spatial coordinates where the field appears in the document
1191+
1192+
Guidelines:
1193+
- Base assessments on actual document content and OCR quality
1194+
- Consider both text-based evidence and visual/layout clues
1195+
- Account for OCR confidence scores when provided
1196+
- Be objective and specific in reasoning
11971197
- If an extraction appears incorrect, score accordingly with explanation
1198-
1198+
- Provide tight, accurate bounding boxes around the actual text
11991199
</assessment-guidelines>
12001200
1201-
<final-instructions>
1201+
<spatial-localization-guidelines>
1202+
For each field, provide bounding box coordinates:
1203+
- bbox: [x1, y1, x2, y2] coordinates in normalized 0-1000 scale
1204+
- page: Page number where the field appears (starting from 1)
1205+
1206+
Coordinate system:
1207+
- Use normalized scale 0-1000 for both x and y axes
1208+
- x1, y1 = top-left corner of bounding box
1209+
- x2, y2 = bottom-right corner of bounding box
1210+
- Ensure x2 > x1 and y2 > y1
1211+
- Make bounding boxes tight around the actual text content
1212+
- If a field spans multiple lines, create a bounding box that encompasses all relevant text
1213+
</spatial-localization-guidelines>
12021214
1203-
Analyze the extraction results against the source document and provide confidence assessments. Return a JSON object with the following structure based on the attribute type:
1215+
<final-instructions>
1216+
Analyze the extraction results against the source document and provide confidence assessments with spatial localization. Return a JSON object with the following structure based on the attribute type:
12041217
1205-
For SIMPLE attributes:
1218+
For SIMPLE attributes:
12061219
{
12071220
"simple_attribute_name": {
12081221
"confidence": 0.85,
1222+
"confidence_reason": "Clear text with high OCR confidence, easily identifiable location",
1223+
"bbox": [100, 200, 300, 250],
1224+
"page": 1
12091225
}
12101226
}
12111227
1212-
For GROUP attributes (nested object structure):
1228+
For GROUP attributes (nested object structure):
12131229
{
12141230
"group_attribute_name": {
12151231
"sub_attribute_1": {
12161232
"confidence": 0.90,
1233+
"confidence_reason": "Very clear text, unambiguous location",
1234+
"bbox": [150, 300, 250, 320],
1235+
"page": 1
12171236
},
12181237
"sub_attribute_2": {
12191238
"confidence": 0.75,
1239+
"confidence_reason": "Good quality but slight formatting ambiguity",
1240+
"bbox": [150, 325, 280, 345],
1241+
"page": 1
12201242
}
12211243
}
12221244
}
12231245
1224-
For LIST attributes (array of assessed items):
1246+
For LIST attributes (array of assessed items):
12251247
{
12261248
"list_attribute_name": [
12271249
{
12281250
"item_attribute_1": {
12291251
"confidence": 0.95,
1252+
"confidence_reason": "Excellent clarity and precise alignment",
1253+
"bbox": [100, 400, 200, 420],
1254+
"page": 1
12301255
},
12311256
"item_attribute_2": {
12321257
"confidence": 0.88,
1258+
"confidence_reason": "Good quality with minor OCR uncertainty",
1259+
"bbox": [250, 400, 350, 420],
1260+
"page": 1
12331261
}
12341262
},
12351263
{
12361264
"item_attribute_1": {
12371265
"confidence": 0.92,
1266+
"confidence_reason": "Clear text, well-positioned",
1267+
"bbox": [100, 425, 200, 445],
1268+
"page": 1
12381269
},
12391270
"item_attribute_2": {
12401271
"confidence": 0.70,
1272+
"confidence_reason": "Readable but some formatting irregularities",
1273+
"bbox": [250, 425, 350, 445],
1274+
"page": 1
12411275
}
12421276
}
12431277
]
12441278
}
12451279
1246-
IMPORTANT:
1247-
- For LIST attributes like "Transactions", assess EACH individual item in the list separately
1248-
- Each transaction should be assessed as a separate object in the array
1249-
- Do NOT provide aggregate assessments for list items - assess each one individually
1250-
- Include assessments for ALL attributes present in the extraction results
1280+
IMPORTANT:
1281+
- For LIST attributes like "Transactions", assess EACH individual item in the list separately with individual bounding boxes
1282+
- Each transaction should be assessed as a separate object in the array with its own spatial coordinates
1283+
- Do NOT provide aggregate assessments for list items - assess each one individually with precise locations
1284+
- Include assessments AND bounding boxes for ALL attributes present in the extraction results
12511285
- Match the exact structure of the extracted data
1252-
1286+
- Provide page numbers for all bounding boxes (starting from 1)
12531287
</final-instructions>
12541288
12551289
<<CACHEPOINT>>
12561290
12571291
<document-image>
1258-
12591292
{DOCUMENT_IMAGE}
1260-
12611293
</document-image>
12621294
12631295
<ocr-text-confidence-results>
1264-
12651296
{OCR_TEXT_CONFIDENCE}
1266-
12671297
</ocr-text-confidence-results>
12681298
12691299
<<CACHEPOINT>>
12701300
12711301
<attributes-definitions>
1272-
12731302
{ATTRIBUTE_NAMES_AND_DESCRIPTIONS}
1274-
12751303
</attributes-definitions>
12761304
12771305
<extraction-results>
1278-
12791306
{EXTRACTION_RESULTS}
1280-
12811307
</extraction-results>
12821308
evaluation:
12831309
llm_method:
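
The `<spatial-localization-guidelines>` block in the prompt above asks the model for `bbox: [x1, y1, x2, y2]` on a 0-1000 scale with `x2 > x1`, `y2 > y1`, and a 1-based `page`. A consumer may want to check those constraints before trusting a response. The following is a minimal sketch of such a check; the function name `is_valid_assessment_location` and the decision to reject out-of-range values (rather than clamp them) are assumptions for illustration, not code from this commit.

```python
from typing import Any, Dict

SCALE = 1000  # normalized coordinate scale named in the prompt


def is_valid_assessment_location(entry: Dict[str, Any]) -> bool:
    """Return True if `entry` carries a usable `bbox` and `page`.

    Hypothetical helper: it mirrors the rules stated in the prompt
    (four numbers in 0..1000, x2 > x1, y2 > y1, page starting from 1).
    """
    bbox = entry.get("bbox")
    page = entry.get("page")

    # bbox must be a sequence of exactly four numeric values
    if not isinstance(bbox, (list, tuple)) or len(bbox) != 4:
        return False
    if not all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in bbox
    ):
        return False

    x1, y1, x2, y2 = bbox
    # every coordinate must lie on the normalized 0-1000 scale
    if not all(0 <= v <= SCALE for v in bbox):
        return False
    # top-left corner must be strictly above and left of bottom-right
    if not (x2 > x1 and y2 > y1):
        return False

    # page numbers start from 1
    return isinstance(page, int) and not isinstance(page, bool) and page >= 1
```

For example, the SIMPLE attribute sample from the prompt, `{"bbox": [100, 200, 300, 250], "page": 1}`, passes this check, while a box with swapped corners or `"page": 0` does not.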

docs/assessment-bounding-boxes.md

Lines changed: 88 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -23,8 +23,9 @@ The Assessment Service now supports **optional bounding box extraction** as part
2323

2424
### Output Format
2525

26-
When bounding boxes are enabled, the assessment output includes `geometry` arrays:
26+
When bounding boxes are enabled, the assessment output includes `geometry` arrays for all attribute types:
2727

28+
**Simple Attributes:**
2829
```json
2930
{
3031
"account_number": {
@@ -34,10 +35,10 @@ When bounding boxes are enabled, the assessment output includes `geometry` array
3435
"geometry": [
3536
{
3637
"boundingBox": {
37-
"top": 0.3751128193254686,
38-
"left": 0.4474376978868207,
39-
"width": 0.05959462312246394,
40-
"height": 0.010745484576798636
38+
"top": 0.375,
39+
"left": 0.447,
40+
"width": 0.059,
41+
"height": 0.010
4142
},
4243
"page": 1
4344
}
@@ -46,6 +47,88 @@ When bounding boxes are enabled, the assessment output includes `geometry` array
4647
}
4748
```
4849

50+
**Group Attributes (Nested):**
51+
```json
52+
{
53+
"CompanyAddress": {
54+
"State": {
55+
"confidence": 0.99,
56+
"confidence_reason": "Clear text with high OCR confidence",
57+
"confidence_threshold": 0.9,
58+
"geometry": [
59+
{
60+
"boundingBox": {
61+
"top": 0.116,
62+
"left": 0.23,
63+
"width": 0.029,
64+
"height": 0.01
65+
},
66+
"page": 1
67+
}
68+
]
69+
},
70+
"ZipCode": {
71+
"confidence": 0.99,
72+
"confidence_reason": "Clear text with high OCR confidence",
73+
"confidence_threshold": 0.9,
74+
"geometry": [
75+
{
76+
"boundingBox": {
77+
"top": 0.116,
78+
"left": 0.261,
79+
"width": 0.037,
80+
"height": 0.01
81+
},
82+
"page": 1
83+
}
84+
]
85+
}
86+
}
87+
}
88+
```
89+
90+
**List Attributes:**
91+
```json
92+
{
93+
"Transactions": [
94+
{
95+
"Date": {
96+
"confidence": 0.95,
97+
"confidence_reason": "Clear date format",
98+
"confidence_threshold": 0.9,
99+
"geometry": [
100+
{
101+
"boundingBox": {
102+
"top": 0.2,
103+
"left": 0.1,
104+
"width": 0.05,
105+
"height": 0.02
106+
},
107+
"page": 1
108+
}
109+
]
110+
},
111+
"Amount": {
112+
"confidence": 0.88,
113+
"confidence_reason": "Good number format",
114+
"confidence_threshold": 0.9,
115+
"geometry": [
116+
{
117+
"boundingBox": {
118+
"top": 0.2,
119+
"left": 0.2,
120+
"width": 0.05,
121+
"height": 0.02
122+
},
123+
"page": 1
124+
}
125+
]
126+
}
127+
}
128+
]
129+
}
130+
```
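
The prompt in `config.yaml` requests `bbox: [x1, y1, x2, y2]` on a 0-1000 scale, while the documented output uses `boundingBox` entries with `top`, `left`, `width`, and `height` as fractions of the page. The diff shown here does not include the conversion code, so the following is only a sketch of one plausible mapping between the two; the function name `bbox_to_geometry` and the `scale` parameter are assumptions.

```python
from typing import Any, Dict, List, Union

Number = Union[int, float]


def bbox_to_geometry(
    bbox: List[Number], page: int, scale: float = 1000.0
) -> Dict[str, Any]:
    """Map a corner-style bbox to one `geometry` entry.

    Hypothetical helper: assumes [x1, y1, x2, y2] with (x1, y1) the
    top-left corner and (x2, y2) the bottom-right corner, as the prompt
    describes, and divides by `scale` to get page fractions.
    """
    x1, y1, x2, y2 = bbox
    return {
        "boundingBox": {
            "top": y1 / scale,
            "left": x1 / scale,
            "width": (x2 - x1) / scale,
            "height": (y2 - y1) / scale,
        },
        "page": page,
    }


# The documented output wraps entries in a list under "geometry".
assessment = {
    "confidence": 0.85,
    "geometry": [bbox_to_geometry([100, 200, 300, 250], page=1)],
}
```

Under these assumptions, the sample box `[100, 200, 300, 250]` maps to `top` 0.2, `left` 0.1, `width` 0.2, and `height` 0.05.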
131+
49132
## Configuration
50133

51134
### Basic Configuration
