@@ -104,36 +104,37 @@ The format varies by OCR backend:
 **Textract Backend (with confidence data):**
 ```json
 {
-  "page_count": 1,
-  "text_blocks": [
-    {
-      "text": "WESTERN DARK FIRED TOBACCO GROWERS' ASSOCIATION",
-      "confidence": 99.35,
-      "type": "PRINTED"
-    },
-    {
-      "text": "206 Maple Street",
-      "confidence": 91.41,
-      "type": "PRINTED"
-    }
-  ]
+  "text": "| Text | Confidence |\n|------|------------|\n| WESTERN DARK FIRED TOBACCO GROWERS' ASSOCIATION | 99.4 |\n| 206 Maple Street | 91.4 |\n| Murray, KY 42071 | 98.7 |"
+}
+```
+
+The `text` field contains a markdown table with two columns:
+- **Text**: The extracted text content (with pipe characters escaped as `\|`)
+- **Confidence**: OCR confidence score rounded to one decimal place
+- Handwriting is indicated with a "(HANDWRITING)" suffix in the Text column
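
A consumer of this format has to undo the pipe escaping and the "(HANDWRITING)" suffix when reading the table back. The following is a minimal sketch of one way to do that, not the project's own parser: `parse_confidence_table` and `SAMPLE` are hypothetical names, and the sample rows (including the escaped pipe and the handwriting row) are made up for illustration.

```python
import json
import re

HANDWRITING_SUFFIX = "(HANDWRITING)"


def parse_confidence_table(text_field: str) -> list[dict]:
    """Parse the two-column markdown table stored in the `text` field (hypothetical helper)."""
    rows = []
    # Skip the header row and the |------| separator row.
    for line in text_field.splitlines()[2:]:
        line = line.strip()
        if not line.startswith("|"):
            continue
        # Split only on pipes that are NOT preceded by a backslash,
        # so escaped pipes (\|) stay inside their cell.
        cells = [c.strip() for c in re.split(r"(?<!\\)\|", line)[1:-1]]
        if len(cells) != 2:
            continue
        text, conf = cells
        text = text.replace("\\|", "|")  # undo the pipe escaping
        handwriting = text.endswith(HANDWRITING_SUFFIX)
        if handwriting:
            text = text[: -len(HANDWRITING_SUFFIX)].rstrip()
        try:
            confidence = float(conf)
        except ValueError:
            confidence = None  # e.g. "N/A" when no confidence data is available
        rows.append({"text": text, "confidence": confidence, "handwriting": handwriting})
    return rows


# Made-up sample payload; in JSON the escaped pipe is written as \\|.
SAMPLE = r'{"text": "| Text | Confidence |\n|------|------------|\n| 206 Maple Street | 91.4 |\n| Total \\| Due (HANDWRITING) | 87.2 |\n| *No confidence data available from LLM OCR* | N/A |"}'

rows = parse_confidence_table(json.loads(SAMPLE)["text"])
for row in rows:
    print(row)
```

Returning `None` for a non-numeric confidence lets the same helper read output from a backend that reports `N/A`, at the cost of callers having to check for it before comparing scores.
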
+
+**Bedrock Backend (no confidence data):**
+```json
+{
+  "text": "| Text | Confidence |\n|------|------------|\n| *No confidence data available from LLM OCR* | N/A |"