docs/evaluation.md
Lines changed: 6 additions & 8 deletions
@@ -49,26 +49,24 @@ The evaluation framework automatically integrates with the assessment feature to
 The evaluation framework automatically extracts confidence scores from the `explainability_info` section of assessment results and displays them in both JSON and Markdown evaluation reports:

-- **Expected Confidence**: Confidence score for baseline/ground truth data (if assessed)
-- **Actual Confidence**: Confidence score for extraction results being evaluated
+- **Confidence**: Confidence score for extraction results being evaluated

 ### Enhanced Evaluation Reports

 When confidence data is available, evaluation reports include additional columns:

 ```
-| Status | Attribute | Expected | Actual | Expected Confidence | Actual Confidence | Score | Method | Reason |

-" document_id section_id section_type attribute_name expected actual matched score reason evaluation_method expected_confidence actual_confidence evaluation_date year month day document\n",
+" document_id section_id section_type attribute_name expected actual matched score reason evaluation_method expected_confidence confidence evaluation_date year month day document\n",
 "0 rvl_cdip_package.pdf 1 letter cc true 1.0 Both actual and expected values are missing, so they are matched. LLM 0.0 0.0 2025-06-10 22:08:58.185 2025 06 10 rvl_cdip_package.pdf\n",
 "1 rvl_cdip_package.pdf 1 letter date 10/31/1995 10/31/1995 true 1.0 The expected and actual values for the 'date' attribute are identical, representing the same date of 10/31/1995. The formatting and representation are exactly the same, so there is a perfect match. LLM 0.85 0.85 2025-06-10 22:08:58.185 2025 06 10 rvl_cdip_package.pdf\n",
 "2 rvl_cdip_package.pdf 1 letter letter_type Opposition Opposition true 1.0 The expected value 'Opposition' and the actual value 'Opposition' are an exact match in meaning, taking into account formatting, word order, and semantic equivalence. LLM 0.9 0.9 2025-06-10 22:08:58.185 2025 06 10 rvl_cdip_package.pdf\n",
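The sample output in this change renames the `actual_confidence` column to `confidence`. For code that may still receive rows in the older layout, a minimal Python sketch of normalizing the key could look like the following. The helper name `normalize_confidence_key` and the sample rows are hypothetical and not part of this change; only the two column names and the confidence values 0.85 and 0.9 come from the diff.

```python
from typing import Any

# Column names taken from the diff: the old name is replaced by the new one.
OLD_COLUMN = "actual_confidence"
NEW_COLUMN = "confidence"


def normalize_confidence_key(row: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `row` that uses the new `confidence` key.

    Hypothetical helper: rows already in the new layout are returned
    unchanged (as a copy), and the input dict is never mutated.
    """
    out = dict(row)
    if OLD_COLUMN in out and NEW_COLUMN not in out:
        out[NEW_COLUMN] = out.pop(OLD_COLUMN)
    return out


# Illustrative rows only: one in the old layout, one in the new layout.
rows = [
    {"attribute_name": "date", "score": 1.0, "actual_confidence": 0.85},
    {"attribute_name": "letter_type", "score": 1.0, "confidence": 0.9},
]
normalized = [normalize_confidence_key(r) for r in rows]
```

Copying each row rather than editing it in place keeps the original data available for comparison, which may be useful while both layouts are in circulation.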