docs/classification.md (9 additions, 0 deletions)
@@ -67,6 +67,15 @@ classification:
</document-text>
```
## Limitations of Text-Based Holistic Classification
Despite its strengths in handling full-document context, this method has several limitations:
- **Context Limitations**: Passing the full document text to the model can exceed its context window, especially for long documents. This restricts the method to models that support large context windows.
- **Hallucination Risk**: With lengthy inputs, the model's focus is diluted across many pages, so it may produce inaccurate or inconsistent classifications.
- **Model Dependency**: Requires a high-context model such as Amazon Nova Premier, which supports up to 1 million tokens. Models with smaller context windows are not suitable for processing long document packages.
- **Scalability Challenges**: Not ideal for very large or visually complex document sets. In such cases, the MultiModal Page-Level Classification method is more appropriate.
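The context-window limitation above can be checked before a document is classified. The following is a minimal sketch, not part of this project's API: the names `ContextBudget`, `estimate_tokens`, and `choose_method`, the labels `"holistic"` and `"page_level"`, the 4-characters-per-token heuristic, and the reserved-token allowance are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumption: roughly 4 characters per token for English text. This is a
# coarse heuristic, not the tokenizer of any particular model.
CHARS_PER_TOKEN = 4


@dataclass
class ContextBudget:
    """Hypothetical description of how many tokens a model can accept."""
    context_window: int           # e.g. 1_000_000 for a high-context model
    reserved_tokens: int = 8_000  # assumed allowance for prompt and response


def estimate_tokens(text: str) -> int:
    """Estimate the token count from the character count (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def choose_method(pages: list[str], budget: ContextBudget) -> str:
    """Return "holistic" if all pages fit in one request, else "page_level"."""
    total = sum(estimate_tokens(page) for page in pages)
    available = budget.context_window - budget.reserved_tokens
    return "holistic" if total <= available else "page_level"


if __name__ == "__main__":
    pages = ["a" * 4_000] * 10  # ten pages of about 1,000 tokens each
    print(choose_method(pages, ContextBudget(context_window=1_000_000)))
    print(choose_method(pages, ContextBudget(context_window=16_000)))
```

A check like this only guards against exceeding the context window. It does not address the hallucination risk on long inputs, so a document that fits may still be better served by page-level classification.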
#### MultiModal Page-Level Classification with Few-Shot Examples
- Classifies each page independently using both text and image data