Commit 55c505b

Author: Bob Strahan
Add {DOCUMENT_IMAGE} placeholder to classification and extraction prompt, and update relevant config_library configurations
1 parent 8a76072 commit 55c505b

4 files changed: +94 -799 lines changed


config_library/pattern-2/default/config.yaml

Lines changed: 4 additions & 0 deletions
@@ -485,6 +485,10 @@ extraction:
 <document-text>
 {DOCUMENT_TEXT}
 </document-text>
+
+<document_image>
+{DOCUMENT_IMAGE}
+</document_image>

 <final-instructions>
 Extract key information from the document and return a JSON object with the following key steps:
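
The hunk above places the image between the OCR text and the final instructions. A minimal sketch of how a consumer of this config might expand the placeholder, assuming the prompt is split at `{DOCUMENT_IMAGE}` and the image is sent as its own content block. The function name `build_content` and the PNG default are hypothetical and not taken from this repository; the block shapes follow the text/image content-block layout of the Amazon Bedrock Converse API.

```python
from typing import Any, Dict, List, Optional

IMAGE_PLACEHOLDER = "{DOCUMENT_IMAGE}"
TEXT_PLACEHOLDER = "{DOCUMENT_TEXT}"


def build_content(
    task_prompt: str,
    document_text: str,
    image_bytes: Optional[bytes],
    image_format: str = "png",  # assumption: page images are PNG
) -> List[Dict[str, Any]]:
    """Expand a task prompt into ordered text and image content blocks."""
    # Split at the image placeholder first, so OCR text that happens to
    # contain the literal placeholder cannot move the image position.
    before, _, after = task_prompt.partition(IMAGE_PLACEHOLDER)

    # str.replace (not str.format) leaves other braces in the prompt alone.
    before = before.replace(TEXT_PLACEHOLDER, document_text)
    after = after.replace(TEXT_PLACEHOLDER, document_text)

    content: List[Dict[str, Any]] = []
    if before.strip():
        content.append({"text": before})
    if image_bytes is not None:
        content.append(
            {"image": {"format": image_format, "source": {"bytes": image_bytes}}}
        )
    if after.strip():
        content.append({"text": after})
    return content
```

With this layout the model sees the text before the image, then the image, then the closing instructions, which matches the order the placeholder has in the prompt.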

config_library/pattern-2/few_shot_example_with_multimodal_page_classification/config.yaml

Lines changed: 55 additions & 64 deletions
@@ -653,26 +653,24 @@ classification:
 task_prompt: >-
 Classify this document into exactly one of these categories:

-
 {CLASS_NAMES_AND_DESCRIPTIONS}

-
-Respond only with a JSON object containing the class label. For example:
-{{"class": "letter"}}
-
 <few_shot_examples>
-
 {FEW_SHOT_EXAMPLES}
-
 </few_shot_examples>

 <<CACHEPOINT>>

 <document_ocr_data>
-
 {DOCUMENT_TEXT}
-
 </document_ocr_data>
+
+<document_image>
+{DOCUMENT_IMAGE}
+</document_image>
+
+Respond only with a JSON object containing the class label. For example:
+{{"class": "letter"}}
 extraction:
 model: us.amazon.nova-pro-v1:0
 temperature: '0.0'
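
In the classification hunk above, the class descriptions and few-shot examples stay before `<<CACHEPOINT>>`, while the per-document OCR text, image, and response instruction now all come after it. A minimal sketch of splitting a prompt at that marker, assuming the part before the marker is the stable, cacheable prefix. The helper name `split_at_cachepoint` is hypothetical and not taken from this repository.

```python
from typing import Tuple

CACHEPOINT = "<<CACHEPOINT>>"


def split_at_cachepoint(prompt: str) -> Tuple[str, str]:
    """Return (static_prefix, dynamic_suffix) split at the first cache marker."""
    static, marker, dynamic = prompt.partition(CACHEPOINT)
    if not marker:
        # No marker present: treat the whole prompt as per-document content.
        return "", prompt
    # Trim the blank lines that surround the marker in the YAML prompt.
    return static.rstrip(), dynamic.lstrip()
```

Under this assumption, anything that changes per document belongs after the marker, since per-document content in the prefix would change it on every request.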
@@ -684,72 +682,65 @@ extraction:
 only provide data found in the document being provided.
 task_prompt: >
 <background>
-
-You are an expert in business document analysis and information extraction.
-
-You can understand and extract key information from business documents.
+You are an expert in document analysis and information extraction.
+You can understand and extract key information from documents classified as type
+{DOCUMENT_CLASS}.
+</background>

 <task>
+Your task is to take the unstructured text provided and convert it into a well-organized table format using JSON. Identify the main entities, attributes, or categories mentioned in the attributes list below and use them as keys in the JSON object.
+Then, extract the relevant information from the text and populate the corresponding values in the JSON object.
+</task>

-Your task is to take the unstructured text provided and convert it into a
-
-well-organized table format using JSON. Identify the main entities,
-
-attributes, or categories mentioned in the attributes list below and use
-
-them as keys in the JSON object.
-
-Then, extract the relevant information from the text and populate the
-
-corresponding values in the JSON object.
-
+<extraction-guidelines>
 Guidelines:
-
-Ensure that the data is accurately represented and properly formatted within
-the JSON structure
-
-Include double quotes around all keys and values
-
-Do not make up data - only extract information explicitly found in the
-document
-
-Do not use /n for new lines, use a space instead
-
-If a field is not found or if unsure, return null
-
-All dates should be in MM/DD/YYYY format
-
-Do not perform calculations or summations unless totals are explicitly given
-
-If an alias is not found in the document, return null
-
-Here are the attributes you should extract:
+1. Ensure that the data is accurately represented and properly formatted within
+the JSON structure
+2. Include double quotes around all keys and values
+3. Do not make up data - only extract information explicitly found in the
+document
+4. Do not use /n for new lines, use a space instead
+5. If a field is not found or if unsure, return null
+6. All dates should be in MM/DD/YYYY format
+7. Do not perform calculations or summations unless totals are explicitly given
+8. If an alias is not found in the document, return null
+9. Guidelines for checkboxes:
+9.A. CAREFULLY examine each checkbox, radio button, and selection field:
+- Look for marks like ✓, ✗, x, filled circles (●), darkened areas, or handwritten checks indicating selection
+- For checkboxes and multi-select fields, ONLY INCLUDE options that show clear visual evidence of selection
+- DO NOT list options that have no visible selection mark
+9.B. For ambiguous or overlapping tick marks:
+- If a mark overlaps between two or more checkboxes, determine which option contains the majority of the mark
+- Consider a checkbox selected if the mark is primarily inside the check box or over the option text
+- When a mark touches multiple options, analyze which option was most likely intended based on position and density. For handwritten checks, the mark typically flows from the selected checkbox outward.
+- Carefully analyze visual cues and contextual hints. Think from a human perspective, anticipate natural tendencies, and apply thoughtful reasoning to make the best possible judgment.
+10. Think step by step first and then answer.
+</extraction-guidelines>

 <attributes>
-
 {ATTRIBUTE_NAMES_AND_DESCRIPTIONS}
-
 </attributes>

-<few_shot_examples>
-
-{FEW_SHOT_EXAMPLES}
-
-</few_shot_examples>
-
-</task>
-
-</background>
-
-<<CACHEPOINT>>
-
-The document tpe is {DOCUMENT_CLASS}. Here is the document content:
-
-<document_ocr_data>
+<<CACHEPOINT>>

+<document-text>
 {DOCUMENT_TEXT}
-
-</document_ocr_data>
+</document-text>
+
+<document_image>
+{DOCUMENT_IMAGE}
+</document_image>
+
+<final-instructions>
+Extract key information from the document and return a JSON object with the following key steps:
+1. Carefully analyze the document text to identify the requested attributes
+2. Extract only information explicitly found in the document - never make up data
+3. Format all dates as MM/DD/YYYY and replace newlines with spaces
+4. For checkboxes, only include options with clear visual selection marks
+5. Use null for any fields not found in the document
+6. Ensure the output is properly formatted JSON with quoted keys and values
+7. Think step by step before finalizing your answer
+</final-instructions>
 pricing:
 - name: textract/detect_document_text
 units:
0 commit comments

Comments
 (0)