generated from amazon-archives/__template_MIT-0
new assesment #137
Open
kazmer97 wants to merge 30 commits into develop from feat/new-assesment
+5,747
−5,308
Commits (30):
8296c52 invoke extraction with long retries (kazmer97)
b827110 add review agent model config (kazmer97)
3d56e44 fixes (kazmer97)
3ff7c62 add review agent model config (kazmer97)
7748ed6 new assesment (kazmer97)
4442094 typed metadata model (kazmer97)
b58c3ef fixes (kazmer97)
5b5ef81 fix tool (kazmer97)
6fde894 assesment update (kazmer97)
66d1470 further streamlining (kazmer97)
ae5f598 fix template (kazmer97)
2ed5089 missing dep (kazmer97)
aaa8c5d update config model usage (kazmer97)
fd8f49e strands argument passing update (kazmer97)
6eb99b7 update the tests for the config (kazmer97)
ae20e28 bug fixes (kazmer97)
5881039 make sure model dump uses json mode (kazmer97)
c487e30 bbox update (kazmer97)
b09b9d2 memory update (kazmer97)
a3cf436 cleanup (kazmer97)
6001e2c fix failing test (kazmer97)
18f0447 import fix (kazmer97)
84dff8d cleanup: remove artifacts and redundant code from PR review (kazmer97)
5497184 update tests to pass (kazmer97)
12228a2 fixes (kazmer97)
fbbacc6 fix the ruler offset (kazmer97)
3b49589 encapsulate ruler (kazmer97)
d06c8fe add structured loggin (kazmer97)
24c9d87 improve retry mechanism (kazmer97)
1dc8287 small fixes (kazmer97)
@@ -405,11 +405,7 @@ assessment:
   image:
     target_height: ""
     target_width: ""
-  granular:
-    enabled: true
-    max_workers: "20"
-    simple_batch_size: "3"
-    list_batch_size: "1"
+  max_workers: "20"
   default_confidence_threshold: "0.8"
   top_p: "0.0"
   max_tokens: "10000"
@@ -456,107 +452,6 @@ assessment:
     - Provide tight, accurate bounding boxes around the actual text
     </assessment-guidelines>

-    <spatial-localization-guidelines>

Contributor (review comment): Again, won't removing this break the current assessment implementation?

-    For each field, provide bounding box coordinates:
-    - bbox: [x1, y1, x2, y2] coordinates in normalized 0-1000 scale
-    - page: Page number where the field appears (starting from 1)
-
-    Coordinate system:
-    - Use normalized scale 0-1000 for both x and y axes
-    - x1, y1 = top-left corner of bounding box
-    - x2, y2 = bottom-right corner of bounding box
-    - Ensure x2 > x1 and y2 > y1
-    - Make bounding boxes tight around the actual text content
-    - If a field spans multiple lines, create a bounding box that encompasses all relevant text
-    </spatial-localization-guidelines>
-
-    <final-instructions>
-    Analyze the extraction results against the source document and provide confidence assessments with spatial localization. Return a JSON object with the following structure based on the attribute type:
-
-    For SIMPLE attributes:
-    {
-      "simple_attribute_name": {
-        "confidence": 0.85,
-        "bbox": [100, 200, 300, 250],
-        "page": 1
-      }
-    }
-
-    For GROUP attributes (nested object structure):
-    {
-      "group_attribute_name": {
-        "sub_attribute_1": {
-          "confidence": 0.90,
-          "bbox": [150, 300, 250, 320],
-          "page": 1
-        },
-        "sub_attribute_2": {
-          "confidence": 0.75,
-          "bbox": [150, 325, 280, 345],
-          "page": 1
-        }
-      }
-    }
-
-    For LIST attributes (array of assessed items):
-    {
-      "list_attribute_name": [
-        {
-          "item_attribute_1": {
-            "confidence": 0.95,
-            "bbox": [100, 400, 200, 420],
-            "page": 1
-          },
-          "item_attribute_2": {
-            "confidence": 0.88,
-            "bbox": [250, 400, 350, 420],
-            "page": 1
-          }
-        },
-        {
-          "item_attribute_1": {
-            "confidence": 0.92,
-            "bbox": [100, 425, 200, 445],
-            "page": 1
-          },
-          "item_attribute_2": {
-            "confidence": 0.70,
-            "bbox": [250, 425, 350, 445],
-            "page": 1
-          }
-        }
-      ]
-    }
-
-    IMPORTANT:
-    - For LIST attributes like "Transactions", assess EACH individual item in the list separately with individual bounding boxes
-    - Each transaction should be assessed as a separate object in the array with its own spatial coordinates
-    - Do NOT provide aggregate assessments for list items - assess each one individually with precise locations
-    - Include assessments AND bounding boxes for ALL attributes present in the extraction results
-    - Match the exact structure of the extracted data
-    - Provide page numbers for all bounding boxes (starting from 1)
-    </final-instructions>
-
-    <<CACHEPOINT>>
-
-    <document-image>
-    {DOCUMENT_IMAGE}
-    </document-image>
-
-    <ocr-text-confidence-results>
-    {OCR_TEXT_CONFIDENCE}
-    </ocr-text-confidence-results>
-
-    <<CACHEPOINT>>
-
-    <attributes-definitions>
-    {ATTRIBUTE_NAMES_AND_DESCRIPTIONS}
-    </attributes-definitions>
-
-    <extraction-results>
-    {EXTRACTION_RESULTS}
-    </extraction-results>
-
 evaluation:
   enabled: true
   llm_method:
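For reviewers weighing what the removed prompt section asked of the model, here is a minimal Python sketch of consuming a response in that shape. It is not code from this PR: `iter_assessments`, `to_pixels`, and `SCALE` are hypothetical names, and treating any dict with a `confidence` key as a leaf is an assumption that would misfire if an attribute were itself named `confidence`. The sketch walks SIMPLE, GROUP, and LIST attributes and converts a 0-1000 normalized bbox to pixels while enforcing the x2 > x1 and y2 > y1 rule.

```python
from typing import Any, Iterator

SCALE = 1000  # normalized coordinate range used by the removed prompt text


def iter_assessments(node: Any, path: str = "") -> Iterator[tuple[str, dict]]:
    """Yield (path, leaf) pairs for simple, group, and list attributes.

    Assumption: a leaf is any dict that carries a "confidence" key.
    """
    if isinstance(node, dict) and "confidence" in node:
        yield path, node
    elif isinstance(node, dict):
        for key, child in node.items():
            yield from iter_assessments(child, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from iter_assessments(child, f"{path}[{index}]")


def to_pixels(bbox: list[float], width: int, height: int) -> tuple[int, int, int, int]:
    """Convert [x1, y1, x2, y2] on the 0-1000 scale to pixel coordinates."""
    if len(bbox) != 4:
        raise ValueError(f"expected 4 coordinates, got {len(bbox)}")
    x1, y1, x2, y2 = bbox
    if not all(0 <= value <= SCALE for value in bbox):
        raise ValueError(f"coordinates must lie in 0-{SCALE}: {bbox}")
    if x2 <= x1 or y2 <= y1:
        raise ValueError(f"bbox must satisfy x2 > x1 and y2 > y1: {bbox}")
    return (
        round(x1 * width / SCALE),
        round(y1 * height / SCALE),
        round(x2 * width / SCALE),
        round(y2 * height / SCALE),
    )
```

With a response such as `{"transactions": [{"amount": {"confidence": 0.95, "bbox": [100, 400, 200, 420], "page": 1}}]}`, the walker yields the path `transactions[0].amount`, so each list item keeps its own confidence and location, as the IMPORTANT notes in the prompt require.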
This doesn't look backward compatible. Is it? We do not want to break the existing granular assessment. Or do you propose that we replace 'granular assessment' (our current default) with 'agentic assessment'?
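One way to address the compatibility concern, sketched in Python under stated assumptions: a loader could accept both the new flat `max_workers` key and the legacy `granular.max_workers` key from the first hunk. The function name `resolve_max_workers` and the fallback default of 20 are hypothetical and are not taken from this PR; the sketch only covers `max_workers`, not `simple_batch_size` or `list_batch_size`.

```python
from typing import Any


def resolve_max_workers(assessment_cfg: dict[str, Any], default: int = 20) -> int:
    """Prefer the new top-level key, then fall back to the legacy granular block.

    Config values arrive as strings (for example "20"), so the result is cast
    to int. Empty strings are treated the same as a missing key.
    """
    value = assessment_cfg.get("max_workers")
    if value in (None, ""):
        legacy = assessment_cfg.get("granular") or {}
        value = legacy.get("max_workers")
    if value in (None, ""):
        return default  # hypothetical default, chosen to match the value in the diff
    return int(value)
```

Called with `{"granular": {"enabled": True, "max_workers": "8"}}` this returns 8, and with `{"max_workers": "20"}` it returns 20, so existing configuration files would keep loading while new ones use the flat key.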