Commit 9fd7fdf

Merge branch 'feature/100-image-limit' into 'develop'
Increase Page Image Limit from 20 to 100

See merge request genaiic-reusable-assets/engagement-artifacts/genaiic-idp-accelerator!441
2 parents 3322198 + 496e865 commit 9fd7fdf

13 files changed: +40 -36 lines changed

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -5,6 +5,10 @@ SPDX-License-Identifier: MIT-0
 
 ## [Unreleased]
 
+### Changed
+- Increased page image limit from 20 to 100 across all IDP services (classification, extraction, assessment) to support processing of longer document sections with large context models following recent Amazon Bedrock API limit increases
+- Resolves #147
+
 ## [0.4.5]
 
 ### Added

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-0.4.5
+0.4.6-wip1

docs/classification.md

Lines changed: 1 addition & 1 deletion
@@ -403,7 +403,7 @@ classification:
 
 For documents with multiple pages, the system automatically handles image limits:
 
-- **Bedrock Limit**: Maximum 20 images per request (automatically enforced)
+- **Bedrock Limit**: Maximum 100 images per request (automatically enforced)
 - **Warning Logging**: System logs warnings when images are truncated due to limits
 - **Smart Handling**: Images are processed in page order, with excess images automatically dropped
 
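
The three bullets in this hunk (limit, warning, page-order truncation) can be sketched as one small helper. This is a minimal illustration rather than the project's code: `limit_page_images` and `MAX_IMAGES_PER_REQUEST` are hypothetical names, and the commit itself hard-codes the literal `100` at each call site.

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical constant; the commit uses the literal 100 directly.
MAX_IMAGES_PER_REQUEST = 100


def limit_page_images(page_images: list, max_images: int = MAX_IMAGES_PER_REQUEST) -> list:
    """Return at most max_images images, keeping the original page order."""
    if len(page_images) > max_images:
        # Log how many trailing pages are dropped, as the documentation describes.
        logger.warning(
            "Found %d images, truncating to %d due to Bedrock constraints. "
            "%d images will be dropped.",
            len(page_images),
            max_images,
            len(page_images) - max_images,
        )
    # Slicing keeps the first pages and silently handles shorter lists.
    return page_images[:max_images]
```

A list of exactly 100 images passes through unchanged and without a warning; only the 101st image onward is dropped.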

docs/extraction.md

Lines changed: 2 additions & 2 deletions
@@ -334,7 +334,7 @@ extraction:
 For documents with multiple pages, the system provides robust image management:
 
 - **Automatic Pagination**: Images are processed in page order
-- **Bedrock Compliance**: Maximum 20 images per request (automatically enforced)
+- **Bedrock Compliance**: Maximum 100 images per request (automatically enforced)
 - **Smart Truncation**: Excess images are dropped with warning logs
 - **Performance Optimization**: Large image sets are efficiently handled
 
@@ -346,7 +346,7 @@ extraction:
 
 {ATTRIBUTE_NAMES_AND_DESCRIPTIONS}
 
-Document pages (up to 20 images):
+Document pages (up to 100 images):
 {DOCUMENT_IMAGE}
 
 Combined text from all pages:
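
To show how a `{DOCUMENT_IMAGE}` placeholder like the one in this prompt template could be expanded, here is a rough sketch. It assumes Converse-style content blocks (`{"text": ...}` and `{"image": {"format": ..., "source": {"bytes": ...}}}`); the function name `build_content`, the `jpeg` format, and the text-only fallback when no placeholder is present are assumptions of the sketch, not behavior confirmed by this diff.

```python
from typing import Any

IMAGE_PLACEHOLDER = "{DOCUMENT_IMAGE}"


def build_content(prompt: str, images: list[bytes], max_images: int = 100) -> list[dict[str, Any]]:
    """Split the prompt at the placeholder and place image blocks in between."""
    if IMAGE_PLACEHOLDER not in prompt:
        # Assumption of this sketch: without a placeholder, send text only.
        return [{"text": prompt}]

    before, after = prompt.split(IMAGE_PLACEHOLDER, 1)
    content: list[dict[str, Any]] = []
    if before.strip():
        content.append({"text": before})
    # Attach at most max_images pages, in page order.
    for img in images[:max_images]:
        content.append({"image": {"format": "jpeg", "source": {"bytes": img}}})
    if after.strip():
        content.append({"text": after})
    return content
```

With this layout the model reads the instruction text, then the page images, then the combined page text, in the same order as the template.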

docs/idp-configuration-best-practices.md

Lines changed: 1 addition & 1 deletion
@@ -1256,7 +1256,7 @@ classification:
 For documents with multiple pages, the system provides robust image management:
 
 - **Automatic Pagination**: Images are processed in page order
-- **Bedrock Compliance**: Maximum 20 images per request (automatically enforced)
+- **Bedrock Compliance**: Maximum 100 images per request (automatically enforced)
 - **Smart Truncation**: Excess images are dropped with warning logs
 - **Performance Optimization**: Large image sets are efficiently handled
 

lib/idp_common_pkg/idp_common/assessment/README.md

Lines changed: 1 addition & 1 deletion
@@ -305,7 +305,7 @@ assess each extracted field:
 
 ### Automatic Image Handling
 - Supports both single and multiple document images
-- Automatically limits to 20 images per Bedrock constraints
+- Automatically limits to 100 images per Bedrock constraints
 - Graceful fallback when images are unavailable
 
 ## Attribute Types and Assessment Formats

lib/idp_common_pkg/idp_common/assessment/granular_service.py

Lines changed: 5 additions & 5 deletions
@@ -472,13 +472,13 @@ def _build_cached_prompt_base(
         # Add the images if available
         if page_images:
             if isinstance(page_images, list):
-                # Multiple images (limit to 20 as per Bedrock constraints)
-                if len(page_images) > 20:
+                # Multiple images (limit to 100 as per Bedrock constraints)
+                if len(page_images) > 100:
                     logger.warning(
-                        f"Found {len(page_images)} images, truncating to 20 due to Bedrock constraints. "
-                        f"{len(page_images) - 20} images will be dropped."
+                        f"Found {len(page_images)} images, truncating to 100 due to Bedrock constraints. "
+                        f"{len(page_images) - 100} images will be dropped."
                     )
-                for img in page_images[:20]:
+                for img in page_images[:100]:
                     content.append(image.prepare_bedrock_image_attachment(img))
             else:
                 # Single image

lib/idp_common_pkg/idp_common/assessment/service.py

Lines changed: 5 additions & 5 deletions
@@ -428,13 +428,13 @@ def _build_content_with_image_placeholder(
         # Add the image if available
         if image_content:
             if isinstance(image_content, list):
-                # Multiple images (limit to 20 as per Bedrock constraints)
-                if len(image_content) > 20:
+                # Multiple images (limit to 100 as per Bedrock constraints)
+                if len(image_content) > 100:
                     logger.warning(
-                        f"Found {len(image_content)} images, truncating to 20 due to Bedrock constraints. "
-                        f"{len(image_content) - 20} images will be dropped."
+                        f"Found {len(image_content)} images, truncating to 100 due to Bedrock constraints. "
+                        f"{len(image_content) - 100} images will be dropped."
                     )
-                for img in image_content[:20]:
+                for img in image_content[:100]:
                     content.append(image.prepare_bedrock_image_attachment(img))
             else:
                 # Single image

lib/idp_common_pkg/idp_common/extraction/agentic_idp.py

Lines changed: 4 additions & 4 deletions
@@ -735,20 +735,20 @@ def _prepare_prompt_content(
     else:
         prompt_content = [ContentBlock(text=str(prompt))]
 
-    # Add page images if provided (limit to 20 as per Bedrock constraints)
+    # Add page images if provided (limit to 100 as per Bedrock constraints)
     if page_images:
-        if len(page_images) > 20:
+        if len(page_images) > 100:
             prompt_content.append(
                 ContentBlock(
-                    text=f"There are {len(page_images)} images, initially you'll see 20 of them, use the view_image tool to see the rest."
+                    text=f"There are {len(page_images)} images, initially you'll see 100 of them, use the view_image tool to see the rest."
                 )
             )
 
         prompt_content += [
             ContentBlock(
                 image=ImageContent(format="png", source=ImageSource(bytes=img_bytes))
             )
-            for img_bytes in page_images[:20]
+            for img_bytes in page_images[:100]
         ]
 
     # Add existing data context if provided
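
Unlike the other services, the agentic path does not discard the extra pages: it tells the model how many images exist and points it at the `view_image` tool for the remainder. A simplified sketch of that split follows. The helper name `initial_images_with_notice` and the returned list of deferred page indices are hypothetical; the diff does not show how `view_image` addresses pages.

```python
from typing import Optional


def initial_images_with_notice(
    page_images: list[bytes], max_initial: int = 100
) -> tuple[Optional[str], list[bytes], list[int]]:
    """Return (notice, initial_images, deferred_indices) for an agentic prompt."""
    notice = None
    if len(page_images) > max_initial:
        # Same wording as the message added in the hunk above.
        notice = (
            f"There are {len(page_images)} images, initially you'll see "
            f"{max_initial} of them, use the view_image tool to see the rest."
        )
    initial = page_images[:max_initial]
    # Hypothetical bookkeeping: zero-based indices of pages not sent up front.
    deferred = list(range(len(initial), len(page_images)))
    return notice, initial, deferred
```

When the document fits within the limit, the notice is `None` and the deferred list is empty, so the prompt carries only the images themselves.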

lib/idp_common_pkg/idp_common/extraction/service.py

Lines changed: 9 additions & 9 deletions
@@ -148,13 +148,13 @@ def _get_default_prompt_content(self) -> list[dict[str, Any]]:
         """
         content = [{"text": task_prompt}]
 
-        # Add image attachments to the content (limit to 20 images as per Bedrock constraints)
+        # Add image attachments to the content (limit to 100 images as per Bedrock constraints)
         if self._page_images:
             logger.info(
                 f"Attaching images to default prompt, for {len(self._page_images)} pages."
             )
-            # Limit to 20 images as per Bedrock constraints
-            for img in self._page_images[:20]:
+            # Limit to 100 images as per Bedrock constraints
+            for img in self._page_images[:100]:
                 content.append(image.prepare_bedrock_image_attachment(img))
 
         return content
@@ -354,7 +354,7 @@ def _build_text_and_image_content(
 
     def _prepare_image_attachments(self, image_content: Any) -> list[dict[str, Any]]:
         """
-        Prepare image attachments for Bedrock, limiting to 20 images.
+        Prepare image attachments for Bedrock, limiting to 100 images.
 
         Args:
             image_content: Single image or list of images
@@ -365,13 +365,13 @@ def _prepare_image_attachments(self, image_content: Any) -> list[dict[str, Any]]
         attachments: list[dict[str, Any]] = []
 
         if isinstance(image_content, list):
-            # Multiple images (limit to 20 as per Bedrock constraints)
-            if len(image_content) > 20:
+            # Multiple images (limit to 100 as per Bedrock constraints)
+            if len(image_content) > 100:
                 logger.warning(
-                    f"Found {len(image_content)} images, truncating to 20 due to Bedrock constraints. "
-                    f"{len(image_content) - 20} images will be dropped."
+                    f"Found {len(image_content)} images, truncating to 100 due to Bedrock constraints. "
+                    f"{len(image_content) - 100} images will be dropped."
                 )
-            for img in image_content[:20]:
+            for img in image_content[:100]:
                 attachments.append(image.prepare_bedrock_image_attachment(img))
         else:
             # Single image
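
The last hunk handles two input shapes: a list of page images and a single image. The sketch below shows that dispatch in isolation. `prepare_attachment` is a hypothetical stand-in for `image.prepare_bedrock_image_attachment`, whose real return value is not shown in this diff; the Converse-style dictionary and `jpeg` format are assumptions, and warning logging is left out for brevity.

```python
from typing import Any


def prepare_attachment(img: bytes) -> dict[str, Any]:
    """Hypothetical stand-in for image.prepare_bedrock_image_attachment."""
    return {"image": {"format": "jpeg", "source": {"bytes": img}}}


def prepare_image_attachments(image_content: Any, max_images: int = 100) -> list[dict[str, Any]]:
    """Normalize a single image or a list of images into a capped attachment list."""
    if isinstance(image_content, list):
        # Multiple images: keep page order and cap at max_images.
        return [prepare_attachment(img) for img in image_content[:max_images]]
    # Single image: wrap it so callers always receive a list.
    return [prepare_attachment(image_content)]
```

Returning a list in both branches lets the caller extend its content blocks the same way regardless of how many pages the section has.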
