Add Grounded SAM2 Interactive Image Segmentation to Computer Vision by balaraj74 · Pull Request #13791 · TheAlgorithms/Python

balaraj74 · 2025-10-28T10:07:30Z

🎯 What I Did

Hey there! I've implemented Grounded SAM2 Image Segmentation for the computer vision section: an interactive segmentation tool, built on NumPy alone, that demonstrates modern segmentation techniques with multiple prompt types.

Quick Overview

This adds a flexible, educational image segmentation implementation that works with three different prompt types:

Point prompts: Mark foreground/background points to guide segmentation
Bounding box prompts: Define a region of interest with a box
Text prompts: Describe objects to segment using natural language (grounding)

The implementation is designed to be educational, showing learners how modern AI segmentation models like SAM2 work and how they can be integrated into practical workflows.
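To make the point-prompt idea concrete, here is a minimal NumPy sketch. It is not the code from this PR and not how SAM2 actually computes masks; the helper name `points_to_mask` and the fixed `radius` parameter are assumptions made for illustration. Foreground points (label 1) each add a disc to the mask, and background points (label 0) each carve one out.

```python
import numpy as np


def points_to_mask(
    shape: tuple[int, int],
    point_coords: list[tuple[int, int]],
    point_labels: list[int],
    radius: int = 10,  # hypothetical parameter, not taken from the PR
) -> np.ndarray:
    """Toy stand-in for point-prompted segmentation (illustrative only)."""
    if len(point_coords) != len(point_labels):
        raise ValueError("point_coords and point_labels must have the same length")
    if any(label not in (0, 1) for label in point_labels):
        raise ValueError("labels must be 1 (foreground) or 0 (background)")

    height, width = shape
    rows, cols = np.ogrid[:height, :width]  # broadcastable row/column indices
    mask = np.zeros(shape, dtype=bool)

    # Apply all foreground points first, then let background points carve out.
    for wanted_label in (1, 0):
        for (x, y), label in zip(point_coords, point_labels):
            if label != wanted_label:
                continue
            disc = (cols - x) ** 2 + (rows - y) ** 2 <= radius**2
            if label == 1:
                mask |= disc
            else:
                mask &= ~disc
    return mask.astype(np.uint8)
```

A real promptable model would replace the disc heuristic with a learned mask decoder, but the input and output shapes would look much the same: a list of (x, y) points with labels in, a binary mask out.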

📂 What's Included

File Added:

computer_vision/grounded_sam2_segmentation.py (379 lines)

Key Features:

✅ Three segmentation modes (points, boxes, text)
✅ Flexible input handling (grayscale or RGB images)
✅ Visualization tools (color overlay on masks)
✅ Comprehensive error handling (validates all inputs)
✅ Full type hints (modern Python 3.10+ syntax)
✅ 31 doctests - ALL PASSING ✨
✅ Demonstration function showing practical usage
✅ Zero external dependencies beyond numpy

🔧 Implementation Details

Class: `GroundedSAM2Segmenter`

Main Methods:

`segment_with_points(point_coords, point_labels)`
- Takes list of (x, y) coordinates
- Labels: 1 for foreground, 0 for background
- Returns binary segmentation mask
- Example: Mark object points to segment it

`segment_with_box(bbox)`
- Takes bounding box (x1, y1, x2, y2)
- Segments content within the box region
- Returns binary segmentation mask
- Example: Draw a box around an object

`segment_with_text(text_prompt, confidence_threshold)`
- Takes text description of target object
- Detects and segments matching objects
- Returns list with masks, bboxes, and scores
- Example: "red car" or "person wearing hat"

`apply_color_mask(image, mask, color, alpha)`
- Overlays colored mask on original image
- Adjustable transparency and color
- Great for visualization and debugging
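The box and overlay methods described above can be approximated with a few lines of NumPy. The sketch below is a guess at the general shape of such code, not the PR's implementation: `box_to_mask` and `overlay_mask` are hypothetical standalone functions, and filling the whole box is a deliberate simplification of what a real model would do.

```python
import numpy as np


def box_to_mask(shape: tuple[int, int], bbox: tuple[int, int, int, int]) -> np.ndarray:
    """Fill the (x1, y1, x2, y2) box; a toy stand-in for box-prompted segmentation."""
    height, width = shape
    x1, y1, x2, y2 = bbox
    if not (0 <= x1 < x2 <= width and 0 <= y1 < y2 <= height):
        raise ValueError(f"bbox {bbox} is empty or outside a {width}x{height} image")
    mask = np.zeros(shape, dtype=np.uint8)
    mask[y1:y2, x1:x2] = 1  # rows index y, columns index x
    return mask


def overlay_mask(
    image: np.ndarray,
    mask: np.ndarray,
    color: tuple[int, int, int] = (255, 0, 0),
    alpha: float = 0.5,
) -> np.ndarray:
    """Alpha-blend `color` over the masked pixels and return an RGB uint8 image."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if image.ndim == 2:  # promote grayscale to three channels
        image = np.stack([image] * 3, axis=-1)
    result = image.astype(np.float64)  # astype returns a copy, so the input is untouched
    selected = mask.astype(bool)
    result[selected] = (1 - alpha) * result[selected] + alpha * np.array(color, dtype=np.float64)
    return np.clip(np.rint(result), 0, 255).astype(np.uint8)
```

With `alpha=0` the image comes back unchanged and with `alpha=1` the masked region is painted in the solid color; values in between give the translucent overlay that is useful for debugging masks.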

Design Philosophy:

Educational focus: Code is clear and well-commented for learners
Production patterns: Proper error handling, type hints, validation
Minimal dependencies: Only numpy (no heavy ML libraries needed)
Modular design: Each method has a single, clear responsibility

✅ Testing & Validation

Doctests: 31 tests, 100% passing ✨

$ python3 -m doctest computer_vision/grounded_sam2_segmentation.py -v
...
31 tests in 9 items.
31 passed and 0 failed.
Test passed.
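For readers new to doctests, the pattern looks roughly like this. `mask_area` is a hypothetical helper written for this example, not a function from the PR; the point is only to show how examples in a docstring, including an expected exception, double as tests.

```python
import doctest

import numpy as np


def mask_area(mask: np.ndarray) -> int:
    """Count the segmented (non-zero) pixels in a mask.

    >>> mask_area(np.array([[0, 1], [1, 1]]))
    3
    >>> mask_area(np.zeros((2, 2)))
    0
    >>> mask_area(np.array([]))
    Traceback (most recent call last):
        ...
    ValueError: mask must not be empty
    """
    if mask.size == 0:
        raise ValueError("mask must not be empty")
    return int(np.count_nonzero(mask))


if __name__ == "__main__":
    # Runs every docstring example in this module and reports failures.
    doctest.testmod(verbose=True)
```

Running the file directly, or through `python3 -m doctest <file> -v`, executes each `>>>` line and compares the result with the text that follows it.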

Test Coverage:

✓ Initialization with various thresholds
✓ Image setting (2D and 3D arrays)
✓ Point-based segmentation
✓ Box-based segmentation
✓ Text-based segmentation
✓ Color mask application
✓ Error handling for invalid inputs
✓ Edge cases (empty arrays, invalid coordinates)
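The kind of input checking these tests exercise might look like the sketch below. The function names `validate_image` and `validate_points` and the exact error messages are assumptions for illustration; the PR's own checks may differ.

```python
import numpy as np


def validate_image(image: np.ndarray) -> np.ndarray:
    """Accept a non-empty 2D (grayscale) or 3D three-channel (RGB) array."""
    if not isinstance(image, np.ndarray):
        raise TypeError("image must be a numpy array")
    if image.size == 0:
        raise ValueError("image must not be empty")
    if image.ndim == 2 or (image.ndim == 3 and image.shape[2] == 3):
        return image
    raise ValueError(f"expected shape (H, W) or (H, W, 3), got {image.shape}")


def validate_points(
    image_shape: tuple[int, ...], point_coords: list[tuple[int, int]]
) -> None:
    """Reject (x, y) points that fall outside the image bounds."""
    height, width = image_shape[:2]
    for x, y in point_coords:
        if not (0 <= x < width and 0 <= y < height):
            raise ValueError(f"point ({x}, {y}) is outside a {width}x{height} image")
```

Failing fast with a specific message keeps the segmentation methods themselves free of defensive branches.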

Demonstration Output:

$ python3 computer_vision/grounded_sam2_segmentation.py

============================================================
Grounded SAM2 Segmentation Demonstration
============================================================

1. Point-based segmentation
   Generated mask shape: (200, 200)
   Segmented pixels: 7245

2. Bounding box segmentation
   Generated mask shape: (200, 200)
   Segmented pixels: 8100

3. Text-grounded segmentation
   Detected objects: 1
   Object 1:
     - Label: object in center
     - Confidence: 0.85
     - BBox: (50, 50, 150, 150)
     - Mask pixels: 7845

4. Visualization
   Result image shape: (200, 200, 3)

📚 Why This Matters

Educational Value:

Demonstrates state-of-the-art segmentation concepts
Shows how different prompt types work
Teaches proper Python class design patterns
Illustrates numpy array manipulation techniques

Practical Applications:

Medical image analysis (segment organs, tumors)
Autonomous vehicles (segment road, vehicles, pedestrians)
Photo editing (select and modify specific objects)
Quality control (detect and segment defects)
Agricultural tech (segment crops, detect diseases)

Modern CV Concepts:

SAM2: Meta AI's Segment Anything Model 2
Grounding: Connect vision with language
Interactive segmentation: Human-in-the-loop AI
Prompt engineering for computer vision

📋 Contribution Checklist

Describe your change:

Add an algorithm ✅

Requirements Met:

🔗 References

SAM2 Repository: https://github.com/facebookresearch/segment-anything-2
Grounding DINO: https://github.com/IDEA-Research/GroundingDINO
Research Paper (original Segment Anything): https://arxiv.org/abs/2304.02643
Computer Vision: https://en.wikipedia.org/wiki/Computer_vision

🙏 Acknowledgments

Thanks to @NANDAGOPALNG for requesting this feature! This implementation provides a solid foundation for understanding how modern interactive segmentation systems work, making cutting-edge computer vision concepts accessible to learners.

Ready for review! Happy to make any adjustments based on maintainer feedback. 😊

Fixes #13516

- Implement partition-based divide-and-conquer solution
- Time complexity: O(log(min(m, n)))
- Space complexity: O(1)
- Handles empty arrays, integers, floats, and negative numbers
- Includes comprehensive doctests for edge cases
- Fixes TheAlgorithms#13717
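A partition-based search with these complexity bounds is commonly written as shown below. This is a generic sketch of the well-known technique, not the code committed here; the function name `median_of_two_sorted` and the choice to raise `ValueError` when both inputs are empty are assumptions.

```python
def median_of_two_sorted(nums1: list[float], nums2: list[float]) -> float:
    """Median of two sorted lists via binary search over the shorter one."""
    if not nums1 and not nums2:
        raise ValueError("both input arrays are empty")
    if len(nums1) > len(nums2):
        nums1, nums2 = nums2, nums1  # search the shorter list: O(log(min(m, n)))

    m, n = len(nums1), len(nums2)
    half = (m + n + 1) // 2  # size of the combined left partition
    low, high = 0, m
    while low <= high:
        i = (low + high) // 2  # elements taken from nums1
        j = half - i  # elements taken from nums2
        left1 = nums1[i - 1] if i > 0 else float("-inf")
        right1 = nums1[i] if i < m else float("inf")
        left2 = nums2[j - 1] if j > 0 else float("-inf")
        right2 = nums2[j] if j < n else float("inf")

        if left1 <= right2 and left2 <= right1:  # valid partition found
            if (m + n) % 2:
                return float(max(left1, left2))
            return (max(left1, left2) + min(right1, right2)) / 2
        if left1 > right2:
            high = i - 1  # took too many from nums1
        else:
            low = i + 1  # took too few from nums1
    raise ValueError("inputs must be sorted")
```

Only a handful of scalars are kept between iterations, which is where the O(1) space bound comes from.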

- Implement interactive segmentation with multiple prompt types
- Support point-based prompts (positive/negative)
- Support bounding box prompts
- Support text-grounded prompts
- Include mask visualization with color overlay
- Add comprehensive doctests (31 tests, all passing)
- Include demonstration function showing all features
- Full type hints and detailed documentation

Fixes TheAlgorithms#13516

balaraj74 and others added 6 commits October 24, 2025 11:24

Merge pull request #1 from balaraj74/feature/median-two-sorted-arrays

fd492e1

Add Median of Two Sorted Arrays Algorithm to Divide and Conquer Section

fix: improve padding calculation for small bounding boxes

258e99c

Use adaptive padding based on box size to handle small bounding boxes better.
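One way to read "adaptive padding" is to cap the padding at a fraction of the box's smaller side, so that a small box is never shrunk to nothing. The sketch below illustrates that idea only; the names `adaptive_padding` and `shrink_box`, the defaults `max_padding=5` and `fraction=0.1`, and the assumption that padding is applied inward are all hypothetical and not taken from the commit.

```python
def adaptive_padding(
    bbox: tuple[int, int, int, int],
    max_padding: int = 5,  # hypothetical default
    fraction: float = 0.1,  # hypothetical default
) -> int:
    """Return a padding that scales with the box but never exceeds max_padding."""
    x1, y1, x2, y2 = bbox
    box_width, box_height = x2 - x1, y2 - y1
    if box_width <= 0 or box_height <= 0:
        raise ValueError(f"bbox {bbox} has no area")
    return min(max_padding, int(fraction * min(box_width, box_height)))


def shrink_box(
    bbox: tuple[int, int, int, int], padding: int
) -> tuple[int, int, int, int]:
    """Move every edge of the box inward by `padding` pixels."""
    x1, y1, x2, y2 = bbox
    return (x1 + padding, y1 + padding, x2 - padding, y2 - padding)
```

Under these assumptions a fixed padding of 5 would turn the 4x4 box `(10, 10, 14, 14)` inside out, while the adaptive version returns 0 and leaves it intact.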

[pre-commit.ci] auto fixes from pre-commit.com hooks

0de4b1b

for more information, see https://pre-commit.ci

fix: sort imports to pass ruff checks

0fab683

Sort imports alphabetically (typing before numpy) to comply with ruff/isort rules.

balaraj74 wants to merge 6 commits into TheAlgorithms:master from balaraj74:feature/grounded-sam2-segmentation
