Commit 4d6f0dd
refactor: Extract sort-merge join filter logic into separate module
Refactored the sort-merge join implementation to improve code organization
by extracting all filter-related logic into a dedicated filter.rs module.
Changes:
- Created new filter.rs module (~576 lines) containing:
  - Filter metadata tracking (FilterMetadata struct)
  - Deferred filtering decision logic (needs_deferred_filtering)
  - Filter mask correction for different join types (get_corrected_filter_mask)
  - Filter application with null-joined row handling (filter_record_batch_by_join_type)
  - Helper functions for filter column extraction and batch filtering
- Updated stream.rs:
  - Removed ~450 lines of filter-specific code
  - Now delegates to filter module functions
  - Simplified main join logic to focus on stream processing
- Updated tests.rs:
  - Updated imports to use new filter module
  - Changed test code to use FilterMetadata struct
  - All 47 sort-merge join tests passing
- Fixed null-joined batch creation for joins with different column counts:
  - Correctly handles LEFT/RIGHT/FULL outer joins with asymmetric schemas
  - Properly extracts and replaces columns based on join type and batch organization
  - Uses RecordBatchOptions to handle nullable field validation in outer joins
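
Illustrative sketch (not the code in filter.rs): the idea behind filter mask correction for an outer join can be shown with plain slices. The `RowAction` enum, the `correct_mask_outer` function, and the assumption that joined rows from the same outer row are contiguous are all hypothetical and chosen for this example only; the real get_corrected_filter_mask works on Arrow arrays and covers more join types.

```rust
/// Hypothetical outcome for one joined row after mask correction.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RowAction {
    /// The filter passed: emit the joined row unchanged.
    Keep,
    /// The filter failed and the outer row is represented elsewhere: drop it.
    Drop,
    /// Every match for this outer row failed: emit it once, null-joined.
    NullJoin,
}

/// `row_ids[i]` identifies the outer-side row that produced joined row `i`
/// (assumption: equal ids are contiguous); `mask[i]` is the filter result.
pub fn correct_mask_outer(row_ids: &[usize], mask: &[bool]) -> Vec<RowAction> {
    assert_eq!(row_ids.len(), mask.len());
    let mut out = Vec::with_capacity(mask.len());
    let mut start = 0;
    while start < row_ids.len() {
        // Find the end of the run of joined rows sharing one outer row.
        let mut end = start + 1;
        while end < row_ids.len() && row_ids[end] == row_ids[start] {
            end += 1;
        }
        let any_pass = mask[start..end].iter().any(|&m| m);
        for i in start..end {
            let action = if mask[i] {
                RowAction::Keep
            } else if !any_pass && i == end - 1 {
                // The whole group failed: keep exactly one null-joined row.
                RowAction::NullJoin
            } else {
                RowAction::Drop
            };
            out.push(action);
        }
        start = end;
    }
    out
}
```

In this sketch, an outer row whose matches all fail the filter still appears once in the output, which is why the decision has to be deferred until every match for that row has been seen.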
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
1 parent b818f93, commit 4d6f0dd