[spark] supports converting some SparkPredicate to Paimon between LeafPredicate #7265
This PR also rewrites data filters in PaimonBatchScanBuilder, trying to merge candidate filters into
import java.util
import java.util.Objects

object PredicateUtils {
We could have a PredicateRewrite in core. Maybe invoked in ReadBuilder.withFilter.
@JingsongLi Thanks for your advice! I've refactored it. Note, though, that implementing this logic in core is much more complicated than in Spark, because we have to consider broader scenarios, for example:
- recursive: OR(AND(a >= 1, a <= 10, a is not null), b > 10, ... )
- chained AND: AND(a >= 1, AND(a <= 10, b > 10)) (this could be converted to BETWEEN(a, 1, 10))
and more. I may have missed some scenarios.
Please take a look. I'm happy to improve my code and fix any potential bugs.
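To make the two scenarios above concrete, here is a minimal Java sketch of the candidate-collection step only. It is not the code in this PR: `Node`, `flattenAnd`, and `andGroups` are hypothetical names, and the tree type is a stand-in rather than Paimon's predicate classes. It flattens chained ANDs into a single child list and walks through ORs to find every AND group; a merge step could then look for `>=` / `<=` pairs within each group.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FlattenAndSketch {

    /** Hypothetical predicate tree node: a leaf (no children) or an AND/OR compound. */
    static final class Node {
        final String label;        // "AND", "OR", or leaf text such as "a >= 1"
        final List<Node> children;

        Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }

        boolean isLeaf() {
            return children.isEmpty();
        }

        @Override
        public String toString() {
            return isLeaf() ? label : label + children;
        }
    }

    /** Collects the children of an AND, inlining any directly nested ANDs. */
    static List<Node> flattenAnd(Node and) {
        List<Node> flat = new ArrayList<>();
        for (Node child : and.children) {
            if (!child.isLeaf() && child.label.equals("AND")) {
                flat.addAll(flattenAnd(child));
            } else {
                flat.add(child);
            }
        }
        return flat;
    }

    /** Returns the flattened child list of every AND in the tree, including ANDs under ORs. */
    static List<List<Node>> andGroups(Node root) {
        List<List<Node>> groups = new ArrayList<>();
        if (root.isLeaf()) {
            return groups;
        }
        if (root.label.equals("AND")) {
            List<Node> flat = flattenAnd(root);
            groups.add(flat);
            for (Node n : flat) {
                groups.addAll(andGroups(n)); // an OR inside this AND may hold further ANDs
            }
        } else {
            for (Node child : root.children) {
                groups.addAll(andGroups(child));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        Node chained = new Node("AND", new Node("a >= 1"),
                new Node("AND", new Node("a <= 10"), new Node("b > 10")));
        System.out.println(flattenAnd(chained)); // [a >= 1, a <= 10, b > 10]

        Node recursive = new Node("OR",
                new Node("AND", new Node("a >= 1"), new Node("a <= 10"), new Node("a is not null")),
                new Node("b > 10"));
        System.out.println(andGroups(recursive)); // [[a >= 1, a <= 10, a is not null]]
    }
}
```

Flattening first means the chained case and the flat case look the same to whatever merge logic runs afterwards, which keeps that logic from having to reason about nesting depth.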
Purpose
Currently, Spark converts a Between predicate into a conjunction of GreaterOrEqual and LessOrEqual. This PR recognizes that pattern and converts some And CompoundPredicates into a single Between LeafPredicate.
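As a rough illustration of the pattern, the following Java sketch merges a `>=` and a `<=` leaf on the same field into one BETWEEN among the children of an AND. It is a simplified sketch under stated assumptions, not the implementation in this PR: `Leaf` and `mergeAndChildren` are hypothetical names, literals are restricted to `int`, and only the first lower and upper bound per field are paired.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BetweenMergeSketch {

    /** Hypothetical stand-in for a leaf predicate; not Paimon's LeafPredicate. */
    static final class Leaf {
        final String field;
        final String op;  // ">=", "<=", ">", ... or "BETWEEN"
        final int value;  // literal (lower bound for BETWEEN)
        final int upper;  // upper bound, only meaningful for BETWEEN

        Leaf(String field, String op, int value, int upper) {
            this.field = field;
            this.op = op;
            this.value = value;
            this.upper = upper;
        }

        static Leaf of(String field, String op, int value) {
            return new Leaf(field, op, value, value);
        }

        @Override
        public String toString() {
            return op.equals("BETWEEN")
                    ? field + " BETWEEN " + value + " AND " + upper
                    : field + " " + op + " " + value;
        }
    }

    /** Replaces the first (field >= lo, field <= hi) pair per field with one BETWEEN leaf. */
    static List<Leaf> mergeAndChildren(List<Leaf> children) {
        Map<String, Leaf> lowers = new HashMap<>();
        Map<String, Leaf> uppers = new HashMap<>();
        for (Leaf c : children) {
            if (c.op.equals(">=")) lowers.putIfAbsent(c.field, c);
            if (c.op.equals("<=")) uppers.putIfAbsent(c.field, c);
        }
        List<Leaf> merged = new ArrayList<>();
        for (Leaf c : children) {
            Leaf lo = lowers.get(c.field);
            Leaf hi = uppers.get(c.field);
            boolean paired = lo != null && hi != null;
            if (paired && c == lo) {
                // Emit the BETWEEN at the position of the lower bound.
                merged.add(new Leaf(c.field, "BETWEEN", lo.value, hi.value));
            } else if (!(paired && c == hi)) {
                // Keep everything except the upper bound that was folded into the BETWEEN.
                merged.add(c);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Leaf> children = Arrays.asList(
                Leaf.of("a", ">=", 1), Leaf.of("a", "<=", 10), Leaf.of("b", ">", 10));
        System.out.println(mergeAndChildren(children)); // [a BETWEEN 1 AND 10, b > 10]
    }
}
```

Predicates that do not take part in a pair, such as `b > 10` or a second lower bound on the same field, are left untouched, so the rewritten AND stays equivalent to the original.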
Linked issue: none
Tests
Please see org.apache.paimon.spark.sql.SparkV2FilterConverterTestBase
API and Format
No changes.
Documentation
No changes.
Generative AI tooling
This PR is fully hand-written.