ORC-XXX: Support orc.compression.zstd.workers #1756
Evaluating if this has any benefits in ORC.
Thanks @dongjoon-hyun for doing this. I also wanted to introduce this configuration after the zstd-jni merge, since Spark and Parquet have similar configurations.
Ya, indeed. BTW, it seems that there is no perf gain with this so far. Interesting.
Based on verification of this PR in a production environment, I tested Although Parquet also provides options for the number of zstd workers. https://facebook.github.io/zstd/zstd_manual.html
Thank you for double-checking. Ya, it seems that our implementation has some limitations or a bug. Apache Spark also has the ZStandardCodec implementation based on this. I'm still digging into this because I believe this should be a part of Apache ORC 2.0.0.
1775f51 to 4104f7c
What changes were proposed in this pull request?
Why are the changes needed?
How was this patch tested?
Was this patch authored or co-authored using generative AI tooling?