| 1 | +<!-- |
| 2 | +Licensed to the Apache Software Foundation (ASF) under one |
| 3 | +or more contributor license agreements. See the NOTICE file |
| 4 | +distributed with this work for additional information |
| 5 | +regarding copyright ownership. The ASF licenses this file |
| 6 | +to you under the Apache License, Version 2.0 (the |
| 7 | +"License"); you may not use this file except in compliance |
| 8 | +with the License. You may obtain a copy of the License at |
| 9 | + |
| 10 | + http://www.apache.org/licenses/LICENSE-2.0 |
| 11 | + |
| 12 | +Unless required by applicable law or agreed to in writing, |
| 13 | +software distributed under the License is distributed on an |
| 14 | +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| 15 | +KIND, either express or implied. See the License for the |
| 16 | +specific language governing permissions and limitations |
| 17 | +under the License. |
| 18 | +--> |
| 19 | + |
| 20 | +# DataFusion Comet 0.11.0 Changelog |
| 21 | + |
| 22 | +This release consists of 131 commits from 15 contributors. See credits at the end of this changelog for more information. |
| 23 | + |
| 24 | +**Fixed bugs:** |
| 25 | + |
| 26 | +- fix: temporarily ignore test for hdfs file systems [#2359](https://github.com/apache/datafusion-comet/pull/2359) (parthchandra) |
| 27 | +- fix: Check reused broadcast plan in non-AQE and make setNumPartitions thread safe [#2398](https://github.com/apache/datafusion-comet/pull/2398) (wForget) |
| 28 | +- fix: correct `missingInput` for `CometHashAggregateExec` [#2409](https://github.com/apache/datafusion-comet/pull/2409) (comphead) |
| 29 | +- fix: clippy errors rust 1.9.0 update [#2419](https://github.com/apache/datafusion-comet/pull/2419) (coderfender) |
| 30 | +- fix: Avoid spark plan execution cache preventing CometBatchRDD numPartitions change [#2420](https://github.com/apache/datafusion-comet/pull/2420) (wForget) |
| 31 | +- fix: regressions in `CometToPrettyStringSuite` [#2384](https://github.com/apache/datafusion-comet/pull/2384) (hsiang-c) |
| 32 | +- fix: Byte array Literals failed on cast [#2432](https://github.com/apache/datafusion-comet/pull/2432) (comphead) |
| 33 | +- fix: Do not push down subquery filters on native_datafusion scan [#2438](https://github.com/apache/datafusion-comet/pull/2438) (wForget) |
| 34 | +- fix: Improve error handling when resolving S3 bucket region [#2440](https://github.com/apache/datafusion-comet/pull/2440) (andygrove) |
| 35 | +- fix: [iceberg] additional parquet independent api for iceberg integration [#2442](https://github.com/apache/datafusion-comet/pull/2442) (parthchandra) |
| 36 | +- fix: Specify reqwest crate features [#2446](https://github.com/apache/datafusion-comet/pull/2446) (andygrove) |
| 37 | +- fix: distributed RangePartitioning bounds calculation with native shuffle [#2258](https://github.com/apache/datafusion-comet/pull/2258) (mbutrovich) |
| 38 | +- fix: fix regression in tpcbench.py [#2512](https://github.com/apache/datafusion-comet/pull/2512) (andygrove) |
| 39 | +- fix: [iceberg] Close reader instance in ReadConf [#2510](https://github.com/apache/datafusion-comet/pull/2510) (hsiang-c) |
| 40 | +- fix: Enable plan stability tests for `auto` scan [#2516](https://github.com/apache/datafusion-comet/pull/2516) (andygrove) |
| 41 | +- fix: Capture unexpected output when retrieving JVM 17 args in Makefile [#2566](https://github.com/apache/datafusion-comet/pull/2566) (zuston) |
| 42 | + |
| 43 | +**Performance related:** |
| 44 | + |
| 45 | +- perf: New Configuration from shared conf to avoid high costs [#2402](https://github.com/apache/datafusion-comet/pull/2402) (wForget) |
| 46 | +- perf: Use DataFusion's `count_udaf` instead of `SUM(IF(expr IS NOT NULL, 1, 0))` [#2407](https://github.com/apache/datafusion-comet/pull/2407) (andygrove) |
| 47 | +- perf: Improve BroadcastExchangeExec conversion [#2417](https://github.com/apache/datafusion-comet/pull/2417) (wForget) |
| 48 | + |
| 49 | +**Implemented enhancements:** |
| 50 | + |
| 51 | +- feat: Add dynamic `enabled` and `allowIncompat` configs for all supported expressions [#2329](https://github.com/apache/datafusion-comet/pull/2329) (andygrove) |
| 52 | +- feat: feature specific tests [#2372](https://github.com/apache/datafusion-comet/pull/2372) (parthchandra) |
| 53 | +- feat: Support more date part expressions [#2316](https://github.com/apache/datafusion-comet/pull/2316) (wForget) |
| 54 | +- feat: rpad support column for second arg instead of just literal [#2099](https://github.com/apache/datafusion-comet/pull/2099) (coderfender) |
| 55 | +- feat: Support comet native log level conf [#2379](https://github.com/apache/datafusion-comet/pull/2379) (wForget) |
| 56 | +- feat: Enable WeekDay function [#2411](https://github.com/apache/datafusion-comet/pull/2411) (wForget) |
| 57 | +- feat: Add nested Array literal support [#2181](https://github.com/apache/datafusion-comet/pull/2181) (comphead) |
| 58 | +- feat:add_additional_char_support_rpad [#2436](https://github.com/apache/datafusion-comet/pull/2436) (coderfender) |
| 59 | +- feat: do not fallback to Spark for `COUNT(distinct)` [#2429](https://github.com/apache/datafusion-comet/pull/2429) (comphead) |
| 60 | +- feat: implement_ansi_eval_mode_arithmetic [#2136](https://github.com/apache/datafusion-comet/pull/2136) (coderfender) |
| 61 | +- feat: Add plan conversion statistics to extended explain info [#2412](https://github.com/apache/datafusion-comet/pull/2412) (andygrove) |
| 62 | +- feat: implement_comet_native_lpad_expr [#2102](https://github.com/apache/datafusion-comet/pull/2102) (coderfender) |
| 63 | +- feat: Add `backtrace` feature to simplify enabling native backtraces in `CometNativeException` [#2515](https://github.com/apache/datafusion-comet/pull/2515) (andygrove) |
| 64 | +- feat: Support reverse function with ArrayType input [#2481](https://github.com/apache/datafusion-comet/pull/2481) (cfmcgrady) |
| 65 | +- feat: Change default off-heap memory pool from `greedy_unified` to `fair_unified` [#2526](https://github.com/apache/datafusion-comet/pull/2526) (andygrove) |
| 66 | +- feat: Make DiskManager `max_temp_directory_size` configurable [#2479](https://github.com/apache/datafusion-comet/pull/2479) (manuzhang) |
| 67 | +- feat: Parquet Modular Encryption with Spark KMS for native readers [#2447](https://github.com/apache/datafusion-comet/pull/2447) (mbutrovich) |
| 68 | +- feat: Add support for Spark-compatible cast from integral to decimal [#2472](https://github.com/apache/datafusion-comet/pull/2472) (coderfender) |
| 69 | +- feat:Support ANSI mode integral divide [#2421](https://github.com/apache/datafusion-comet/pull/2421) (coderfender) |
| 70 | +- feat: Add config to enable running Comet in onheap mode [#2554](https://github.com/apache/datafusion-comet/pull/2554) (andygrove) |
| 71 | +- feat:support ansi mode rounding function [#2542](https://github.com/apache/datafusion-comet/pull/2542) (coderfender) |
| 72 | +- feat:support ansi mode remainder function [#2556](https://github.com/apache/datafusion-comet/pull/2556) (coderfender) |
| 73 | +- feat: Implement array-to-string cast support [#2425](https://github.com/apache/datafusion-comet/pull/2425) (cfmcgrady) |
| 74 | +- feat: Various improvements to memory pool configuration, logging, and documentation [#2538](https://github.com/apache/datafusion-comet/pull/2538) (andygrove) |
| 75 | +- feat: Enable complex types for columnar shuffle [#2573](https://github.com/apache/datafusion-comet/pull/2573) (mbutrovich) |
| 76 | +- feat: support_decimal_types_bool_cast_native_impl [#2490](https://github.com/apache/datafusion-comet/pull/2490) (coderfender) |
| 77 | +- feat: Use buf write to reduce system call on index write [#2579](https://github.com/apache/datafusion-comet/pull/2579) (zuston) |
| 78 | + |
| 79 | +**Documentation updates:** |
| 80 | + |
| 81 | +- doc: Document usage IcebergCometBatchReader.java [#2347](https://github.com/apache/datafusion-comet/pull/2347) (comphead) |
| 82 | +- docs: Add changelog for 0.10.0 release [#2361](https://github.com/apache/datafusion-comet/pull/2361) (andygrove) |
| 83 | +- docs: Fix error in docs [#2373](https://github.com/apache/datafusion-comet/pull/2373) (andygrove) |
| 84 | +- docs: Fix more comet versions in docs [#2374](https://github.com/apache/datafusion-comet/pull/2374) (andygrove) |
| 85 | +- docs: Publish 0.10.0 user guide [#2394](https://github.com/apache/datafusion-comet/pull/2394) (andygrove) |
| 86 | +- doc: macos benches doc clarifications [#2418](https://github.com/apache/datafusion-comet/pull/2418) (comphead) |
| 87 | +- docs: update configs.md after #2422 [#2428](https://github.com/apache/datafusion-comet/pull/2428) (mbutrovich) |
| 88 | +- docs: update docs and tuning guide related to native shuffle [#2487](https://github.com/apache/datafusion-comet/pull/2487) (mbutrovich) |
| 89 | +- docs: Improve EC2 benchmarking guide [#2474](https://github.com/apache/datafusion-comet/pull/2474) (andygrove) |
| 90 | +- docs: docs_update_ansi_support [#2496](https://github.com/apache/datafusion-comet/pull/2496) (coderfender) |
| 91 | +- docs:support lpad expression documentation update [#2517](https://github.com/apache/datafusion-comet/pull/2517) (coderfender) |
| 92 | +- docs: doc changes to support ANSI mode integral divide [#2570](https://github.com/apache/datafusion-comet/pull/2570) (coderfender) |
| 93 | +- docs: Split configuration guide into different sections (scan, exec, shuffle, etc) [#2568](https://github.com/apache/datafusion-comet/pull/2568) (andygrove) |
| 94 | +- docs: doc update to support ANSI mode remainder function [#2576](https://github.com/apache/datafusion-comet/pull/2576) (coderfender) |
| 95 | +- docs: Documentation updates [#2581](https://github.com/apache/datafusion-comet/pull/2581) (andygrove) |
| 96 | + |
| 97 | +**Other:** |
| 98 | + |
| 99 | +- chore(deps): bump uuid from 1.18.0 to 1.18.1 in /native [#2336](https://github.com/apache/datafusion-comet/pull/2336) (dependabot[bot]) |
| 100 | +- build: Check that all Scala test suites run in PR builds [#2304](https://github.com/apache/datafusion-comet/pull/2304) (andygrove) |
| 101 | +- chore: Start 0.11.0 development [#2365](https://github.com/apache/datafusion-comet/pull/2365) (andygrove) |
| 102 | +- chore: Split expression serde hash map into separate categories [#2322](https://github.com/apache/datafusion-comet/pull/2322) (andygrove) |
| 103 | +- chore: exclude Iceberg diffs from rat checks [#2376](https://github.com/apache/datafusion-comet/pull/2376) (hsiang-c) |
| 104 | +- chore: Refactor UnaryMinus serde [#2378](https://github.com/apache/datafusion-comet/pull/2378) (andygrove) |
| 105 | +- chore: Revert "chore: [1941-Part1]: Introduce `map_sort` scalar function (#2… [#2381](https://github.com/apache/datafusion-comet/pull/2381) (comphead) |
| 106 | +- chore: Refactor Literal serde [#2377](https://github.com/apache/datafusion-comet/pull/2377) (andygrove) |
| 107 | +- chore: Output `BaseAggregateExec` accurate unsupported names [#2383](https://github.com/apache/datafusion-comet/pull/2383) (comphead) |
| 108 | +- chore: Improve Initcap test and docs [#2387](https://github.com/apache/datafusion-comet/pull/2387) (andygrove) |
| 109 | +- build: fix build of 'hdfs-opendal' feature for MacOS [#2392](https://github.com/apache/datafusion-comet/pull/2392) (parthchandra) |
| 110 | +- chore(deps): bump cc from 1.2.36 to 1.2.37 in /native [#2399](https://github.com/apache/datafusion-comet/pull/2399) (dependabot[bot]) |
| 111 | +- chore: [iceberg] support Iceberg 1.9.1 [#2386](https://github.com/apache/datafusion-comet/pull/2386) (hsiang-c) |
| 112 | +- minor: Add deprecation notice to `datafusion-comet-spark-expr` crate [#2405](https://github.com/apache/datafusion-comet/pull/2405) (andygrove) |
| 113 | +- minor: Update benchmarking scripts to specify scan implementation [#2403](https://github.com/apache/datafusion-comet/pull/2403) (andygrove) |
| 114 | +- refactor: Scala hygiene - remove `scala.collection.JavaConverters` [#2393](https://github.com/apache/datafusion-comet/pull/2393) (hsiang-c) |
| 115 | +- chore: Improve test coverage for `count` aggregates [#2406](https://github.com/apache/datafusion-comet/pull/2406) (andygrove) |
| 116 | +- chore: upgrade to DataFusion 50.0.0, Arrow 56.1.0, Parquet 56.0.0 among others [#2286](https://github.com/apache/datafusion-comet/pull/2286) (mbutrovich) |
| 117 | +- chore: Support Spark 4.0.1 instead of 4.0.0 [#2414](https://github.com/apache/datafusion-comet/pull/2414) (andygrove) |
| 118 | +- chore: Respect native features env for cargo commands [#2296](https://github.com/apache/datafusion-comet/pull/2296) (wForget) |
| 119 | +- minor: Update TPC-DS microbenchmarks to remove "scan only" and "exec only" runs [#2396](https://github.com/apache/datafusion-comet/pull/2396) (andygrove) |
| 120 | +- minor: Add RDDScan to default value of sparkToColumnar.supportedOperatorList [#2422](https://github.com/apache/datafusion-comet/pull/2422) (wForget) |
| 121 | +- chore: new TPC-DS golden plans [#2426](https://github.com/apache/datafusion-comet/pull/2426) (mbutrovich) |
| 122 | +- chore: fix `pr_build*.yml` [#2434](https://github.com/apache/datafusion-comet/pull/2434) (comphead) |
| 123 | +- chore: Remove unused class [#2437](https://github.com/apache/datafusion-comet/pull/2437) (wForget) |
| 124 | +- chore(deps): bump cc from 1.2.37 to 1.2.38 in /native [#2439](https://github.com/apache/datafusion-comet/pull/2439) (dependabot[bot]) |
| 125 | +- chore: add validate_workflows.yml [#2441](https://github.com/apache/datafusion-comet/pull/2441) (comphead) |
| 126 | +- test: potential native broadcast failure in scenarios with ReusedExchange [#2167](https://github.com/apache/datafusion-comet/pull/2167) (akupchinskiy) |
| 127 | +- chore: Improvements of fallback info [#2450](https://github.com/apache/datafusion-comet/pull/2450) (wForget) |
| 128 | +- chore: Upgrade Apache Release Audit Tool (RAT) to 0.16.1 [#2451](https://github.com/apache/datafusion-comet/pull/2451) (andygrove) |
| 129 | +- minor: Remove reference to SortExec deadlock issue that is now resolved [#2464](https://github.com/apache/datafusion-comet/pull/2464) (andygrove) |
| 130 | +- chore: Use checked operations when growing or shrinking unified memory pool [#2455](https://github.com/apache/datafusion-comet/pull/2455) (andygrove) |
| 131 | +- minor: Improve the log message of `CometTestBase#checkCometOperators` [#2458](https://github.com/apache/datafusion-comet/pull/2458) (cfmcgrady) |
| 132 | +- minor: Skip calculating per-task memory limit when in off-heap mode [#2462](https://github.com/apache/datafusion-comet/pull/2462) (andygrove) |
| 133 | +- Chore: Used DataFusion impl of bit_get function [#2466](https://github.com/apache/datafusion-comet/pull/2466) (kazantsev-maksim) |
| 134 | +- chore(deps): bump regex from 1.11.2 to 1.11.3 in /native [#2483](https://github.com/apache/datafusion-comet/pull/2483) (dependabot[bot]) |
| 135 | +- chore: update TPC-DS plans after #2429 [#2486](https://github.com/apache/datafusion-comet/pull/2486) (mbutrovich) |
| 136 | +- chore(deps): bump thiserror from 2.0.16 to 2.0.17 in /native [#2485](https://github.com/apache/datafusion-comet/pull/2485) (dependabot[bot]) |
| 137 | +- chore(deps): bump cc from 1.2.38 to 1.2.39 in /native [#2484](https://github.com/apache/datafusion-comet/pull/2484) (dependabot[bot]) |
| 138 | +- chore: Support running specific benchmark query [#2491](https://github.com/apache/datafusion-comet/pull/2491) (comphead) |
| 139 | +- chore: Make CometColumnarToRowExec extends CometPlan [#2460](https://github.com/apache/datafusion-comet/pull/2460) (wForget) |
| 140 | +- chore: Update artifacts to 0.10.0 [#2500](https://github.com/apache/datafusion-comet/pull/2500) (comphead) |
| 141 | +- build: Stop caching libcomet in CI [#2498](https://github.com/apache/datafusion-comet/pull/2498) (andygrove) |
| 142 | +- chore: Upgrade Maven plugins [#2494](https://github.com/apache/datafusion-comet/pull/2494) (andygrove) |
| 143 | +- Chore: Used DataFusion impl of date_add and date_sub functions [#2473](https://github.com/apache/datafusion-comet/pull/2473) (kazantsev-maksim) |
| 144 | +- minor: include taskAttemptId in log messages [#2467](https://github.com/apache/datafusion-comet/pull/2467) (andygrove) |
| 145 | +- chore: Improve test assertions in plan stability suite [#2505](https://github.com/apache/datafusion-comet/pull/2505) (andygrove) |
| 146 | +- build: Add Spark 4.0 to release build script [#2514](https://github.com/apache/datafusion-comet/pull/2514) (parthchandra) |
| 147 | +- chore: Enable plan stability tests for `native_iceberg_compat` [#2519](https://github.com/apache/datafusion-comet/pull/2519) (andygrove) |
| 148 | +- chore(deps): bump parking_lot from 0.12.4 to 0.12.5 in /native [#2530](https://github.com/apache/datafusion-comet/pull/2530) (dependabot[bot]) |
| 149 | +- chore(deps): bump cc from 1.2.39 to 1.2.40 in /native [#2529](https://github.com/apache/datafusion-comet/pull/2529) (dependabot[bot]) |
| 150 | +- chore: Refactor serde for `ArrayCompact` and `ArrayFilter` [#2536](https://github.com/apache/datafusion-comet/pull/2536) (andygrove) |
| 151 | +- Chore: Fix Scala code warnings - common module [#2527](https://github.com/apache/datafusion-comet/pull/2527) (andy-hf-kwok) |
| 152 | +- chore: Refactor serde for `CheckOverflow` [#2537](https://github.com/apache/datafusion-comet/pull/2537) (andygrove) |
| 153 | +- build: Run scala tests against release build of native code [#2541](https://github.com/apache/datafusion-comet/pull/2541) (andygrove) |
| 154 | +- chore: Pass Comet configs to native `createPlan` [#2543](https://github.com/apache/datafusion-comet/pull/2543) (andygrove) |
| 155 | +- chore: Refactor serde for Length [#2547](https://github.com/apache/datafusion-comet/pull/2547) (andygrove) |
| 156 | +- chore: Include spark shim sources for spotless plugin and reformat [#2557](https://github.com/apache/datafusion-comet/pull/2557) (wForget) |
| 157 | +- chore(deps): bump opendal from 0.54.0 to 0.54.1 in /native [#2559](https://github.com/apache/datafusion-comet/pull/2559) (dependabot[bot]) |
| 158 | +- chore: Finish moving Cast serde out of QueryPlanSerde [#2550](https://github.com/apache/datafusion-comet/pull/2550) (andygrove) |
| 159 | +- chore: Use cargo-nextest in CI [#2546](https://github.com/apache/datafusion-comet/pull/2546) (andygrove) |
| 160 | +- chore: Delete unused code [#2565](https://github.com/apache/datafusion-comet/pull/2565) (zuston) |
| 161 | +- chore: Improve plan comet transformation log [#2564](https://github.com/apache/datafusion-comet/pull/2564) (wForget) |
| 162 | +- chore(deps): bump cc from 1.2.40 to 1.2.41 in /native [#2560](https://github.com/apache/datafusion-comet/pull/2560) (dependabot[bot]) |
| 163 | +- chore(deps): bump aws-credential-types from 1.2.6 to 1.2.7 in /native [#2563](https://github.com/apache/datafusion-comet/pull/2563) (dependabot[bot]) |
| 164 | +- chore: Refactor serde for RegExpReplace [#2548](https://github.com/apache/datafusion-comet/pull/2548) (andygrove) |
| 165 | +- chore: use polymorphic map builders in shuffle. [#2571](https://github.com/apache/datafusion-comet/pull/2571) (ashdnazg) |
| 166 | +- chore: Move ToPrettyString serde into shim layer [#2549](https://github.com/apache/datafusion-comet/pull/2549) (andygrove) |
| 167 | +- chore(deps): bump DataFusion dependencies to 50.2.0, refresh Cargo.lock [#2575](https://github.com/apache/datafusion-comet/pull/2575) (mbutrovich) |
| 168 | + |
| 169 | +## Credits |
| 170 | + |
| 171 | +Thank you to everyone who contributed to this release. Here is a breakdown of commits (PRs merged) per contributor. |
| 172 | + |
| 173 | +``` |
| 174 | + 47 Andy Grove |
| 175 | + 15 Zhen Wang |
| 176 | + 14 B Vadlamani |
| 177 | + 12 Oleks V |
| 178 | + 11 dependabot[bot] |
| 179 | + 10 Matt Butrovich |
| 180 | + 5 Parth Chandra |
| 181 | + 5 hsiang-c |
| 182 | + 3 Fu Chen |
| 183 | + 3 Junfan Zhang |
| 184 | + 2 Kazantsev Maksim |
| 185 | + 1 Artem Kupchinskiy |
| 186 | + 1 Eshed Schacham |
| 187 | + 1 Manu Zhang |
| 188 | + 1 andy-hf-kwok |
| 189 | +``` |
| 190 | + |
| 191 | +Thank you also to everyone who contributed in other ways such as filing issues, reviewing PRs, and providing feedback on this release. |
| 192 | + |