test 1 #1

ding-young · 2025-07-05T12:57:26Z

Which issue does this PR close?

Closes #.

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

…pache#16019)
* draft commit to roll back changes on function naming and include prepare clause on the infer types tests
* include data types in plan when it is not included in the prepare statement
* fix: prepare statement error
* Update datafusion/sql/src/statement.rs
  Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* remove infer types from prepare statement; the infer data type changes in statement will be introduced in a new PR
* fix to show correct output message
* include data types on logical plans of prepare statements without explicit type declaration
* fix using clippy suggestions
* explicitly get the data types using the placeholder id to avoid sorting
* Restore the original tests too
* update set data type routine to be more rust idiomatic
  Co-authored-by: Tommy shu <qstommyshu@gmail.com>
* update set datatype routine
* fix formatting in sql_integration
---------
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Tommy shu <qstommyshu@gmail.com>

…ache#16119) * minor fixes to arch docs Co-authored-by: Oleks V <comphead@users.noreply.github.com> --------- Co-authored-by: Oleks V <comphead@users.noreply.github.com>

add snapshot tests for memory exhaustion

…s & nested Column expressions in maybe_fix_physical_column_name (apache#16064)
* Fix union schema name coercion
* Address renaming for columns that are not in the top level as well
* Add unit test
* Format
* Use insta tests properly
* Address review - comment + minor simplification change
---------
Co-authored-by: Berkay Şahin <124376117+berkaysynnada@users.noreply.github.com>

…6071) * initial Iteration * add Sql Logic tests * tweak comments * unify data, structure tests * Deleted by mistake

* Move prepare/parameter handling tests into `params.rs` * Resolve conflicts

…pache#16029)
* Support filtering specific sqllogictests identified by line number
* Add license header
* Try parsing in different dialects
* Add test filtering example to README.md
* Improve Filter doc comment
* Factor out statement_is_skippable into its own function
* Add example about how filters work in the doc comments

…hausted errors (apache#16152) * Enrich GroupedHashAggregateStream name to ease debugging Resources exhausted errors * Use human_display * clippy

Bumps [uuid](https://github.com/uuid-rs/uuid) from 1.16.0 to 1.17.0. - [Release notes](https://github.com/uuid-rs/uuid/releases) - [Commits](uuid-rs/uuid@v1.16.0...v1.17.0) --- updated-dependencies: - dependency-name: uuid dependency-version: 1.17.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Both the WHERE clause and the HAVING clause translate to a Filter plan node. They differ in how references and aggregates are handled. HAVING is applied after aggregation and may reference aggregate expressions, so its filter is placed after the Aggregation plan node. Once a plan has been built, however, filters created from HAVING carry no special additional semantics. Remove the unnecessary field. For reference, the field was added along with its usage in commit a50aeef, and the usage was later removed in commit eb62e28.
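The placement described above can be illustrated with a toy planner. This is a minimal sketch, not DataFusion's planner: the `Plan` enum and `plan_query` function are hypothetical names invented for illustration, and predicates are kept as plain strings. It only shows that both clauses produce the same kind of node and differ in where that node sits relative to the aggregate.

```rust
// Toy logical plan; hypothetical types, not DataFusion's LogicalPlan.
#[derive(Debug, PartialEq)]
enum Plan {
    Scan(String),
    Filter { predicate: String, input: Box<Plan> },
    Aggregate { group_by: String, input: Box<Plan> },
}

// Build a plan for: SELECT ... FROM table [WHERE w] GROUP BY g [HAVING h].
fn plan_query(
    table: &str,
    where_clause: Option<&str>,
    group_by: &str,
    having: Option<&str>,
) -> Plan {
    let mut plan = Plan::Scan(table.to_string());
    // WHERE becomes a Filter below the aggregate.
    if let Some(p) = where_clause {
        plan = Plan::Filter { predicate: p.to_string(), input: Box::new(plan) };
    }
    plan = Plan::Aggregate { group_by: group_by.to_string(), input: Box::new(plan) };
    // HAVING becomes the same kind of Filter node, placed above the aggregate.
    if let Some(p) = having {
        plan = Plan::Filter { predicate: p.to_string(), input: Box::new(plan) };
    }
    plan
}

fn main() {
    let plan = plan_query("t", Some("a > 1"), "b", Some("count(*) > 2"));
    println!("{plan:#?}");
}
```

In this sketch the `Filter` variant has no flag recording which clause it came from, which mirrors the commit's point that such a marker is not needed once the plan is built.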

) * Clarify docs and names in parquet predicate pushdown tests * Update datafusion/datasource/src/file_scan_config.rs Co-authored-by: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com> * clippy --------- Co-authored-by: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com>

…16175) * Fix name() for FilterPushdown physical optimizer rule Typo that wasn't caught during review... * fix

fix according to review fix to_string error fix test by stripping backtrace

…che#16138) Added `tables: HashMap<String, Arc<dyn TableSource>>` and `MyContextProvider::with_schema` method for dynamically defining tables for optimizer integration tests.
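A builder of this shape might look roughly like the sketch below. Only the field type `HashMap<String, Arc<dyn TableSource>>` and the method name `with_schema` come from the commit message; the `TableSource` trait here is a local stand-in, and `SimpleTable`, `get_table`, and the `with_schema` parameters are assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Local stand-in for a table abstraction (assumption, not DataFusion's trait).
trait TableSource {
    fn columns(&self) -> Vec<String>;
}

// Hypothetical table that only knows its column names.
struct SimpleTable {
    columns: Vec<String>,
}

impl TableSource for SimpleTable {
    fn columns(&self) -> Vec<String> {
        self.columns.clone()
    }
}

#[derive(Default)]
struct MyContextProvider {
    tables: HashMap<String, Arc<dyn TableSource>>,
}

impl MyContextProvider {
    // Register a table under `name`; the parameter list is an assumption.
    fn with_schema(mut self, name: &str, columns: &[&str]) -> Self {
        let table = SimpleTable {
            columns: columns.iter().map(|c| c.to_string()).collect(),
        };
        self.tables.insert(name.to_string(), Arc::new(table));
        self
    }

    // Hypothetical lookup helper used by tests.
    fn get_table(&self, name: &str) -> Option<Arc<dyn TableSource>> {
        self.tables.get(name).cloned()
    }
}

fn main() {
    let provider = MyContextProvider::default()
        .with_schema("orders", &["id", "total"])
        .with_schema("users", &["id", "name"]);
    if let Some(table) = provider.get_table("orders") {
        println!("orders columns: {:?}", table.columns());
    }
}
```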

* Speedup tpch run with memtable * Clippy * Clippy

* Specialize unique join * handle splitting * rename a bit * fix * fix * fix * fix * Fix the test, add explanation * Simplify * Update datafusion/physical-plan/src/joins/join_hash_map.rs Co-authored-by: Christian <9384305+ctsk@users.noreply.github.com> * Update datafusion/physical-plan/src/joins/join_hash_map.rs Co-authored-by: Christian <9384305+ctsk@users.noreply.github.com> * Simplify * Simplify * Simplify --------- Co-authored-by: Christian <9384305+ctsk@users.noreply.github.com>

…e#16079) * added test * added parameterTest * cargo fmt * Update sql_integration.rs * allow needless_lifetimes * remove needless lifetime * update some tests * move to params.rs

* feat: array_length for fixed size list * remove list view

…tion` (apache#16164)

Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.45.0 to 1.45.1. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](tokio-rs/tokio@tokio-1.45.0...tokio-1.45.1) --- updated-dependencies: - dependency-name: tokio dependency-version: 1.45.1 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

…#16127) * Add failing test to demonstrate problem * Improve `unproject_sort_expr` to handle arbitrary expressions (apache#83) * Remove redundant return

Bumps [rustyline](https://github.com/kkawakam/rustyline) from 15.0.0 to 16.0.0. - [Release notes](https://github.com/kkawakam/rustyline/releases) - [Changelog](https://github.com/kkawakam/rustyline/blob/master/History.md) - [Commits](kkawakam/rustyline@v15.0.0...v16.0.0) --- updated-dependencies: - dependency-name: rustyline dependency-version: 16.0.0 dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

ADD sha2 spark function

* Add macro for creating DataFrame (apache#16090) --------- Co-authored-by: Sergey Zhukov <szhukov@aligntech.com>

* migrate `logical_plan` tests to insta * fix assert error * fix according to review * strip backtrace from internal error * format * format * fix `format("outer_query")` * fix `Internal` error

Bumps [clap](https://github.com/clap-rs/clap) from 4.5.38 to 4.5.39. - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - [Commits](clap-rs/clap@clap_complete-v4.5.38...clap_complete-v4.5.39) --- updated-dependencies: - dependency-name: clap dependency-version: 4.5.39 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

See: - apache#14141 - apache/datafusion-sqlparser-rs#1909

… sensitive to expected non determinism (apache#16501)

* Add support for Arrow Time types in Substrait
  This commit adds support for Arrow Time types Time32 and Time64 in Substrait plans.
  Resolves apache#16296
  Resolves apache#16275
* Clean up test

…16610) * fix: support scalar function nested in get_field * update * update test * fix bug * update

Bumps [substrait](https://github.com/substrait-io/substrait-rs) from 0.57.0 to 0.58.0. - [Release notes](https://github.com/substrait-io/substrait-rs/releases) - [Changelog](https://github.com/substrait-io/substrait-rs/blob/main/CHANGELOG.md) - [Commits](substrait-io/substrait-rs@v0.57.0...v0.58.0) --- updated-dependencies: - dependency-name: substrait dependency-version: 0.58.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Support explain tree format debug for benchmark debug * fmt * format * Address comments * doc fix

* Add microbenchmark for spilling with compression * add wide batch * make num_rows configurable * calculate write/read throughput

…n scan (apache#16646) * respect parquet filter pushdown config in scan * Add test

Bumps [aws-config](https://github.com/smithy-lang/smithy-rs) from 1.8.0 to 1.8.1. - [Release notes](https://github.com/smithy-lang/smithy-rs/releases) - [Changelog](https://github.com/smithy-lang/smithy-rs/blob/main/CHANGELOG.md) - [Commits](https://github.com/smithy-lang/smithy-rs/commits) --- updated-dependencies: - dependency-name: aws-config dependency-version: 1.8.1 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* feat: replace snapshot tests for enforce_sorting * feat: modify assert_optimized macro to test one snapshot with a combined physical plan * feat: update assert_optimized to support snapshot testing * Revert "feat: replace snapshot tests for enforce_sorting" This reverts commit 8c921fa. * feat: migrate core test to insta * fix format * fix format * fix typo * refactor: rename function * fix: remove trimming * refactor: replace get_plan_string with displayable in projection_pushdown --------- Co-authored-by: Cheng-Yuan-Lai <a186235@g,ail.com> Co-authored-by: Ian Lai <Ian.Lai@senao.com>

Run `cargo test --test sqllogictests -- --complete` and commit the results.

* Add PhysicalExpr optimizer and cast unwrapping * address pr feedback * Update datafusion/pruning/src/pruning_predicate.rs * more lit(Xi64)

Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.45.1 to 1.46.0. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](tokio-rs/tokio@tokio-1.45.1...tokio-1.46.0) --- updated-dependencies: - dependency-name: tokio dependency-version: 1.46.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

…pt limit pushdown (apache#16641) Co-authored-by: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com>

…16615) * Convert Option<Vec<sort expression>> to Vec<sort expression> * clippy * fix comment * fix doc * change back to Expr * remove redundant check
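The commit message does not state its motivation. One plausible reading, sketched below, is that an empty `Vec` can represent "no ordering" on its own, so the `Option` wrapper adds a second, redundant empty state. `SortExpr`, `describe_before`, `describe_after`, and `migrate` are hypothetical names used only for this illustration.

```rust
// Hypothetical sort key; not DataFusion's sort expression type.
#[derive(Debug, Clone, PartialEq)]
struct SortExpr {
    column: String,
    ascending: bool,
}

// Before: callers must treat both `None` and `Some(vec![])` as "unordered".
fn describe_before(order_by: &Option<Vec<SortExpr>>) -> String {
    match order_by {
        Some(exprs) if !exprs.is_empty() => format!("{} sort key(s)", exprs.len()),
        _ => "unordered".to_string(),
    }
}

// After: a single empty state, checked with `is_empty`.
fn describe_after(order_by: &[SortExpr]) -> String {
    if order_by.is_empty() {
        "unordered".to_string()
    } else {
        format!("{} sort key(s)", order_by.len())
    }
}

// Converting old values: `None` maps to an empty vector.
fn migrate(old: Option<Vec<SortExpr>>) -> Vec<SortExpr> {
    old.unwrap_or_default()
}

fn main() {
    let keys = vec![SortExpr { column: "a".to_string(), ascending: true }];
    println!("{}", describe_before(&Some(keys.clone())));
    println!("{}", describe_after(&migrate(Some(keys))));
    println!("{}", describe_after(&migrate(None)));
}
```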

)
* Improve error message when ScalarValue fails to cast array
  The `as_*_array` functions and the `downcast_value!` macro have the benefit of reporting the array type when there is a mismatch. This makes the error message more actionable.
* test
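The idea of reporting the actual type on a failed downcast can be sketched with `std::any::Any`. This is a simplified stand-in rather than Arrow's or DataFusion's code: the `Array` trait, the two array structs, and the error text are assumptions made for illustration.

```rust
use std::any::Any;

// Simplified stand-in for a dynamically typed array (assumption).
trait Array: Any {
    fn type_name(&self) -> &'static str;
    fn as_any(&self) -> &dyn Any;
}

#[allow(dead_code)]
#[derive(Debug)]
struct Int64Array(Vec<i64>);

#[allow(dead_code)]
#[derive(Debug)]
struct StringArray(Vec<String>);

impl Array for Int64Array {
    fn type_name(&self) -> &'static str { "Int64Array" }
    fn as_any(&self) -> &dyn Any { self }
}

impl Array for StringArray {
    fn type_name(&self) -> &'static str { "StringArray" }
    fn as_any(&self) -> &dyn Any { self }
}

// On mismatch, the error names both the expected and the actual type.
fn as_int64_array(array: &dyn Array) -> Result<&Int64Array, String> {
    array
        .as_any()
        .downcast_ref::<Int64Array>()
        .ok_or_else(|| format!("expected Int64Array, got {}", array.type_name()))
}

fn main() {
    let strings = StringArray(vec!["a".to_string()]);
    match as_int64_array(&strings) {
        Ok(_) => println!("downcast succeeded"),
        Err(e) => println!("error: {e}"),
    }
}
```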

* Add an example of embedding indexes inside a parquet file
* Add page image
* Add prune file example
* Fix clippy
* polish code
* Fmt
* address comments
* Add debug
* Add new example, but it will fail with page index
* add debug
* add debug
* polish
* debug
* Using low level API to support
* polish
* fix
* merge
* fix
* complete solution
* polish comments
* adjust image
* add comments part 1
* pin to new arrow-rs
* pin to new arrow-rs
* add comments part 2
* merge upstream
* merge upstream
* polish code
* Rename example and add it to the list
* Work on comments
* More documentation
* Documentation obsession, encapsulate example
* Update datafusion-examples/examples/parquet_embedded_index.rs
  Co-authored-by: Sherin Jacob <jacob@protoship.io>
---------
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Sherin Jacob <jacob@protoship.io>

* Implementation for regex_instr
* linting and typo addressed in bench
* prettier formatting
* scalar_functions_formatting
* linting format macros
* formatting
* address comments to PR
* formatting
* clippy
* fmt
* address docs typo
* remove unnecessary struct and comment
* delete redundant lines add tests for subexp correct function signature for benches
* refactor get_index
* comments addressed
* update doc
* clippy upgrade
---------
Co-authored-by: Nirnay Roy <nirnayroy1012@gmail.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Dmitrii Blaginin <dmitrii@blaginin.me>

…nts (apache#16672)
- Refactored the `DataFusionError` enum to use `Box<T>` for:
  - `ArrowError`
  - `ParquetError`
  - `AvroError`
  - `object_store::Error`
  - `ParserError`
  - `SchemaError`
  - `JoinError`
- Updated all relevant match arms and constructors to handle boxed errors.
- Refactored error-related macros (`arrow_datafusion_err!`, `sql_datafusion_err!`, etc.) to use `Box<T>`.
- Adjusted test cases and error assertions for boxed variants.
- Documentation update to the upgrade guide to explain the required changes and rationale.
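The effect of boxing large variants can be checked with `std::mem::size_of`. The sketch below uses hypothetical types (`LargeError`, `UnboxedError`, `BoxedError`) rather than the real `DataFusionError`. It shows that an enum is at least as large as its largest variant, and that storing that variant behind a `Box` shrinks the enum to roughly pointer size plus whatever the remaining variants need.

```rust
use std::mem::size_of;

// Hypothetical stand-in for a large third-party error payload.
#[allow(dead_code)]
struct LargeError {
    details: [u8; 128],
}

// The enum must be big enough to hold `LargeError` inline.
#[allow(dead_code)]
enum UnboxedError {
    Large(LargeError),
    Message(String),
}

// The large payload lives on the heap; the enum stores only a pointer to it.
#[allow(dead_code)]
enum BoxedError {
    Large(Box<LargeError>),
    Message(String),
}

// Conversions are where the boxing happens, so `?` keeps working for callers.
impl From<LargeError> for BoxedError {
    fn from(e: LargeError) -> Self {
        BoxedError::Large(Box::new(e))
    }
}

fn main() {
    println!("unboxed: {} bytes", size_of::<UnboxedError>());
    println!("boxed:   {} bytes", size_of::<BoxedError>());
}
```

A smaller error type also makes every `Result<T, E>` that carries it smaller. The trade-off is that code matching on a boxed variant has to dereference the box, which is the kind of match-arm and constructor update the commit describes.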

…on and Mapping (apache#16583)
- Introduced a new `schema_adapter_factory` field in `ListingTableConfig` and `ListingTable`
- Added `with_schema_adapter_factory` and `schema_adapter_factory()` methods to both structs
- Modified execution planning logic to apply schema adapters during scanning
- Updated statistics collection to use mapped schemas
- Implemented detailed documentation and example usage in doc comments
- Added new unit and integration tests validating schema adapter behavior and error cases

* Reuse Rows in RowCursorStream * WIP * Fmt * Add comment, make it backwards compatible * Add comment, make it backwards compatible * Add comment, make it backwards compatible * Clippy * Clippy * Return error on non-unique reference * Comment * Update datafusion/physical-plan/src/sorts/stream.rs Co-authored-by: Oleks V <comphead@users.noreply.github.com> * Fix * Extract logic * Doc fix --------- Co-authored-by: Oleks V <comphead@users.noreply.github.com>

apache#16630) * Perf: fast CursorValues compare for StringViewArray using inline_key_fast * fix * polish * polish * add test --------- Co-authored-by: Daniël Heres <danielheres@gmail.com>

One step towards apache#16652. Co-authored-by: Oleks V <comphead@users.noreply.github.com> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Co-authored-by: Daniël Heres <danielheres@gmail.com>

github-actions · 2025-09-04T02:41:02Z

Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.

brayanjuls and others added 30 commits May 21, 2025 17:28

docs: Fix typos and minor grammatical issues in Architecture docs (ap…

39063f6

…ache#16119) * minor fixes to arch docs Co-authored-by: Oleks V <comphead@users.noreply.github.com> --------- Co-authored-by: Oleks V <comphead@users.noreply.github.com>

add top-memory-consumers option in cli (apache#16081)

cb45f1f

add snapshot tests for memory exhaustion

fix ci extended test (apache#16144)

67a2173

adding support for Min/Max over LargeList and FixedSizeList (apache#1…

5293b70

…6071) * initial Iteration * add Sql Logic tests * tweak comments * unify data, structure tests * Deleted by mistake

Move prepare/parameter handling tests into params.rs (apache#16141)

dc8161e

* Move prepare/parameter handling tests into `params.rs` * Resolve conflicts

Add StateFieldsArgs::return_field (apache#16112)

ce835da

Enrich GroupedHashAggregateStream name to ease debugging Resources ex…

e305353

…hausted errors (apache#16152) * Enrich GroupedHashAggregateStream name to ease debugging Resources exhausted errors * Use human_display * clippy

Minor: Fix links in substrait readme (apache#16156)

2afa3aa

Minor: Fix name() for FilterPushdown physical optimizer rule (apache#…

d4218fd

…16175) * Fix name() for FilterPushdown physical optimizer rule Typo that wasn't caught during review... * fix

migrate tests in pool.rs to use insta (apache#16145)

2add813

fix according to review fix to_string error fix test by stripping backtrace

refactor(optimizer): add .with_schema for defining test tables (apa…

af67caa

…che#16138) Added `tables: HashMap<String, Arc<dyn TableSource>>` and `MyContextProvider::with_schema` method for dynamically defining tables for optimizer integration tests.

[Minor] Speedup TPC-H benchmark run with memtable option (apache#16159)

dacdda2

* Speedup tpch run with memtable * Clippy * Clippy

chore: Reduce repetition in the parameter type inference tests (apach…

3b551e9

…e#16079) * added test * added parameterTest * cargo fmt * Update sql_integration.rs * allow needless_lifetimes * remove needless lifetime * update some tests * move to params.rs

feat: array_length for fixed size list (apache#16167)

605ccbd

* feat: array_length for fixed size list * remove list view

fix: remove trailing whitespace in Display for `LogicalPlan::Projec…

c5df6ee

…tion` (apache#16164)

Improve unproject_sort_expr to handle arbitrary expressions (apache…

16c7939

…#16127) * Add failing test to demonstrate problem * Improve `unproject_sort_expr` to handle arbitrary expressions (apache#83) * Remove redundant return

feat: ADD sha2 spark function (apache#16168)

260a28a

ADD sha2 spark function

Add macro for creating DataFrame (apache#16090) (apache#16104)

db0ab74

* Add macro for creating DataFrame (apache#16090) --------- Co-authored-by: Sergey Zhukov <szhukov@aligntech.com>

migrate logical_plan tests to insta (apache#16184)

68e26f1

* migrate `logical_plan` tests to insta * fix assert error * fix according to review * strip backtrace from internal error * format * format * fix `format("outer_query")` * fix `Internal` error

doc: Move dataframe! example into dedicated example (apache#16197)

aaae4d7

crepererum and others added 27 commits July 1, 2025 12:38

fix: reserved keywords in qualified column names (apache#16584)

e75eb7f

See: - apache#14141 - apache/datafusion-sqlparser-rs#1909

restore topk pre-filtering of batches and make sort query fuzzer less…

9bb309c

… sensitive to expected non determinism (apache#16501)

Add support for Arrow Time types in Substrait (apache#16558)

92f646c

* Add support for Arrow Time types in Substrait This commit adds support for Arrow Time types Time32 and Time64 in Substrait plans. Resolves apache#16296 Resolves apache#16275 * Clean up test

fix: support scalar function nested in get_field in Unparser (apache#…

17f1c9d

…16610) * fix: support scalar function nested in get_field * update * update test * fix bug * update

Support explain tree format debug for benchmark debug (apache#16604)

de79843

* Support explain tree format debug for benchmark debug * fmt * format * Address comments * doc fix

Add microbenchmark for spilling with compression (apache#16512)

25c2a07

* Add microbenchmark for spilling with compression * add wide batch * make num_rows configurable * calculate write/read throughput

Fix parquet filter_pushdown: respect parquet filter pushdown config i…

f03a8fd

…n scan (apache#16646) * respect parquet filter pushdown config in scan * Add test

Update all spark SLT files (apache#16637)

705ea42

Run `cargo test --test sqllogictests -- --complete` and commit the results.

Add PhysicalExpr optimizer and cast unwrapping (apache#16530)

6870cc1

* Add PhysicalExpr optimizer and cast unwrapping * address pr feedback * Update datafusion/pruning/src/pruning_predicate.rs * more lit(Xi64)

benchmark: Support sort_tpch10 for benchmark (apache#16671)

3ca09a6

Fix TopK Sort incorrectly pushed down past operators that do not acce…

06e5bbe

…pt limit pushdown (apache#16641) Co-authored-by: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com>

Convert Option<Vec<sort expression>> to Vec<sort expression> (apache#…

50dc83a

…16615) * Convert Option<Vec<sort expression>> to Vec<sort expression> * clippy * fix comment * fix doc * change back to Expr * remove redundant check

datafusion-cli: Refactor statement execution logic (apache#16634)

1cc67ab

Perf: fast CursorValues compare for StringViewArray using inline_key_… (

0185da6

apache#16630) * Perf: fast CursorValues compare for StringViewArray using inline_key_fast * fix * polish * polish * add test --------- Co-authored-by: Daniël Heres <danielheres@gmail.com>

refactor: shrink SchemaError (apache#16653)

a715173

One step towards apache#16652. Co-authored-by: Oleks V <comphead@users.noreply.github.com> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

rustup version (apache#16663)

aadb79b

Co-authored-by: Daniël Heres <danielheres@gmail.com>

test how merge commit look

c9a93c9

github-actions bot added the Stale label Sep 4, 2025

github-actions bot closed this Sep 11, 2025


20 participants
