forked from apache/datafusion
test 2 #2
Closed
* doc: fix indent format explain * update
…pache#16058) * Add test generated from schema in Comet. * Checkpoint DFS. * Checkpoint with working transformation. * fmt, clippy fixes. * Remove maximum stack depth. * More testing. * Improve tests. * Improve docs. * Use a smaller HashSet instead of HashMap with every field in it. More docs. * Use a smaller HashSet instead of HashMap with every field in it. More docs. * More docs. * More docs. * Fix typo. * Refactor match with nested if lets to make it more readable. * Address some PR feedback. * Rename variables in struct processing to address PR feedback. Do List next. * Rename variables in list processing to address PR feedback. * Update docs. * Simplify list parquet path generation. * Map support. * Remove old TODO. * Reduce redundant docs by referring to docs above. * Reduce redundant docs by referring to docs above. * Add parquet file generated from CometFuzzTestSuite ParquetGenerator (similar to schema in file_format tests) to exercise end-to-end support. * Fix clippy.
…pache#16100) * Update documentation for `datafusion.execution.collect_statistics` setting * Update test * Update datafusion/common/src/config.rs Co-authored-by: Leonardo Yvens <leoyvens@gmail.com> * update docs * Update doc --------- Co-authored-by: Leonardo Yvens <leoyvens@gmail.com>
* handle coercion for Float16 types * Add some basic slt tests --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* chore(deps): bump testcontainers from 0.23.3 to 0.24.0 Bumps [testcontainers](https://github.com/testcontainers/testcontainers-rs) from 0.23.3 to 0.24.0. - [Release notes](https://github.com/testcontainers/testcontainers-rs/releases) - [Changelog](https://github.com/testcontainers/testcontainers-rs/blob/main/CHANGELOG.md) - [Commits](testcontainers/testcontainers-rs@0.23.3...0.24.0) --- updated-dependencies: - dependency-name: testcontainers dependency-version: 0.24.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * Update test_containers_modules too --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
…ree (apache#16097) * feat: make error handling in indent consistent with that in tree * update test * return all plans instead of throwing err * update test
* Support GroupsAccumulator for avg duration * update test --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* Move PruningStatistics into datafusion::common * fix doc * remove new code * fmt
* wip * comment * Update datafusion/core/src/datasource/physical_plan/parquet.rs * remove prints * better test * fmt
…fig (apache#16080) * fix * add a test * fmt * add to upgrade guide * fix tests * fix test * fix test * fix ci * Fix example in upgrade guide (apache#29) --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* feat: escape quote wrap identifiers in describe rm: dev files fmt: final formatting sed: s/<comment>// * fix: use ident instead of col + format
* Update documentation about DDL and DML * Improve the DML Documentation * Apply suggestions from code review Co-authored-by: Oleks V <comphead@users.noreply.github.com> * Fix docs * Fix docs --------- Co-authored-by: Oleks V <comphead@users.noreply.github.com>
* Optimize performance of string::ascii function * Add benchmark with NULL_DENSITY=0 --------- Co-authored-by: Tai Le Manh <tailm2@vingroup.net> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* chore: Use pre created data for filter pushdown tests * chore: Use pre created data for filter pushdown tests
* chore: Upgrade `rand` crate and some other minor crates --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
…pache#16019) * draft commit to roll back changes on function naming and include prepare clause on the infer types tests * include data types in plan when it is not included in the prepare statement * fix: prepare statement error * Update datafusion/sql/src/statement.rs Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * remove infer types from prepare statement the infer data type changes in statement will be introduced in a new PR * fix to show correct output message * include data types on logical plans of prepare statements without explicit type declaration * fix using clippy suggestions * explicitly get the data types using the placeholder id to avoid sorting * Restore the original tests too * update set data type routine to be more rust idiomatic Co-authored-by: Tommy shu <qstommyshu@gmail.com> * update set datatype routine * fix formatting in sql_integration --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: Tommy shu <qstommyshu@gmail.com>
…ache#16119) * minor fixes to arch docs Co-authored-by: Oleks V <comphead@users.noreply.github.com> --------- Co-authored-by: Oleks V <comphead@users.noreply.github.com>
add snapshot tests for memory exhaustion
…s & nested Column expressions in maybe_fix_physical_column_name (apache#16064) * Fix union schema name coercion * Address renaming for columns that are not in the top level as well * Add unit test * Format * Use insta tests properly * Address review - comment + minor simplification change --------- Co-authored-by: Berkay Şahin <124376117+berkaysynnada@users.noreply.github.com>
…6071) * initial Iteration * add Sql Logic tests * tweak comments * unify data, structure tests * Deleted by mistake
* Move prepare/parameter handling tests into `params.rs` * Resolve conflicts
…pache#16029) * Support filtering specific sqllogictests identified by line number * Add license header * Try parsing in different dialects * Add test filtering example to README.md * Improve Filter doc comment * Factor out statement_is_skippable into its own function * Add example about how filters work in the doc comments
…e#16488) * feat: Finalize support for `RightMark` join * Update utils.rs * add `join_selection` tests * fmt * Update join_selection.rs --------- Co-authored-by: Oleks V <comphead@users.noreply.github.com>
Bumps [indexmap](https://github.com/indexmap-rs/indexmap) from 2.9.0 to 2.10.0. - [Changelog](https://github.com/indexmap-rs/indexmap/blob/main/RELEASES.md) - [Commits](indexmap-rs/indexmap@2.9.0...2.10.0) --- updated-dependencies: - dependency-name: indexmap dependency-version: 2.10.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…re are no dynamic filters (apache#16424)
* Column indices were not computed correctly, causing a panic * Add unit tests
…` table functions (apache#16552) * init * trait based * also support date * explain
* Fix Column mgmt when parsing USING joins.
In SqlToRel::parse_join(), when handling JoinConstraint::Using, the
identifiers are normalized using IdentNormalizer::normalize().
That normalization lower-cases unquoted identifiers, and keeps the case
otherwise (but not the quotes).
Until this commit, the normalized column names were passed to
LogicalPlanBuilder::join_using() as strings. When each goes through
LogicalPlanBuilder::normalize(), Column::From<String>() is called,
leading to Column::from_qualified_name(). As it gets an unqualified
column, it lower-cases it.
This means that if a join is USING("SOME_COLUMN_NAME"), we end up with a
Column { name: "some_column_name", ..}. In the end, the join fails, as
that lower-case column does not exist.
With this commit, SqlToRel::parse_join() calls Column::from_name() on
each normalized column and passes those to
LogicalPlanBuilder::join_using(). Downstream, in
LogicalPlanBuilder::normalize(), there is no need to create the Column
objects from strings, and the bug does not happen.
This fixes apache#16120.
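
The double normalization described in this commit message can be illustrated with a small, self-contained model. This is only a sketch: the `Column` struct, `normalize_ident`, `using_key_before_fix`, and `using_key_after_fix` below are simplified, hypothetical stand-ins written for illustration, not the DataFusion types or functions themselves. The only behavior modeled is the case handling the message describes.

```rust
// Hypothetical, simplified stand-ins for the types named in the commit
// message. They model only the case handling, not the real DataFusion code.

#[derive(Debug, Clone, PartialEq)]
struct Column {
    name: String,
}

impl Column {
    /// Models `Column::from_name`: keeps the name exactly as given.
    fn from_name(name: impl Into<String>) -> Self {
        Column { name: name.into() }
    }
}

/// Models the `From<String>` conversion described above: the string is
/// treated as an unquoted identifier, so it is lower-cased.
impl From<String> for Column {
    fn from(s: String) -> Self {
        Column { name: s.to_lowercase() }
    }
}

/// Models identifier normalization: unquoted identifiers are lower-cased,
/// quoted identifiers keep their case but lose the quotes.
fn normalize_ident(raw: &str) -> String {
    match raw.strip_prefix('"').and_then(|s| s.strip_suffix('"')) {
        Some(inner) => inner.to_string(),
        None => raw.to_lowercase(),
    }
}

/// Old path: the normalized name travels as a `String` and is converted
/// into a `Column` later, which lower-cases it a second time.
fn using_key_before_fix(raw: &str) -> Column {
    Column::from(normalize_ident(raw))
}

/// New path: the normalized name is wrapped in a `Column` right away,
/// so no further case folding happens downstream.
fn using_key_after_fix(raw: &str) -> Column {
    Column::from_name(normalize_ident(raw))
}
```

In this model, a quoted key such as `"SOME_COLUMN_NAME"` loses its case on the old path and keeps it on the new path, while unquoted keys are lower-cased on both.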
* Remove genericity from LogicalPlanBuilder::join_using().
Until this commit, LogicalPlanBuilder::join_using() accepted using_keys:
Vec<impl Into<Column> + Clone>.
This commit removes this, only allowing Vec<Column>.
Motivation: passing e.g. Vec<String> for using_keys is bug-prone, as the
case of a String can be changed when it is converted into a Column. That
conversion is acceptable for an ordinary column name that can be qualified,
but some column names cannot be qualified (e.g. USING keys).
This commit changes the API. However, potential users can trivially fix
their code by calling Column::from/from_qualified_name on their
using_keys. This forces them to think about what their identifiers
represent, which removes a class of potential bugs.
Additional bonus: shorter compilation time & binary size.
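
The effect of narrowing the parameter type can be sketched with a toy model. Everything below is an assumption made for illustration: `Column`, `join_using_generic`, and `join_using_concrete` are hypothetical stand-ins that mirror only the parameter shapes quoted in the commit message (`Vec<impl Into<Column> + Clone>` versus `Vec<Column>`), and the lossy `From<&str>` conversion is a simplification of the behavior described there.

```rust
// Hypothetical model of the API change; not the real LogicalPlanBuilder.

#[derive(Debug, Clone, PartialEq)]
struct Column {
    name: String,
}

impl Column {
    /// Models `Column::from_name`: the name is kept verbatim.
    fn from_name(name: impl Into<String>) -> Self {
        Column { name: name.into() }
    }
}

/// Models a lossy string conversion that lower-cases the name.
impl From<&str> for Column {
    fn from(s: &str) -> Self {
        Column { name: s.to_lowercase() }
    }
}

/// Old shape: any `Into<Column>` is accepted, so the lossy conversion
/// happens implicitly inside the callee.
fn join_using_generic(using_keys: Vec<impl Into<Column> + Clone>) -> Vec<Column> {
    using_keys.into_iter().map(Into::into).collect()
}

/// New shape: only `Vec<Column>` is accepted, so the caller has to choose
/// a constructor explicitly and the callee performs no conversion.
fn join_using_concrete(using_keys: Vec<Column>) -> Vec<Column> {
    using_keys
}
```

With the concrete signature, a call such as `join_using_concrete(vec![Column::from_name("SOME_COLUMN_NAME")])` makes the choice of constructor visible at the call site. The function is also no longer generic, so it is compiled once instead of once per argument type, which is consistent with the compile-time and binary-size remark above.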
---------
Co-authored-by: Bruno Cauet <bruno.cauet@qube-rt.com>
* fix: reject within_group for non ordered aggregate function * update error * support within
apache#16488)" (apache#16597) This reverts commit d73f0e8.
* Initial commit to form PR for datafusion encryption support * Add tests for encryption configuration * Apply cargo fmt * Add a roundtrip encryption test to the parquet tests. * cargo fmt * Update test to add decryption parameter to called functions. * Try to get DataFrame.write_parquet to work with encryption. Doesn't quite, column encryption is broken. * Update datafusion/datasource-parquet/src/opener.rs Co-authored-by: Adam Reeve <adreeve@gmail.com> * Update datafusion/datasource-parquet/src/source.rs Co-authored-by: Adam Reeve <adreeve@gmail.com> * Fix write test in parquet.rs * Simplify encryption test. Remove unused imports. * Run cargo fmt. * Further streamline roundtrip test. * Change From methods for FileEncryptionProperties and FileDecryptionProperties to use references. * Change encryption config to directly hold column keys using custom config fields. * Fix generated field names in visit for encryptor and decryptor to use "." instead of "::" * 1. Disable parallel writes with encryption. 2. Fixed unused header warning in config.rs. 3. Fix test case in encryption.rs to call conversion to ConfigFileDecryption properties correctly. * cargo fmt * Update datafusion/common/src/file_options/parquet_writer.rs Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix variables shown in information schema test. * Backout bad suggestion from copilot * Remove unused serde reference Add an example to read and write encrypted parquet files. * cargo fmt * change file_format.rs to use global encryption options in struct. * Turn off page_index for encrypted example. Get encrypted example working with filter. * Tidy up example output. * Add missing license. Run taplo format * Update configs.md by running dev/update_config_docs.sh * Cargo fmt + clippy changes. * Add filter test for encrypted files. * Cargo clippy changes. * Fix link in README.md * Add issue tag for parallel writes.
* Move file encryption and decryption properties out of global options * Use config_namespace_with_hashmap for column encryption/decryption props * Remove outdated docs on crypto settings. Signed-off-by: Corwin Joy <corwin.joy@gmail.com> * 1. Add docs for using encryption configuration. 2. Add example SQL for using encryption from CLI. 3. Fix removed variables in test for configuration information. 4. Clippy and cargo fmt. Signed-off-by: Corwin Joy <corwin.joy@gmail.com> * Update code to add missing ParquetOpener parameter due to merge from main Signed-off-by: Corwin Joy <corwin.joy@gmail.com> * Add CLI documentation for Parquet options and provide an encryption example Signed-off-by: Corwin Joy <corwin.joy@gmail.com> * Use ConfigFileDecryptionProperties in ParquetReadOptions Signed-off-by: Adam Reeve <adam.reeve@gr-oss.io> * Implement default for ConfigFileEncryptionProperties Signed-off-by: Corwin Joy <corwin.joy@gmail.com> * Add sqllogictest for parquet with encryption Signed-off-by: Corwin Joy <corwin.joy@gmail.com> * Apply prettier changes from CI Signed-off-by: Corwin Joy <corwin.joy@gmail.com> * logical conflict * fix another logical conflict --------- Signed-off-by: Corwin Joy <corwin.joy@gmail.com> Signed-off-by: Adam Reeve <adam.reeve@gr-oss.io> Co-authored-by: Adam Reeve <adreeve@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Adam Reeve <adam.reeve@gr-oss.io> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* Allow usage of table functions in relations * Rebase
* Update to arrow/parquet 55.2.0 Update to released version * Update plans
Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.
Which issue does this PR close?
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?