Skip to content

Conversation

@StevenReitsma
Copy link
Contributor

@StevenReitsma StevenReitsma commented Dec 16, 2025

This is a continuation of #878. The query settings were missing for a couple of more queries, which is relevant for the ClickHouse platform in some situations. Hopefully this now covers all possible use-cases.

Also, there was a bug where temporary tables were being created on platforms that don't support it (like ClickHouse). This PR fixes that.

Finally, we add on_cluster_clause in ClickHouse specific code, which is necessary for clustered self-hosted ClickHouse installations.

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Fixed temporary table creation in environments where they are not supported
  • Improvements

    • Enhanced cluster-aware operations for distributed database environments
    • Optimized query execution settings for data operations and tests

✏️ Tip: You can customize this high-level summary in your review settings.

@github-actions
Copy link
Contributor

👋 @StevenReitsma
Thank you for raising your pull request.
Please make sure to add tests and document all user-facing changes.
You can do this by editing the docs files in the elementary repository.

@coderabbitai
Copy link

coderabbitai bot commented Dec 16, 2025

Walkthrough

These changes enhance database macro operations by adding query settings support, implementing cluster-aware ClickHouse operations, enforcing temporary table compatibility checks, and restructuring table cleanup queries for explicit control flow across multiple SQL macros.

Changes

Cohort / File(s) Summary
Query Settings Integration
macros/edr/tests/on_run_end/handle_tests_results.sql, macros/utils/table_operations/insert_as_select.sql
Adds get_query_settings() calls to INSERT-SELECT and insert-as-select operations to augment generated statements with query settings configuration.
ClickHouse Cluster-Aware Operations
macros/utils/table_operations/delete_and_insert.sql, macros/edr/tests/test_utils/clean_elementary_test_tables.sql
Injects cluster-aware clauses (on_cluster_clause()) into delete and drop operations to ensure multi-node cluster execution; replaces delegated call with explicit DROP TABLE IF EXISTS loop.
Temporary Table Support Guard
macros/utils/table_operations/create_table_as.sql
Adds conditional check to force temporary flag to false when has_temp_table_support() returns false, preventing temporary table creation in unsupported environments.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Review focus areas:
    • Verify that get_query_settings() integration does not conflict with existing query optimization logic
    • Validate on_cluster_clause() syntax and placement in delete/drop statements
    • Confirm temporary table guard logic integrates correctly with existing environment checks
    • Test cluster operations on multi-node ClickHouse deployments

Poem

🐰 Hop, hop! The macros now dance in sync,
With queries adorned in settings so fine,
Clusters unite with a mystical link,
While temp tables bow when not blessed by design,
Elementary magic, now perfectly aligned!

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Add ClickHouse cluster support, part 2' is partially related to the changeset. While it accurately reflects the cluster-related changes (on_cluster_clause additions), it omits other significant changes like adding query settings and fixing temporary table creation bugs.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 8d91cf6 and 3d1a83a.

📒 Files selected for processing (5)
  • macros/edr/tests/on_run_end/handle_tests_results.sql (2 hunks)
  • macros/edr/tests/test_utils/clean_elementary_test_tables.sql (1 hunks)
  • macros/utils/table_operations/create_table_as.sql (1 hunks)
  • macros/utils/table_operations/delete_and_insert.sql (1 hunks)
  • macros/utils/table_operations/insert_as_select.sql (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-07-22T15:41:57.338Z
Learnt from: haritamar
Repo: elementary-data/dbt-data-reliability PR: 825
File: macros/utils/table_operations/create_table_as.sql:0-0
Timestamp: 2025-07-22T15:41:57.338Z
Learning: The dbt.create_table_as macro breaks in dbt-fusion when not run from a materialization context, requiring custom implementations for table creation functionality in dbt-fusion compatible packages.

Applied to files:

  • macros/utils/table_operations/create_table_as.sql
🔇 Additional comments (7)
macros/edr/tests/on_run_end/handle_tests_results.sql (4)

138-138: LGTM!

The addition of get_query_settings() is consistent with the pattern applied throughout this PR and properly placed in the INSERT statement.


145-147: Good defensive cleanup logic.

The explicit cleanup of temporary relations when has_temp_table_support() returns false ensures that no orphaned tables remain on platforms that don't support native temporary tables. This complements the guard added in create_table_as.sql.


187-187: LGTM!

Consistent application of get_query_settings() in the second INSERT statement.


194-196: Consistent cleanup pattern applied.

The same defensive cleanup logic is properly replicated here, ensuring consistency across both insert operations.

macros/utils/table_operations/create_table_as.sql (1)

11-13: Fix: Correct the platform examples for temporary table support limitations.

This guard correctly prevents temporary table creation on unsupported platforms. However, update the comment: ClickHouse actually supports temporary tables. The platforms that lack temporary table support are Spark, Trino, Athena, and Dremio. The placement before edr_get_create_table_as_sql ensures all adapter-specific implementations receive the corrected flag.

Likely an incorrect or invalid review comment.

macros/edr/tests/test_utils/clean_elementary_test_tables.sql (1)

38-45: Undefined macro will cause runtime failure.

The code calls on_cluster_clause(test_relation) which is not defined in this repository. This macro is provided by the dbt-clickhouse adapter, but it is not imported as a package dependency. This will result in a "Unknown macro" error at runtime when the ClickHouse adapter path is executed. Either add dbt-clickhouse as a dependency or define the macro locally.

⛔ Skipped due to learnings
Learnt from: haritamar
Repo: elementary-data/dbt-data-reliability PR: 825
File: macros/utils/table_operations/create_table_as.sql:0-0
Timestamp: 2025-07-22T15:41:57.338Z
Learning: The dbt.create_table_as macro breaks in dbt-fusion when not run from a materialization context, requiring custom implementations for table creation functionality in dbt-fusion compatible packages.
Learnt from: haritamar
Repo: elementary-data/dbt-data-reliability PR: 825
File: macros/edr/tests/test_dimension_anomalies.sql:73-75
Timestamp: 2025-09-04T09:14:40.621Z
Learning: In the dbt-data-reliability codebase, `flatten_model` is a valid macro defined in `macros/edr/dbt_artifacts/upload_dbt_models.sql` and can be used as a parameter to `elementary.get_anomaly_query()` even though other test files use the `flattened_test` variable instead.
macros/utils/table_operations/delete_and_insert.sql (1)

62-62: Verify on_cluster_clause() implementation for cluster-aware operations.

The on_cluster_clause(relation) macro from dbt-clickhouse correctly enables cluster-aware delete operations for self-hosted ClickHouse installations. The macro returns an empty string for non-clustered setups, generating valid syntax in both cases: ALTER TABLE table_name DELETE for single-node and ALTER TABLE table_name ON CLUSTER 'cluster_name' DELETE for clustered deployments.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@StevenReitsma StevenReitsma changed the title Don't create temporary table if adapter has no support and add missing query settings Add ClickHouse cluster support, part 2 Dec 16, 2025
@arbiv
Copy link
Contributor

arbiv commented Dec 30, 2025

@StevenReitsma, thanks for opening the PR! It seems that there is an issue with the Databricks tests. Could you take a look?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants