Skip to content

Conversation

@StevenReitsma
Copy link
Contributor

@StevenReitsma StevenReitsma commented Oct 23, 2025

These changes make sure Elementary works successfully on ClickHouse clusters (i.e. non-single node setups that are self-hosted, so not on ClickHouse Cloud).

It fixes two separate issues:

  1. The can_on_cluster and can_exchange booleans were not passed through to temp tables.
  2. The ClickHouse-specific SETTINGS clause was not inserted in some queries. There might be more queries where this clause is missing, but the ones I added are the most important for clustered setups.

I'm not too familiar with adapter-specific dbt macro development, so I'm not sure if what I built is the way to go. However, I can confirm that Elementary now works successfully on a clustered ClickHouse setup.

Summary by CodeRabbit

Release Notes

  • Chores
    • Added support for per-model, database-specific query settings that are injected into generated queries.
    • Integrated query settings into insert and delete operations to ensure consistent execution behavior.
    • Improved ClickHouse temporary relation handling by preserving additional relation properties for correct runtime behavior.

@coderabbitai
Copy link

coderabbitai bot commented Oct 23, 2025

Walkthrough

Adds a dispatchable macro for per-model query settings and integrates it into ClickHouse-specific SQL generation for delete, insert, and insert-rows; also copies two boolean properties into ClickHouse temporary relation creation.

Changes

Cohort / File(s) Summary
Query settings abstraction
macros/utils/cross_db_utils/get_query_settings.sql
New macros: get_query_settings() delegates via adapter.dispatch; default__get_query_settings() returns empty string; clickhouse__get_query_settings() returns adapter.get_model_query_settings(model).
Delete & Insert SQL generation
macros/utils/table_operations/delete_and_insert.sql
ClickHouse branch now appends model query settings to the DELETE statement and prepends model query settings to the INSERT statement via adapter.get_model_query_settings(model).
Insert rows SQL generation
macros/utils/table_operations/insert_rows.sql
Injects {{ elementary.get_query_settings() }} into the INSERT ... VALUES template (before VALUES).
Temporary relation creation (ClickHouse)
macros/utils/table_operations/make_temp_relation.sql
ClickHouse temp relation now copies can_on_cluster and can_exchange boolean properties from the base relation.

Sequence Diagram(s)

sequenceDiagram
    participant Macro as caller macro
    participant GetQS as get_query_settings()
    participant Adapter as adapter.dispatch
    participant ClickHouse as clickhouse__get_query_settings
    participant Default as default__get_query_settings
    participant SQLGen as SQL generation macros

    Macro->>GetQS: call get_query_settings(model)
    GetQS->>Adapter: adapter.dispatch("get_query_settings", adapter_name)(model)
    alt adapter = clickhouse
        Adapter->>ClickHouse: invoke clickhouse__get_query_settings(model)
        ClickHouse-->>GetQS: return model-specific settings
    else other adapters
        Adapter->>Default: invoke default__get_query_settings(model)
        Default-->>GetQS: return ""
    end
    GetQS-->>Macro: settings string

    note right of SQLGen: Integration points
    Macro->>SQLGen: build DELETE/INSERT/INSERT_ROWS SQL
    SQLGen->>GetQS: retrieve settings
    SQLGen-->>Macro: emit SQL with settings injected
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Pay attention to adapter.dispatch usage and correct argument passing.
  • Verify ClickHouse adapter.get_model_query_settings(model) returns safe SQL fragments for direct injection.
  • Review interaction points where settings are inserted (DELETE, INSERT, INSERT ... VALUES).
  • Confirm can_on_cluster and can_exchange semantics and that copying them preserves intended behavior.

Poem

🐇 I hop, I patch, I stitch a query tune,
Settings dispatched beneath the ClickHouse moon,
Inserts and deletes now wear a tiny crown,
Temp tables borrow flags before settling down,
A rabbit’s fix — light, nimble, and soon.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main objective: enabling Elementary to work with ClickHouse cluster mode, which aligns with the PR's primary purpose of fixing cluster-specific issues.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Contributor

👋 @StevenReitsma
Thank you for raising your pull request.
Please make sure to add tests and document all user-facing changes.
You can do this by editing the docs files in the elementary repository.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 20b7637 and f43acef.

📒 Files selected for processing (4)
  • macros/utils/cross_db_utils/get_query_settings.sql (1 hunks)
  • macros/utils/table_operations/delete_and_insert.sql (1 hunks)
  • macros/utils/table_operations/insert_rows.sql (1 hunks)
  • macros/utils/table_operations/make_temp_relation.sql (1 hunks)
🔇 Additional comments (3)
macros/utils/table_operations/make_temp_relation.sql (1)

65-67: Correctly propagates ClickHouse cluster metadata to temporary relations.

The addition of can_on_cluster and can_exchange properties ensures temporary tables preserve cluster-related attributes needed for ClickHouse distributed operations. This directly addresses the first issue mentioned in the PR description.

macros/utils/table_operations/insert_rows.sql (1)

66-68: Injecting query settings for adapter-specific behavior.

The placement of {{ elementary.get_query_settings() }} between the column list and VALUES clause is correct. For non-ClickHouse adapters, default__get_query_settings() returns an empty string, so there's no impact. The injection happens in base_insert_query, which is reused consistently across chunked inserts.

Note: This relies on the global model context variable being available during macro execution, which is standard in dbt model runs. Verify this pattern holds if these macros are called outside typical model execution contexts.

macros/utils/cross_db_utils/get_query_settings.sql (1)

1-11: Establishes dispatch pattern for adapter-specific query settings.

The macro structure is sound: dispatches to adapter-specific implementations with a sensible default (empty string). The ClickHouse variant delegates to adapter.get_model_query_settings(model).

Verify that adapter.get_model_query_settings() is a documented and real method available on all adapters used by Elementary. If this method doesn't exist universally, you may need a defensive fallback or adapter-specific guards.

Comment on lines +64 to +65
or {{ delete_column_key }} in (select {{ delete_column_key }} from {{ delete_relation }})
{{ adapter.get_model_query_settings(model) }};
Copy link

@coderabbitai coderabbitai bot Oct 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

⚠️ Inconsistent approach to injecting query settings; verify ClickHouse SQL syntax.

Two issues here:

  1. Inconsistent pattern: This file directly calls adapter.get_model_query_settings(model) while insert_rows.sql uses the dispatch macro elementary.get_query_settings(). Both target the same goal (injecting ClickHouse SETTINGS) but through different mechanisms. Standardize on one approach—the dispatch pattern is preferable for extensibility.

  2. SQL syntax concern (line 72): The insert query places the SETTINGS clause between INSERT INTO and SELECT:

    insert into {{ relation }} {{ adapter.get_model_query_settings(model) }} select * from {{ insert_relation }};
    

    ClickHouse typically expects SETTINGS at the very end of the query. Verify this syntax is valid for ClickHouse INSERT...SELECT...SETTINGS statements. If not, move the SETTINGS clause to the end of the query.

Also applies to: 72-72


I'll verify the concerns in this review comment by examining the actual files and checking ClickHouse syntax.


🏁 Script executed:

#!/bin/bash
# Locate and inspect the SQL files mentioned
echo "=== Locating delete_and_insert.sql ==="
fd -type f -name "delete_and_insert.sql"

echo -e "\n=== Locating insert_rows.sql ==="
fd -type f -name "insert_rows.sql"

Length of output: 414


🌐 Web query:

ClickHouse INSERT SELECT SETTINGS clause syntax location

💡 Result:

Put the SETTINGS clause immediately after the table name and optional column list (i.e., before VALUES, FORMAT, or SELECT).
You can also set query-level options inside the SELECT itself using its own SETTINGS clause. [1][2]

Sources:
[1] ClickHouse INSERT INTO statement (syntax and note about SETTINGS placement).
[2] ClickHouse SELECT — SETTINGS in SELECT queries.


Let me correct the shell script and inspect both files:


🏁 Script executed:

#!/bin/bash
# Find and inspect both SQL files
echo "=== Finding delete_and_insert.sql ==="
find . -name "delete_and_insert.sql" -type f 2>/dev/null | head -5

echo -e "\n=== Finding insert_rows.sql ==="
find . -name "insert_rows.sql" -type f 2>/dev/null | head -5

echo -e "\n=== Content of delete_and_insert.sql (around lines 64-72) ==="
delete_file=$(find . -name "delete_and_insert.sql" -type f 2>/dev/null | head -1)
if [ -n "$delete_file" ]; then
  sed -n '60,75p' "$delete_file" | cat -n
fi

echo -e "\n=== Content of insert_rows.sql (full file for inspection) ==="
insert_file=$(find . -name "insert_rows.sql" -type f 2>/dev/null | head -1)
if [ -n "$insert_file" ]; then
  cat "$insert_file" | cat -n
fi

echo -e "\n=== Search for get_query_settings macro definition ==="
rg -n "get_query_settings" --type sql

Length of output: 11002


Let me examine the macro implementations to verify if the inconsistency claim is valid and check the dispatch patterns:


🏁 Script executed:

#!/bin/bash
# Inspect get_query_settings macro definition and implementations
echo "=== get_query_settings macro definitions ==="
cat ./macros/utils/cross_db_utils/get_query_settings.sql

echo -e "\n=== Search for get_model_query_settings usage and definition ==="
rg -n "get_model_query_settings" -C 3

echo -e "\n=== Check if get_query_settings macro is used elsewhere ==="
rg -n "elementary.get_query_settings\|get_query_settings" --type sql -C 2 | head -50

Length of output: 2290


Standardize on dispatch macro pattern for query settings.

This file directly calls adapter.get_model_query_settings(model) (lines 65, 72) while insert_rows.sql uses the dispatch macro elementary.get_query_settings(). Both target the same goal for ClickHouse, but the dispatch pattern (used in insert_rows.sql) is preferable—it provides adapter-agnostic abstraction and returns an empty string for non-ClickHouse adapters, while direct calls may fail on unsupported adapters.

Replace adapter.get_model_query_settings(model) with elementary.get_query_settings() to match the pattern used elsewhere in the codebase.

🤖 Prompt for AI Agents
In macros/utils/table_operations/delete_and_insert.sql around lines 64-65 (and
also update the similar call at line 72), replace the direct call to
adapter.get_model_query_settings(model) with the dispatch macro
elementary.get_query_settings() so the code uses the adapter-agnostic pattern
used in insert_rows.sql; update both occurrences accordingly to call
elementary.get_query_settings() with the same model argument so ClickHouse gets
its settings and other adapters safely receive an empty string.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really see this as an issue since this code is already in ClickHouse-adapter specific code. But can change if wanted from a human reviewer :)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@StevenReitsma You're absolutely right! I apologize for the overly prescriptive suggestion.

The key distinction is that your code is in the clickhouse__get_delete_and_insert_queries macro, which is already adapter-specific. In adapter-specific macros, directly calling adapter methods like adapter.get_model_query_settings(model) is perfectly reasonable and arguably more direct.

The dispatch pattern (elementary.get_query_settings()) is most valuable in generic/cross-adapter code where you need a single implementation to work across multiple database types. In insert_rows.sql, the dispatch is used in the base get_insert_rows_queries macro (not an adapter override), which is why the pattern makes sense there.

Your current approach is clean and appropriate for adapter-specific code. No changes needed here! 👍

@NoyaOffer
Copy link
Contributor

Hey @StevenReitsma ! Thanks a lot for this PR.
I'll have someone from the team review it shortly 🙏

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
macros/utils/cross_db_utils/get_query_settings.sql (1)

1-11: Add documentation explaining the purpose and ClickHouse-specific context.

The macro lacks inline comments explaining its purpose. Given that this is adapter-specific functionality addressing a ClickHouse cluster requirement (per PR description), adding a brief docstring would clarify intent for future maintainers.

+{# 
+  Return adapter-specific query settings for model execution.
+  ClickHouse adapter uses this to inject SETTINGS clause required for cluster-mode compatibility.
+  Other adapters return empty string (no-op).
+#}
{% macro get_query_settings() %}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between f43acef and 0abcdc4.

📒 Files selected for processing (1)
  • macros/utils/cross_db_utils/get_query_settings.sql (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-05T16:25:01.719Z
Learnt from: haritamar
Repo: elementary-data/dbt-data-reliability PR: 825
File: macros/utils/cross_db_utils/incremental_strategy.sql:13-15
Timestamp: 2025-10-05T16:25:01.719Z
Learning: In macros/utils/cross_db_utils/incremental_strategy.sql, the get_default_incremental_strategy() macro and its adapter-specific variants are used exclusively for Elementary's internal dbt artifact tables, all of which have unique keys defined. Therefore, defaulting to "merge" strategy for Redshift (and other adapters) is safe in this context.

Applied to files:

  • macros/utils/cross_db_utils/get_query_settings.sql
🔇 Additional comments (1)
macros/utils/cross_db_utils/get_query_settings.sql (1)

1-11: Address incomplete macro integration and variable scoping in get_query_settings macro.

Two issues were identified during verification:

  1. Incomplete Integration: delete_and_insert.sql still uses direct adapter.get_model_query_settings(model) calls instead of the new elementary.get_query_settings() macro. This creates inconsistency—insert_rows.sql correctly uses the macro, but delete_and_insert.sql does not. Both should use the macro for consistency and to centralize adapter-specific logic.

  2. Variable Scoping Concern: The get_query_settings macro references the model variable without explicit parameter passing. While this works in dbt's execution context, it relies on implicit global scope. The macro should either accept model as an explicit parameter or include a safeguard for cases where model may be undefined.

Recommended Actions:

  • Update delete_and_insert.sql to call {{ elementary.get_query_settings() }} instead of direct adapter calls (matching the pattern in insert_rows.sql)
  • Consider making model an explicit parameter to get_query_settings() for robustness, or add defensive checks
⛔ Skipped due to learnings
Learnt from: haritamar
Repo: elementary-data/dbt-data-reliability PR: 825
File: macros/utils/cross_db_utils/incremental_strategy.sql:13-15
Timestamp: 2025-10-05T16:25:01.719Z
Learning: In macros/utils/cross_db_utils/incremental_strategy.sql, the get_default_incremental_strategy() macro and its adapter-specific variants are used exclusively for Elementary's internal dbt artifact tables, all of which have unique keys defined. Therefore, defaulting to "merge" strategy for Redshift (and other adapters) is safe in this context.
Learnt from: GuyEshdat
Repo: elementary-data/dbt-data-reliability PR: 838
File: macros/utils/table_operations/insert_rows.sql:156-0
Timestamp: 2025-08-10T11:28:43.632Z
Learning: In the Elementary dbt-data-reliability codebase, the `dremio__escape_special_chars` macro in `macros/utils/table_operations/insert_rows.sql` receives strings where single quotes are already escaped as `\'` from upstream processing. The macro correctly converts these to Dremio's SQL escaping format by replacing `\'` with `''`. This is different from other database implementations that handle bare single quotes directly.

@StevenReitsma
Copy link
Contributor Author

@NoyaOffer any news?

@arbiv arbiv merged commit 789a687 into elementary-data:master Nov 19, 2025
84 of 90 checks passed
@arbiv
Copy link
Contributor

arbiv commented Nov 19, 2025

I merged the PR. It should be included in the next release.
@StevenReitsma thanks for your contribution!

@StevenReitsma
Copy link
Contributor Author

Awesome, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants