Skip to content

Conversation

@jc00ke
Copy link

@jc00ke jc00ke commented Nov 17, 2025

Hi there!

I spoke with the team at Coalesce about adding Vertica support. This is my initial attempt, and TBH, it's my initial foray into dbt packages, so I appreciate any and all help provided! 🙏

Summary by CodeRabbit

  • New Features

    • Official support for Vertica as a warehouse type, plus Vertica-specific SQL utilities for time arithmetic and safe string escaping.
  • Tests

    • CI matrix and workflows updated to include Vertica, a dbt-Vertica version input, and Vertica-specific environment/schema handling and startup steps.
  • Chores

    • Added local Vertica integration test service and new Vertica fixture profiles; updated gitignore for fixture user file.
  • Style

    • Minor whitespace trimming in template macros.

jc00ke added 4 commits October 8, 2025 16:59
Without trimming the leading and trailing newlines, Vertica would fail to parse the
compiled SQL. For example, `models/edr/dbt_artifacts/dbt_columns`
compiles the following SQL, via `elementary.get_dbt_columns_empty_table_query`, `empty_table` and `empty_column`:

```sql

select * from (
            select

        cast('dummy_string' as varchar(4096)) as unique_id

,

        cast('dummy_string' as varchar(4096)) as parent_unique_id

,

        cast('dummy_string' as varchar(4096)) as name

,

        cast('dummy_string' as varchar(4096)) as data_type

,

        cast('this_is_just_a_long_dummy_string' as varchar(4096)) as tags

,

        cast('this_is_just_a_long_dummy_string' as varchar(4096)) as meta

,

        cast('dummy_string' as varchar(4096)) as database_name

,

        cast('dummy_string' as varchar(4096)) as schema_name

,

        cast('dummy_string' as varchar(4096)) as table_name

,

        cast('this_is_just_a_long_dummy_string' as varchar(4096)) as description

,

        cast('dummy_string' as varchar(4096)) as resource_type

,

        cast('dummy_string' as varchar(4096)) as generated_at

,

        cast('dummy_string' as varchar(4096)) as metadata_hash

        ) as empty_table
        where 1 = 0
```

which would cause

```
SQL Error [4856] [42601]: [Vertica][VJDBC](4856) ERROR: Syntax error at or near ")" at character 1
```

By trimming the newlines, the SQL is much tighter:

```sql

select * from (
            select
        cast('dummy_string' as varchar(4096)) as unique_id,
        cast('dummy_string' as varchar(4096)) as parent_unique_id,
        cast('dummy_string' as varchar(4096)) as name,
        cast('dummy_string' as varchar(4096)) as data_type,
        cast('this_is_just_a_long_dummy_string' as varchar(4096)) as tags,
        cast('this_is_just_a_long_dummy_string' as varchar(4096)) as meta,
        cast('dummy_string' as varchar(4096)) as database_name,
        cast('dummy_string' as varchar(4096)) as schema_name,
        cast('dummy_string' as varchar(4096)) as table_name,
        cast('this_is_just_a_long_dummy_string' as varchar(4096)) as description,
        cast('dummy_string' as varchar(4096)) as resource_type,
        cast('dummy_string' as varchar(4096)) as generated_at,
        cast('dummy_string' as varchar(4096)) as metadata_hash
        ) as empty_table
        where 1 = 0
```

and this runs in Vertica just fine.
This fixed 4 or 5 errors when running in my test project.
@coderabbitai
Copy link

coderabbitai bot commented Nov 17, 2025

Walkthrough

Adds Vertica support across CI and tests: workflow matrix and inputs, conditional dbt-vertica install and Vertica startup steps, a docker-compose file for Vertica, two Vertica-specific macros (timeadd and escape_special_chars), and minor Jinja whitespace tightening.

Changes

Cohort / File(s) Change Summary
GitHub workflows
\.github/workflows/test-all-warehouses.yml, \.github/workflows/test-warehouse.yml
Add vertica to the warehouse matrix and workflow triggers; add dbt-vertica-version input; conditionally install dbt-vertica when warehouse-type == 'vertica' and exclude vertica from the generic install; set/export SCHEMA_NAME and substitute it into profiles; add Vertica env setup, Start/Wait Vertica steps using docker-compose-vertica.yml.
Integration tests / Docker Compose
integration_tests/docker-compose-vertica.yml
Add vertica service using image ghcr.io/ratiopbc/vertica-ce with connection and app env vars, ports, vertica-data volume, ulimits, container name, and a vsql healthcheck.
Vertica SQL macros
macros/utils/cross_db_utils/timeadd.sql, macros/utils/table_operations/insert_rows.sql
Add vertica__edr_timeadd(date_part, number, timestamp_expression) using Vertica timestampadd; add vertica__escape_special_chars(string_value) to escape single quotes.
Jinja whitespace adjustments
macros/edr/system/system_utils/empty_table.sql
Tighten Jinja whitespace control markers ({%- / -%}) around loops and conditionals; no behavioral changes.
Fixtures / Profiles
\.github/fixtures/profiles.yml
Add Vertica-based elementary and elementary_tests profile entries sourcing connection values from environment variables.
Repo ignore
\.gitignore
Ignore rule added for .github/fixtures/.user.yml.

Sequence Diagram(s)

sequenceDiagram
    participant User as User / CI Trigger
    participant WF as GitHub Workflow
    participant Installer as Adapter Installer
    participant Compose as docker-compose-vertica
    participant DBT as dbt Runner

    Note over User,WF: workflow_dispatch / workflow_call with inputs.warehouse-type="vertica"
    User->>WF: inputs (warehouse-type=vertica, dbt-vertica-version)
    WF->>Installer: if warehouse-type == "vertica" -> Install dbt-vertica
    Installer-->>WF: dbt-vertica installed
    WF->>Compose: Start Vertica (docker-compose-vertica.yml, env SCHEMA_NAME, creds)
    Compose-->>WF: Vertica healthy (healthcheck)
    WF->>DBT: write profiles (replace <SCHEMA_NAME>) and run dbt commands
    DBT-->>WF: test results
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Pay special attention to:
    • macros/utils/cross_db_utils/timeadd.sql — correctness of timestampadd usage, date_part casing, and casting.
    • macros/utils/table_operations/insert_rows.sql — parity of escaping behavior with other dialect macros.
    • integration_tests/docker-compose-vertica.yml and related workflow Start/Wait steps — image/ports/healthcheck and env propagation (SCHEMA_NAME).
    • Workflow conditionals and profile substitution logic in .github/workflows/test-warehouse.yml.

Possibly related PRs

Suggested reviewers

  • elazarlachkar
  • haritamar

Poem

🐰
I hopped into CI and nudged a bright schema,
Vertica woke with coffee and cream,
Quotes tucked safe, timestamps ticking true,
Containers stretched legs and ran the queue,
A rabbit applauds: hoppity, merge-time gleam! 🎉

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'Vertica compat' is vague and generic, using abbreviated terminology that doesn't clearly convey what changes are being made to the codebase. Use a more descriptive title such as 'Add Vertica database support' or 'Add Vertica warehouse compatibility and integration tests' to clearly indicate the main change.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Contributor

👋 @jc00ke
Thank you for raising your pull request.
Please make sure to add tests and document all user-facing changes.
You can do this by editing the docs files in the elementary repository.

@jc00ke
Copy link
Author

jc00ke commented Nov 17, 2025

I'm going to debug in our own Actions so no need to run the workflows here yet. I'll report back when I think it's actually ready to review!

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 570dcd0 and c886e4c.

📒 Files selected for processing (2)
  • .github/workflows/test-warehouse.yml (6 hunks)
  • integration_tests/docker-compose-vertica.yml (1 hunks)
🔇 Additional comments (4)
integration_tests/docker-compose-vertica.yml (2)

20-21: Verify deployment mode configuration.

The deploy: mode: global setting is typically used in Docker Swarm mode to run a service on all nodes. For local development or CI environments using docker-compose, this setting is usually unnecessary and should either be removed or explicitly documented why it's needed.

For a single-instance Vertica container in CI, remove the deploy section:

-   deploy:
-     mode: global

Or document why this is necessary if it serves a specific purpose in your deployment model.


4-5: Remove hardcoded credentials from docker-compose file.

Credentials should not be committed to source control. Since this file is in the repository, the hardcoded VERTICA_USER, VERTICA_PASS, and app credentials are exposed and represent a security risk.

Additionally, the healthcheck command on Line 31 embeds credentials directly in the command string, which may be logged or exposed in logs/CI output.

Recommended approach: Use environment variable substitution to allow credentials to be injected via CI secrets or .env files:

  environment:
-   VERTICA_USER: dbadmin
-   VERTICA_PASS: vertica
+   VERTICA_USER: ${VERTICA_USER:-dbadmin}
+   VERTICA_PASS: ${VERTICA_PASS:-vertica}
    VERTICA_HOST: localhost
    VERTICA_PORT: 5433
    VERTICA_DATABASE: elementary_tests
    VERTICA_SCHEMA: ${SCHEMA_NAME}
-   APP_DB_USER: ${VERTICA_USER}
-   APP_DB_PASSWORD: ${VERTICA_PASS}
+   APP_DB_USER: ${APP_DB_USER:-${VERTICA_USER:-dbadmin}}
+   APP_DB_PASSWORD: ${APP_DB_PASSWORD:-${VERTICA_PASS:-vertica}}

And update the healthcheck to avoid embedding credentials:

  healthcheck:
-   test: ["CMD-SHELL", "/opt/vertica/bin/vsql -U dbadmin -w vertica -c 'SELECT 1;'"]
+   test: ["CMD-SHELL", "/opt/vertica/bin/vsql -U $${VERTICA_USER} -w $${VERTICA_PASS} -c 'SELECT 1;'"]

Note: In docker-compose, use $$ to escape environment variables in the shell command.

Also applies to: 10-11

⛔ Skipped due to learnings
Learnt from: GuyEshdat
Repo: elementary-data/dbt-data-reliability PR: 838
File: integration_tests/docker-compose-dremio.yml:36-44
Timestamp: 2025-08-10T11:29:23.299Z
Learning: When Docker Compose configurations are taken from official repositories (like dbt-dremio) and are working correctly in tests, prefer to keep them unchanged to maintain consistency with the official implementation, even if there are potential portability concerns with shell features like `/dev/tcp`.
Learnt from: GuyEshdat
Repo: elementary-data/dbt-data-reliability PR: 838
File: integration_tests/docker/dremio/dremio-setup.sh:3-11
Timestamp: 2025-08-10T11:29:20.155Z
Learning: When Docker setup scripts or configuration files are taken from official repositories (like dbt-dremio), they should be kept unchanged to maintain compatibility with the upstream source and ensure easier updates in the future, even if there are potential improvements that could be made.
.github/workflows/test-warehouse.yml (2)

163-165: Verify profiles.yml template consistency across all warehouse types.

The SCHEMA_NAME substitution now uses a sed command to replace <SCHEMA_NAME> placeholder in the profiles.yml content. This assumes all warehouse profiles (including Vertica profiles stored in secrets.CI_PROFILES_YML) contain this placeholder.

Please verify that the Vertica profile in your CI secrets (secrets.CI_PROFILES_YML) includes <SCHEMA_NAME> as a placeholder that can be substituted, consistent with the approach used for other warehouses. You can check this by examining your stored secrets configuration if accessible, or by confirming the placeholder is used consistently.


184-194: Verify Vertica startup and readiness check logic.

The startup steps look correct:

  • Line 187 references the correct docker-compose file (docker-compose-vertica.yml)
  • Line 193 uses docker inspect to check container health status, which aligns with the healthcheck defined in docker-compose (lines 30-34)
  • The timeout of 60 seconds matches the max healthcheck time (5s × 10 retries = 50s)

Please verify that:

  1. The container name in the docker inspect command (vertica) matches the container_name in docker-compose-vertica.yml (it should—currently both are vertica)
  2. The docker healthcheck status can reliably report as "healthy" after Vertica is fully initialized and accepting connections
  3. The 60-second timeout is sufficient for CI environments to start Vertica (you may need to adjust based on observed startup times in CI runs)

I use this for local dev via
`DBT_PROFILES_DIR="path/to/.github/fixtures/" and for GitHub Actions
secret `CI_PROFILES_YML`.

Linux+Wayland: `base64 .github/fixtures/profiles.yml | wl-copy`
MacOS: `base64 .github/fixtures/profiles.yml | pbcopy`
@jc00ke jc00ke temporarily deployed to elementary_test_env November 18, 2025 23:57 — with GitHub Actions Inactive
@jc00ke jc00ke had a problem deploying to elementary_test_env November 18, 2025 23:57 — with GitHub Actions Failure
@jc00ke jc00ke had a problem deploying to elementary_test_env November 18, 2025 23:57 — with GitHub Actions Failure
@jc00ke jc00ke temporarily deployed to elementary_test_env November 18, 2025 23:58 — with GitHub Actions Inactive
@jc00ke jc00ke temporarily deployed to elementary_test_env November 18, 2025 23:58 — with GitHub Actions Inactive
@jc00ke jc00ke had a problem deploying to elementary_test_env November 18, 2025 23:58 — with GitHub Actions Failure
@jc00ke jc00ke had a problem deploying to elementary_test_env November 18, 2025 23:58 — with GitHub Actions Failure
@jc00ke jc00ke had a problem deploying to elementary_test_env November 18, 2025 23:58 — with GitHub Actions Failure
@jc00ke jc00ke had a problem deploying to elementary_test_env November 18, 2025 23:58 — with GitHub Actions Failure
@jc00ke jc00ke temporarily deployed to elementary_test_env November 18, 2025 23:58 — with GitHub Actions Inactive
@jc00ke jc00ke temporarily deployed to elementary_test_env November 18, 2025 23:58 — with GitHub Actions Inactive
@jc00ke jc00ke had a problem deploying to elementary_test_env November 18, 2025 23:58 — with GitHub Actions Failure
@jc00ke jc00ke temporarily deployed to elementary_test_env November 18, 2025 23:58 — with GitHub Actions Inactive
@jc00ke jc00ke temporarily deployed to elementary_test_env November 18, 2025 23:58 — with GitHub Actions Inactive
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
.github/fixtures/profiles.yml (1)

1-29: Consider using YAML anchors to eliminate profile duplication.

The elementary and elementary_tests profiles are structurally identical. Consider using YAML anchors and aliases to reduce duplication and improve maintainability:

.vertica_base: &vertica_base
  type: vertica
  host: "{{ env_var('VERTICA_HOST') }}"
  port: "{{ env_var('VERTICA_PORT') | as_number }}"
  username: "{{ env_var('VERTICA_USER') }}"
  password: "{{ env_var('VERTICA_PASS') }}"
  database: "{{ env_var('VERTICA_DATABASE') }}"
  schema: "{{ env_var('SCHEMA_NAME') }}"
  connection_load_balance: false
  retries: 2
  threads: 4

elementary:
  target: vertica
  outputs:
    vertica: *vertica_base

elementary_tests:
  target: vertica
  outputs:
    vertica: *vertica_base
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between c886e4c and 2a996ef.

📒 Files selected for processing (3)
  • .github/fixtures/profiles.yml (1 hunks)
  • .gitignore (1 hunks)
  • integration_tests/docker-compose-vertica.yml (1 hunks)
🔇 Additional comments (5)
.gitignore (1)

7-7: Good practice to ignore fixture user files.

Gitignoring .github/fixtures/.user.yml is appropriate, as this file likely contains ephemeral or credentials-related data that should not be committed. This aligns with the security concern noted in the PR comments regarding fixture credential management.

integration_tests/docker-compose-vertica.yml (3)

20-21: Clarify the purpose of deploy: mode: global.

The deploy: mode: global setting is typically used for swarm-mode multi-node deployments. In a local testing context, this may be unnecessary or could cause unexpected behavior. Confirm whether this is intentional or should be removed for single-service test setup.


1-36: Ensure environment variable naming aligns across docker-compose and profiles.

The docker-compose defines VERTICA_SCHEMA (line 9) using ${SCHEMA_NAME}, but .github/fixtures/profiles.yml references SCHEMA_NAME directly in the schema field (lines 11, 26). Verify that the environment variable naming is consistent across both files and that CI/CD workflows properly set SCHEMA_NAME for both to reference.


16-16: Replace the third-party Vertica image with the official Vertica CE image.

The image ghcr.io/ratiopbc/vertica-ce is not an official Vertica (Community Edition) image; the official container images are vertica/vertica-ce and related images from Vertica/OpenText repositories. Using a third-party, unmaintained image for CI/CD testing introduces maintenance and security risks. Update line 16 to use the official image:

    image: vertica/vertica-ce

Verify the image tag and pull authentication in the official Vertica documentation to ensure compatibility with your test environment.

⛔ Skipped due to learnings
Learnt from: GuyEshdat
Repo: elementary-data/dbt-data-reliability PR: 838
File: integration_tests/docker-compose-dremio.yml:36-44
Timestamp: 2025-08-10T11:29:23.299Z
Learning: When Docker Compose configurations are taken from official repositories (like dbt-dremio) and are working correctly in tests, prefer to keep them unchanged to maintain consistency with the official implementation, even if there are potential portability concerns with shell features like `/dev/tcp`.
Learnt from: GuyEshdat
Repo: elementary-data/dbt-data-reliability PR: 838
File: integration_tests/docker/dremio/dremio-setup.sh:3-11
Timestamp: 2025-08-10T11:29:20.155Z
Learning: When Docker setup scripts or configuration files are taken from official repositories (like dbt-dremio), they should be kept unchanged to maintain compatibility with the upstream source and ensure easier updates in the future, even if there are potential improvements that could be made.
.github/fixtures/profiles.yml (1)

7-7: No issues found—code follows dbt-vertica best practices.

The | as_number filter is a built-in dbt/Jinja filter that dbt-vertica explicitly recommends for converting environment variables to integers for the port parameter. The code at lines 7 and 22 is correct and matches official documentation.

@jc00ke
Copy link
Author

jc00ke commented Nov 19, 2025

This is the most successful CI run I've gotten so far, and I'd say at this point I'm out of my depth when it comes to dbt and elementary.

If there are simple ways to test these failures, I can start looking into them, but I'd need some guidance. There's probably a few things that will fix most of those failures. Vertica is finicky about newlines and empty spaces at the beginning of lines, see 8b84461 for example.

@NoyaOffer
Copy link
Contributor

Hey @jc00ke ! Was great seeing you in Coalesce an appreciate the contribution 🙌
We'll take a deeper look and let you know if anything else is needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants