-
Notifications
You must be signed in to change notification settings - Fork 121
Vertica compat #892
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Vertica compat #892
Conversation
Without trimming the leading and trailing newlines, Vertica would fail to parse the
compiled SQL. For example, `models/edr/dbt_artifacts/dbt_columns`
compiles the following SQL, via `elementary.get_dbt_columns_empty_table_query`, `empty_table` and `empty_column`:
```sql
select * from (
select
cast('dummy_string' as varchar(4096)) as unique_id
,
cast('dummy_string' as varchar(4096)) as parent_unique_id
,
cast('dummy_string' as varchar(4096)) as name
,
cast('dummy_string' as varchar(4096)) as data_type
,
cast('this_is_just_a_long_dummy_string' as varchar(4096)) as tags
,
cast('this_is_just_a_long_dummy_string' as varchar(4096)) as meta
,
cast('dummy_string' as varchar(4096)) as database_name
,
cast('dummy_string' as varchar(4096)) as schema_name
,
cast('dummy_string' as varchar(4096)) as table_name
,
cast('this_is_just_a_long_dummy_string' as varchar(4096)) as description
,
cast('dummy_string' as varchar(4096)) as resource_type
,
cast('dummy_string' as varchar(4096)) as generated_at
,
cast('dummy_string' as varchar(4096)) as metadata_hash
) as empty_table
where 1 = 0
```
which would cause
```
SQL Error [4856] [42601]: [Vertica][VJDBC](4856) ERROR: Syntax error at or near ")" at character 1
```
By trimming the newlines, the SQL is much tighter:
```sql
select * from (
select
cast('dummy_string' as varchar(4096)) as unique_id,
cast('dummy_string' as varchar(4096)) as parent_unique_id,
cast('dummy_string' as varchar(4096)) as name,
cast('dummy_string' as varchar(4096)) as data_type,
cast('this_is_just_a_long_dummy_string' as varchar(4096)) as tags,
cast('this_is_just_a_long_dummy_string' as varchar(4096)) as meta,
cast('dummy_string' as varchar(4096)) as database_name,
cast('dummy_string' as varchar(4096)) as schema_name,
cast('dummy_string' as varchar(4096)) as table_name,
cast('this_is_just_a_long_dummy_string' as varchar(4096)) as description,
cast('dummy_string' as varchar(4096)) as resource_type,
cast('dummy_string' as varchar(4096)) as generated_at,
cast('dummy_string' as varchar(4096)) as metadata_hash
) as empty_table
where 1 = 0
```
and this runs in Vertica just fine.
This fixed 4 or 5 errors when running in my test project.
WalkthroughAdds Vertica support across CI and tests: workflow matrix and inputs, conditional dbt-vertica install and Vertica startup steps, a docker-compose file for Vertica, two Vertica-specific macros (timeadd and escape_special_chars), and minor Jinja whitespace tightening. Changes
Sequence Diagram(s)sequenceDiagram
participant User as User / CI Trigger
participant WF as GitHub Workflow
participant Installer as Adapter Installer
participant Compose as docker-compose-vertica
participant DBT as dbt Runner
Note over User,WF: workflow_dispatch / workflow_call with inputs.warehouse-type="vertica"
User->>WF: inputs (warehouse-type=vertica, dbt-vertica-version)
WF->>Installer: if warehouse-type == "vertica" -> Install dbt-vertica
Installer-->>WF: dbt-vertica installed
WF->>Compose: Start Vertica (docker-compose-vertica.yml, env SCHEMA_NAME, creds)
Compose-->>WF: Vertica healthy (healthcheck)
WF->>DBT: write profiles (replace <SCHEMA_NAME>) and run dbt commands
DBT-->>WF: test results
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
👋 @jc00ke |
|
I'm going to debug in our own Actions so no need to run the workflows here yet. I'll report back when I think it's actually ready to review! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
.github/workflows/test-warehouse.yml(6 hunks)integration_tests/docker-compose-vertica.yml(1 hunks)
🔇 Additional comments (4)
integration_tests/docker-compose-vertica.yml (2)
20-21: Verify deployment mode configuration.The
deploy: mode: globalsetting is typically used in Docker Swarm mode to run a service on all nodes. For local development or CI environments using docker-compose, this setting is usually unnecessary and should either be removed or explicitly documented why it's needed.For a single-instance Vertica container in CI, remove the deploy section:
- deploy: - mode: globalOr document why this is necessary if it serves a specific purpose in your deployment model.
4-5: Remove hardcoded credentials from docker-compose file.Credentials should not be committed to source control. Since this file is in the repository, the hardcoded
VERTICA_USER,VERTICA_PASS, and app credentials are exposed and represent a security risk.Additionally, the healthcheck command on Line 31 embeds credentials directly in the command string, which may be logged or exposed in logs/CI output.
Recommended approach: Use environment variable substitution to allow credentials to be injected via CI secrets or
.envfiles:environment: - VERTICA_USER: dbadmin - VERTICA_PASS: vertica + VERTICA_USER: ${VERTICA_USER:-dbadmin} + VERTICA_PASS: ${VERTICA_PASS:-vertica} VERTICA_HOST: localhost VERTICA_PORT: 5433 VERTICA_DATABASE: elementary_tests VERTICA_SCHEMA: ${SCHEMA_NAME} - APP_DB_USER: ${VERTICA_USER} - APP_DB_PASSWORD: ${VERTICA_PASS} + APP_DB_USER: ${APP_DB_USER:-${VERTICA_USER:-dbadmin}} + APP_DB_PASSWORD: ${APP_DB_PASSWORD:-${VERTICA_PASS:-vertica}}And update the healthcheck to avoid embedding credentials:
healthcheck: - test: ["CMD-SHELL", "/opt/vertica/bin/vsql -U dbadmin -w vertica -c 'SELECT 1;'"] + test: ["CMD-SHELL", "/opt/vertica/bin/vsql -U $${VERTICA_USER} -w $${VERTICA_PASS} -c 'SELECT 1;'"]Note: In docker-compose, use
$$to escape environment variables in the shell command.Also applies to: 10-11
⛔ Skipped due to learnings
Learnt from: GuyEshdat Repo: elementary-data/dbt-data-reliability PR: 838 File: integration_tests/docker-compose-dremio.yml:36-44 Timestamp: 2025-08-10T11:29:23.299Z Learning: When Docker Compose configurations are taken from official repositories (like dbt-dremio) and are working correctly in tests, prefer to keep them unchanged to maintain consistency with the official implementation, even if there are potential portability concerns with shell features like `/dev/tcp`.Learnt from: GuyEshdat Repo: elementary-data/dbt-data-reliability PR: 838 File: integration_tests/docker/dremio/dremio-setup.sh:3-11 Timestamp: 2025-08-10T11:29:20.155Z Learning: When Docker setup scripts or configuration files are taken from official repositories (like dbt-dremio), they should be kept unchanged to maintain compatibility with the upstream source and ensure easier updates in the future, even if there are potential improvements that could be made..github/workflows/test-warehouse.yml (2)
163-165: Verify profiles.yml template consistency across all warehouse types.The SCHEMA_NAME substitution now uses a sed command to replace
<SCHEMA_NAME>placeholder in the profiles.yml content. This assumes all warehouse profiles (including Vertica profiles stored insecrets.CI_PROFILES_YML) contain this placeholder.Please verify that the Vertica profile in your CI secrets (
secrets.CI_PROFILES_YML) includes<SCHEMA_NAME>as a placeholder that can be substituted, consistent with the approach used for other warehouses. You can check this by examining your stored secrets configuration if accessible, or by confirming the placeholder is used consistently.
184-194: Verify Vertica startup and readiness check logic.The startup steps look correct:
- Line 187 references the correct docker-compose file (
docker-compose-vertica.yml)- Line 193 uses docker inspect to check container health status, which aligns with the healthcheck defined in docker-compose (lines 30-34)
- The timeout of 60 seconds matches the max healthcheck time (5s × 10 retries = 50s)
Please verify that:
- The container name in the docker inspect command (
vertica) matches thecontainer_nameindocker-compose-vertica.yml(it should—currently both arevertica)- The docker healthcheck status can reliably report as "healthy" after Vertica is fully initialized and accepting connections
- The 60-second timeout is sufficient for CI environments to start Vertica (you may need to adjust based on observed startup times in CI runs)
I use this for local dev via `DBT_PROFILES_DIR="path/to/.github/fixtures/" and for GitHub Actions secret `CI_PROFILES_YML`. Linux+Wayland: `base64 .github/fixtures/profiles.yml | wl-copy` MacOS: `base64 .github/fixtures/profiles.yml | pbcopy`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
.github/fixtures/profiles.yml (1)
1-29: Consider using YAML anchors to eliminate profile duplication.The
elementaryandelementary_testsprofiles are structurally identical. Consider using YAML anchors and aliases to reduce duplication and improve maintainability:.vertica_base: &vertica_base type: vertica host: "{{ env_var('VERTICA_HOST') }}" port: "{{ env_var('VERTICA_PORT') | as_number }}" username: "{{ env_var('VERTICA_USER') }}" password: "{{ env_var('VERTICA_PASS') }}" database: "{{ env_var('VERTICA_DATABASE') }}" schema: "{{ env_var('SCHEMA_NAME') }}" connection_load_balance: false retries: 2 threads: 4 elementary: target: vertica outputs: vertica: *vertica_base elementary_tests: target: vertica outputs: vertica: *vertica_base
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
.github/fixtures/profiles.yml(1 hunks).gitignore(1 hunks)integration_tests/docker-compose-vertica.yml(1 hunks)
🔇 Additional comments (5)
.gitignore (1)
7-7: Good practice to ignore fixture user files.Gitignoring
.github/fixtures/.user.ymlis appropriate, as this file likely contains ephemeral or credentials-related data that should not be committed. This aligns with the security concern noted in the PR comments regarding fixture credential management.integration_tests/docker-compose-vertica.yml (3)
20-21: Clarify the purpose ofdeploy: mode: global.The
deploy: mode: globalsetting is typically used for swarm-mode multi-node deployments. In a local testing context, this may be unnecessary or could cause unexpected behavior. Confirm whether this is intentional or should be removed for single-service test setup.
1-36: Ensure environment variable naming aligns across docker-compose and profiles.The docker-compose defines
VERTICA_SCHEMA(line 9) using${SCHEMA_NAME}, but.github/fixtures/profiles.ymlreferencesSCHEMA_NAMEdirectly in the schema field (lines 11, 26). Verify that the environment variable naming is consistent across both files and that CI/CD workflows properly setSCHEMA_NAMEfor both to reference.
16-16: Replace the third-party Vertica image with the official Vertica CE image.The image
ghcr.io/ratiopbc/vertica-ceis not an official Vertica (Community Edition) image; the official container images arevertica/vertica-ceand related images from Vertica/OpenText repositories. Using a third-party, unmaintained image for CI/CD testing introduces maintenance and security risks. Update line 16 to use the official image:image: vertica/vertica-ceVerify the image tag and pull authentication in the official Vertica documentation to ensure compatibility with your test environment.
⛔ Skipped due to learnings
Learnt from: GuyEshdat Repo: elementary-data/dbt-data-reliability PR: 838 File: integration_tests/docker-compose-dremio.yml:36-44 Timestamp: 2025-08-10T11:29:23.299Z Learning: When Docker Compose configurations are taken from official repositories (like dbt-dremio) and are working correctly in tests, prefer to keep them unchanged to maintain consistency with the official implementation, even if there are potential portability concerns with shell features like `/dev/tcp`.Learnt from: GuyEshdat Repo: elementary-data/dbt-data-reliability PR: 838 File: integration_tests/docker/dremio/dremio-setup.sh:3-11 Timestamp: 2025-08-10T11:29:20.155Z Learning: When Docker setup scripts or configuration files are taken from official repositories (like dbt-dremio), they should be kept unchanged to maintain compatibility with the upstream source and ensure easier updates in the future, even if there are potential improvements that could be made..github/fixtures/profiles.yml (1)
7-7: No issues found—code follows dbt-vertica best practices.The
| as_numberfilter is a built-in dbt/Jinja filter that dbt-vertica explicitly recommends for converting environment variables to integers for the port parameter. The code at lines 7 and 22 is correct and matches official documentation.
|
This is the most successful CI run I've gotten so far, and I'd say at this point I'm out of my depth when it comes to dbt and elementary. If there are simple ways to test these failures, I can start looking into them, but I'd need some guidance. There's probably a few things that will fix most of those failures. Vertica is finicky about newlines and empty spaces at the beginning of lines, see 8b84461 for example. |
|
Hey @jc00ke ! Was great seeing you in Coalesce an appreciate the contribution 🙌 |
Hi there!
I spoke with the team at Coalesce about adding Vertica support. This is my initial attempt, and TBH, it's my initial foray into dbt packages, so I appreciate any and all help provided! 🙏
Summary by CodeRabbit
New Features
Tests
Chores
Style