Add exclude_detection_period_from_training flag to dimension anomaly test #890
Conversation
…test

- Added `exclude_detection_period_from_training` parameter to `test_dimension_anomalies` macro signature with default value `false`
- Passed the parameter through to `get_anomalies_test_configuration`
- This brings dimension anomalies in line with table/volume anomalies, which already support this flag
- The underlying logic in `get_anomaly_scores_query.sql` already handles this parameter for all anomaly types

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>
Walkthrough

Adds an optional `exclude_detection_period_from_training` flag to the `test_dimension_anomalies` macro and passes it through to the test configuration.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor IT as Integration Test
    participant MAC as test_dimension_anomalies macro
    participant CFG as Generated Config
    participant EXEC as Test Executor
    Note over IT,MAC: Two scenarios: FLAG=false (include), FLAG=true (exclude)
    IT->>MAC: call test_dimension_anomalies(..., exclude_detection_period_from_training=FLAG)
    MAC->>CFG: construct config { ..., exclude_detection_period_from_training: FLAG }
    CFG->>EXEC: run anomaly detection test with config
    alt FLAG = false
        EXEC->>EXEC: include detection period in training (anomaly may be masked)
        EXEC-->>IT: test result (expected pass)
    else FLAG = true
        EXEC->>EXEC: exclude detection period from training (anomaly remains detectable)
        EXEC-->>IT: test result (expected fail)
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
…mension anomalies

- Added `test_dimension_exclude_detection_from_training` to demonstrate the flag's behavior
- Test shows that without exclusion, the anomaly is missed (test passes) because training includes the detection period
- Test shows that with exclusion, the anomaly is detected (test fails) because training excludes the detection period
- Uses 30 days of normal data with variance (45/50/55 pattern) and 7 days of anomalous data (72/28 distribution)
- Follows the same pattern as `test_exclude_detection_from_training` in `test_volume_anomalies.py`

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>
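The data pattern described in this commit can be sketched roughly as follows. This is a minimal illustration, not the actual test code: the helper name and the per-day aggregated row shape are assumptions made for the sketch.

```python
from datetime import datetime, timedelta

def generate_test_data():
    """Build 30 days of normal rows plus 7 days of anomalous rows.

    Normal days cycle Superman's count through a 45/50/55 pattern
    (Spiderman gets the complement of 100); anomalous days use a fixed
    72/28 Superman/Spiderman split.
    """
    utc_today = datetime.utcnow().date()
    rows = []
    normal_pattern = [45, 50, 55]
    # 30 normal days: days -37 .. -8 relative to today (before detection period)
    for i in range(30):
        day = utc_today - timedelta(days=37 - i)
        superman = normal_pattern[i % len(normal_pattern)]
        rows.append({"created_at": day.isoformat(), "superhero": "Superman", "count": superman})
        rows.append({"created_at": day.isoformat(), "superhero": "Spiderman", "count": 100 - superman})
    # 7 anomalous days: days -7 .. -1 (the detection period)
    for i in range(7):
        day = utc_today - timedelta(days=7 - i)
        rows.append({"created_at": day.isoformat(), "superhero": "Superman", "count": 72})
        rows.append({"created_at": day.isoformat(), "superhero": "Spiderman", "count": 28})
    return rows
```

The 45/50/55 variance matters: a baseline with zero variance would flag any deviation, so the test needs normal fluctuation to show that only the 72/28 shift is anomalous.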
Actionable comments posted: 0
🧹 Nitpick comments (1)
integration_tests/tests/test_dimension_anomalies.py (1)
229-326: Excellent test implementation with clear documentation.

The test correctly validates the `exclude_detection_period_from_training` flag behavior with well-structured data and clear expectations. The logic is sound: when exclusion is disabled, the anomalous detection-period data contaminates the training baseline (causing the test to pass and miss the anomaly), and when enabled, the clean baseline correctly detects the anomaly (causing the test to fail).

Optional: Consider minor consistency improvements:

- Line 241 uses `datetime.utcnow()` while other tests in this file use `datetime.utcnow().date()` (e.g., lines 37, 59, 79). Although the code works correctly since the datetime is formatted to a date string, using `.date()` would improve consistency.
- The test only validates pass/fail status but doesn't verify which dimension values were flagged as anomalous (unlike `test_anomalous_dimension_anomalies` at lines 105-114). Adding verification of anomaly details would make the test more thorough:

```python
# After line 321, add:
anomaly_test_points = get_latest_anomaly_test_points(dbt_project, test_id + "_with_exclusion")
dimension_values = set(x["dimension_value"] for x in anomaly_test_points)
assert "Superman" in dimension_values, "Superman should be flagged as anomalous"
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
`integration_tests/tests/test_dimension_anomalies.py` (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: arbiv
Repo: elementary-data/dbt-data-reliability PR: 877
File: macros/edr/data_monitoring/anomaly_detection/get_anomaly_scores_query.sql:34-38
Timestamp: 2025-10-23T11:01:31.528Z
Learning: In the dbt-data-reliability project, `backfill_days` is a mandatory parameter for anomaly detection tests (volume_anomalies, freshness_anomalies, etc.). It is validated by the `validate_mandatory_configuration` macro in `get_anomalies_test_configuration.sql` and will cause a compiler error if not provided. Therefore, it's always guaranteed to be set when the anomaly detection logic runs.
Applied to files:
integration_tests/tests/test_dimension_anomalies.py
🧬 Code graph analysis (1)
integration_tests/tests/test_dimension_anomalies.py (2)
integration_tests/tests/conftest.py (2)
`test_id` (180-183), `dbt_project` (144-147)
integration_tests/tests/dbt_project.py (4)
`DbtProject` (46-334), `test` (110-128), `test` (131-149), `test` (151-257)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: test (latest_official, clickhouse) / test
- GitHub Check: test (latest_official, snowflake) / test
- GitHub Check: test (fusion, snowflake) / test
- GitHub Check: test (fusion, redshift) / test
- GitHub Check: test (fusion, databricks_catalog) / test
- GitHub Check: test (latest_official, postgres) / test
- GitHub Check: test (latest_official, trino) / test
- GitHub Check: test (latest_official, bigquery) / test
- GitHub Check: test (fusion, bigquery) / test
- GitHub Check: test (latest_official, redshift) / test
- GitHub Check: test (latest_pre, postgres) / test
- GitHub Check: test (latest_official, databricks_catalog) / test
- GitHub Check: test (latest_official, athena) / test
- GitHub Check: test (latest_official, dremio) / test
…uffixes to avoid Postgres 63-char limit

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>
Actionable comments posted: 0
🧹 Nitpick comments (1)
integration_tests/tests/test_dimension_anomalies.py (1)
241-269: Clarify training/detection date arithmetic to avoid magic numbers.

The `37 - i` and `7 - i` offsets work, but they're non-obvious to readers and couple the loop math tightly to the period lengths. You could make this easier to reason about by deriving explicit training and detection start dates, e.g.:

```diff
-    utc_now = datetime.utcnow()
-
-    # Generate 30 days of normal data with variance (45/50/55 pattern for Superman)
-    normal_pattern = [45, 50, 55]
-    normal_data = []
-    for i in range(30):
-        date = utc_now - timedelta(days=37 - i)
+    utc_today = datetime.utcnow().date()
+    detection_start = utc_today - timedelta(days=7)
+    training_start = detection_start - timedelta(days=30)
+
+    # Generate 30 days of normal data with variance (45/50/55 pattern for Superman)
+    normal_pattern = [45, 50, 55]
+    normal_data = []
+    for i in range(30):
+        date = training_start + timedelta(days=i)
@@
-    # Generate 7 days of anomalous data (72 Superman, 28 Spiderman per day) - this will be in detection period
-    anomalous_data = []
-    for i in range(7):
-        date = utc_now - timedelta(days=7 - i)
+    # Generate 7 days of anomalous data (72 Superman, 28 Spiderman per day) - this will be in detection period
+    anomalous_data = []
+    for i in range(7):
+        date = detection_start + timedelta(days=i)
```

This keeps the intent ("30 days before detection", "7 days of detection") explicit and makes future adjustments to window sizes less error-prone, while preserving the same date ranges.
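The equivalence of the two formulations can be checked directly. This sketch compares the magic-offset arithmetic against the explicit-window version suggested in the review and confirms they produce the same dates:

```python
from datetime import date, timedelta

def old_dates(today: date):
    """Original arithmetic: magic offsets 37 - i and 7 - i."""
    normal = [today - timedelta(days=37 - i) for i in range(30)]
    anomalous = [today - timedelta(days=7 - i) for i in range(7)]
    return normal, anomalous

def new_dates(today: date):
    """Refactored arithmetic: explicit window start dates."""
    detection_start = today - timedelta(days=7)
    training_start = detection_start - timedelta(days=30)
    normal = [training_start + timedelta(days=i) for i in range(30)]
    anomalous = [detection_start + timedelta(days=i) for i in range(7)]
    return normal, anomalous
```

Both versions yield a contiguous 37-day range ending yesterday, with the last 7 days forming the detection period.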
🔇 Additional comments (1)
integration_tests/tests/test_dimension_anomalies.py (1)
223-326: Solid end-to-end coverage of `exclude_detection_period_from_training` behavior.

This test cleanly exercises both paths (default include vs. explicit exclude), uses distinct `test_id`s to isolate runs, and constructs training vs. detection windows in a way that aligns with the configured `training_period`, `detection_period`, and `time_bucket`. Assertions on overall test status (`pass` vs. `fail`) match the intended semantics of the new flag. I don't see any functional or flakiness concerns here.
…tion_period

- Rename `test_dimension_exclude_detection_from_training` to `test_anomaly_in_detection_period`
- Add `@pytest.mark.parametrize` decorator with `exclude_detection` and `expected_status` parameters
- Use descriptive IDs: `include_detection_in_training` and `exclude_detection_from_training`
- Consolidate two test cases into one parametrized test for better maintainability
- Addresses reviewer feedback on PR #890

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>
…move redundant suffix

- Change parametrize IDs from `include_detection_in_training`/`exclude_detection_from_training` to `exclude_false`/`exclude_true`
- Remove redundant suffix (`_incl`/`_excl`) since pytest parametrize IDs already differentiate test cases
- New table names: `test_anomaly_in_detection_period_exclude_false` (44 chars) and `test_anomaly_in_detection_period_exclude_true` (43 chars)
- Both are well under the Postgres 63-character limit
- Fixes CI failures on Postgres (latest_official and latest_pre)

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>
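The identifier-length constraint behind this rename can be checked mechanically. Postgres truncates identifiers longer than 63 bytes, which can silently collide long test-generated table names; the names below are taken from the commit message, and the check itself is a sketch:

```python
# Postgres's NAMEDATALEN default gives a 63-byte identifier limit;
# anything longer is silently truncated, risking table-name collisions.
POSTGRES_MAX_IDENTIFIER_BYTES = 63

table_names = [
    "test_anomaly_in_detection_period_exclude_false",
    "test_anomaly_in_detection_period_exclude_true",
]

oversized = [n for n in table_names if len(n.encode("utf-8")) > POSTGRES_MAX_IDENTIFIER_BYTES]
assert not oversized, f"identifiers over the Postgres limit: {oversized}"
```

A check like this could run in CI to catch over-long generated names before they reach a Postgres target.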
Actionable comments posted: 1
🔇 Additional comments (5)
integration_tests/tests/test_dimension_anomalies.py (5)
223-239: LGTM! Well-structured parametrized test.

The test follows the suggestion from previous review comments to use `@pytest.mark.parametrize` with both flag states and expected outcomes. The documentation clearly explains the behavior, and the shortened IDs appropriately address the Postgres identifier length constraint.
240-256: LGTM! Clear function signature and comprehensive docstring.

The function signature correctly matches the parametrized inputs, and the docstring thoroughly documents the test scenario and expected behavior for both flag states.
282-300: LGTM! Anomalous data pattern is appropriate.

The anomalous data generation (72% Superman vs. ~50% baseline) provides a clear deviation that should be detectable when the detection period is excluded from training. The 7-day range correctly aligns with the `detection_period` parameter.
304-313: Rewritten review comment is incorrect: the test configuration is valid.

The script output confirms that `training_period` and `detection_period` are not bypassing the `backfill_days` requirement. Instead, they are converted to `backfill_days` and `days_back` through dedicated macros in `get_anomalies_test_configuration.sql` (lines 40-48):

- `detection_period: {"period": "day", "count": 7}` → `backfill_days=7`
- `training_period: {"period": "day", "count": 30}` → `days_back=30`

The configuration processor then validates that `backfill_days` is set (line 90), which it is after conversion. Multiple tests across the codebase (`test_volume_anomalies.py`, `test_anomaly_test_configuration.py`) use this exact pattern successfully.

This is intentional design: the new API accepts user-friendly period specifications instead of raw day counts.
302-321: Based on verification through the shell script output showing how `exclude_detection_period_from_training` works in the Elementary codebase, I can now provide an accurate assessment.

The review comment is based on an incorrect understanding of the `exclude_detection_period_from_training` logic.

The original concern assumed that `exclude_detection_period_from_training=True` would exclude the full `detection_period` (7 days) from training. However, the actual implementation in `get_anomaly_scores_query.sql` shows that exclusion uses `backfill_days` (which defaults to 2 days), not `detection_period`. For incremental models, this is also the period for re-calculating metrics.

With the actual logic:

- Exclusion window: last 2 days (`backfill_days` default)
- Training data available: ~28 days out of the 37 days generated
- This is sufficient for the 30-day `training_period` requested

The data generation is appropriate, and the concerns about insufficient training data do not apply.

Likely an incorrect or invalid review comment.
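The window arithmetic above can be sketched as a simplified model, assuming the flag trims the training window by `backfill_days` (the helper name and this reduction of the SQL logic are assumptions for illustration):

```python
def training_days_available(days_back: int, backfill_days: int, exclude_detection: bool) -> int:
    """Simplified model of how many days remain for training.

    When the exclusion flag is set, the last `backfill_days` days of the
    lookback window are withheld from the training baseline.
    """
    return days_back - backfill_days if exclude_detection else days_back
```

With the review's numbers, a 30-day lookback minus the default 2-day backfill leaves 28 training days, which explains the "~28 out of 37" figure above.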
…tetime

Change `utc_now` from `datetime.utcnow()` to `datetime.utcnow().date()` to match the pattern used in other tests. Date arithmetic already works correctly with date objects.
Actionable comments posted: 0
🧹 Nitpick comments (1)
integration_tests/tests/test_dimension_anomalies.py (1)
257-280: Normal data generation looks correct.

The datetime issue from the previous review has been properly resolved. The data generation creates 30 days of normal data with appropriate variance (45/50/55 pattern) placed before the detection period.

Optional suggestion: Consider adding a comment clarifying the date range calculation for readability:

```python
# Generate 30 days of normal data (days -37 to -8, before detection period)
for i in range(30):
    date = utc_now - timedelta(days=37 - i)
```
🔇 Additional comments (4)
integration_tests/tests/test_dimension_anomalies.py (4)
223-239: LGTM! Well-structured parametrization.

The test parametrization is clear and follows best practices:

- Descriptive parameter names that convey the test scenarios
- Shortened IDs documented to avoid Postgres identifier limits
- Clear comments explaining the expected behavior for each case
240-256: Excellent documentation!

The test function signature and docstring are well-crafted:

- Clear, descriptive function name that reflects the test purpose
- Comprehensive docstring explaining the scenario and expected outcomes
- Helpful note about the Postgres identifier limit
282-300: Anomalous data generation is well-designed.

The 7 days of anomalous data (72% Superman vs. the normal ~50%) create a clear distributional shift that should be detectable when the detection period is excluded from training. The date placement (days -7 to -1) correctly aligns with the intended detection period.
302-320: Test configuration and exclusion flag implementation are correct.

The test properly parametrizes the `exclude_detection_period_from_training` behavior:

- When `exclude_detection=False`: flag not added → detection period included in training → anomaly absorbed → test passes
- When `exclude_detection=True`: flag set to `True` → detection period excluded from training → anomaly detected → test fails

The macro implementations correctly support this flag through the entire call stack: `test_dimension_anomalies.sql` → `get_anomalies_test_configuration.sql` → `get_anomaly_scores_query.sql`, where metrics are properly excluded from training statistics when the flag is enabled.

The test data generation (30 days of the 45/50/55 Superman pattern, then 7 days of a 72/28 anomalous split) correctly establishes the scenario needed to verify the flag's functionality.

Please manually verify this test passes across all supported warehouse targets (Postgres, Snowflake, BigQuery) to confirm the end-to-end behavior matches the expected semantics.
Add exclude_detection_period_from_training flag to dimension anomaly test
Summary
This PR adds the `exclude_detection_period_from_training` parameter to the `test_dimension_anomalies` macro and includes a parametrized integration test demonstrating its functionality. The parameter allows users to exclude recent detection-period data from the training baseline, which helps detect anomalies that would otherwise be absorbed into the baseline.

Changes:

- Added the `exclude_detection_period_from_training` parameter to the `test_dimension_anomalies` macro signature (defaults to `false`)
- Passed the parameter through to `get_anomalies_test_configuration`
- Added the `test_anomaly_in_detection_period` integration test using `@pytest.mark.parametrize` to test both flag states

Test Behavior:
The integration test creates a scenario with:

- 30 days of normal data with variance (a 45/50/55 pattern for Superman)
- 7 days of anomalous data (a 72/28 Superman/Spiderman distribution) in the detection period

Expected outcomes:

- `exclude_detection=False` → test PASSES (anomaly absorbed into the training baseline)
- `exclude_detection=True` → test FAILS (anomaly detected because it is excluded from training)
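The two cases can be sketched as follows. The real test uses `@pytest.mark.parametrize` with ids `exclude_false`/`exclude_true`; here the cases are a plain list, and `run_anomaly_test` is a placeholder that only models the flag's expected semantics rather than invoking dbt:

```python
# Simplified stand-in for the parametrized integration test.
CASES = [
    ("exclude_false", False, "pass"),  # anomaly absorbed into training baseline
    ("exclude_true", True, "fail"),    # anomaly detected against clean baseline
]

def run_anomaly_test(exclude_detection: bool) -> str:
    """Placeholder for the dbt invocation; models the flag's expected outcome."""
    return "fail" if exclude_detection else "pass"

for case_id, exclude_detection, expected_status in CASES:
    assert run_anomaly_test(exclude_detection) == expected_status, case_id
```

The inverted expectation (the dbt test "failing" is the desired result when the flag is on) is the same convention used by the volume anomaly tests.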
This is a YELLOW-risk PR: the core logic was implemented in a previous session, but the integration test needs verification:

- The test expects `exclude_detection=False` to PASS (anomaly missed) and `exclude_detection=True` to FAIL (anomaly detected). This mirrors the volume anomaly test behavior but may seem counterintuitive; verify this is the intended semantic
- Verify that `get_anomalies_test_configuration` and `get_anomaly_scores_query` actually respect the `exclude_detection_period_from_training` flag (this was implemented in a previous session but not verified in this PR)

Test Plan

- Run `pytest integration_tests/tests/test_dimension_anomalies.py::test_anomaly_in_detection_period -v`
- Covers both parametrized cases: `include_detection_in_training` and `exclude_detection_from_training`