Add exclude_detection_period_from_training flag to column anomalies tests #889
Conversation
…ests Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>
🤖 Devin AI Engineer: I'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
👋 @devin-ai-integration[bot]
Walkthrough
Added a boolean parameter `exclude_detection_period_from_training` (default: false) to the column anomalies test macros, passed through to the test configuration.
Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Caller as Macro caller
    participant Macro as column/all_columns_anomalies
    participant Config as get_anomalies_test_configuration
    participant Runner as Test runner
    rect rgb(245,250,255)
        Note over Macro: new param\nexclude_detection_period_from_training (default: false)
    end
    Caller->>Macro: invoke(..., exclude_detection_period_from_training=val)
    Macro->>Config: get_anomalies_test_configuration(..., exclude_detection_period_from_training=val)
    Config-->>Macro: test_configuration
    Macro->>Runner: execute test using test_configuration
    Runner-->>Caller: test result (PASS / FAIL)
```
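The call chain in the diagram can be mocked in a few lines of Python. This is an illustrative sketch only: the real implementation is Jinja/SQL dbt macros, and the function names here merely mirror the diagram.

```python
# Illustrative Python mock of the macro call chain from the diagram above.
# The real implementation is Jinja/SQL in the dbt package; these functions
# only mirror the names shown in the diagram.

def get_anomalies_test_configuration(exclude_detection_period_from_training=False, **params):
    # The flag travels into the test configuration alongside the other params.
    return {
        "exclude_detection_period_from_training": exclude_detection_period_from_training,
        **params,
    }

def column_anomalies(exclude_detection_period_from_training=False, **params):
    # The macro forwards the flag; the False default keeps backward compatibility.
    return get_anomalies_test_configuration(
        exclude_detection_period_from_training=exclude_detection_period_from_training,
        **params,
    )

# Omitting the flag preserves the old behavior:
config = column_anomalies(anomaly_sensitivity=3)
assert config["exclude_detection_period_from_training"] is False

# Opting in flips it in the resulting test configuration:
config = column_anomalies(exclude_detection_period_from_training=True)
assert config["exclude_detection_period_from_training"] is True
```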
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing touches
📜 Recent review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
📒 Files selected for processing (1)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
📚 Learning: 2025-10-23T11:01:31.528Z
Applied to files:
- integration_tests/tests/test_column_anomalies.py
🧬 Code graph analysis (1)
integration_tests/tests/test_column_anomalies.py (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
🔇 Additional comments (3)
….sql Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>
Test demonstrates that:
- When flag=False: detection period data is included in training baseline, preventing anomaly detection
- When flag=True: detection period data is excluded from training baseline, enabling anomaly detection

The test uses constrained time windows (1 day training/detection) to make the behavior deterministic.

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>
Actionable comments posted: 0
🧹 Nitpick comments (2)
integration_tests/tests/test_column_anomalies.py (2)
503-537: Consider extracting common test arguments to reduce duplication.

The `test_args_false` and `test_args_true` dictionaries are identical except for the `exclude_detection_period_from_training` parameter. This duplication could be reduced. Apply this diff to extract common arguments:

```diff
+    base_test_args = {
+        "timestamp_column": TIMESTAMP_COLUMN,
+        "column_anomalies": ["null_count"],
+        "time_bucket": {"period": "day", "count": 1},
+        "training_period": {"period": "day", "count": 1},
+        "detection_period": {"period": "day", "count": 1},
+        "min_training_set_size": 1,
+        "anomaly_sensitivity": 3,
+        "anomaly_direction": "spike",
+    }
+
-    test_args_false = {
-        "timestamp_column": TIMESTAMP_COLUMN,
-        "column_anomalies": ["null_count"],
-        "time_bucket": {"period": "day", "count": 1},
-        "training_period": {"period": "day", "count": 1},
-        "detection_period": {"period": "day", "count": 1},
-        "min_training_set_size": 1,
-        "anomaly_sensitivity": 3,
-        "anomaly_direction": "spike",
-        "exclude_detection_period_from_training": False,
-    }
+    test_args_false = {
+        **base_test_args,
+        "exclude_detection_period_from_training": False,
+    }
     test_result_false = dbt_project.test(
         test_id,
         DBT_TEST_NAME,
         test_args_false,
         data=data,
         test_column="superhero",
         test_vars={"force_metrics_backfill": True},
     )
     assert test_result_false["status"] == "pass", (
         "Expected PASS when exclude_detection_period_from_training=False "
         "(detection data included in training baseline)"
     )
-    test_args_true = {
-        "timestamp_column": TIMESTAMP_COLUMN,
-        "column_anomalies": ["null_count"],
-        "time_bucket": {"period": "day", "count": 1},
-        "training_period": {"period": "day", "count": 1},
-        "detection_period": {"period": "day", "count": 1},
-        "min_training_set_size": 1,
-        "anomaly_sensitivity": 3,
-        "anomaly_direction": "spike",
-        "exclude_detection_period_from_training": True,
-    }
+    test_args_true = {
+        **base_test_args,
+        "exclude_detection_period_from_training": True,
+    }
```
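As a side note, the refactor suggested in the diff relies on Python's dict-unpacking merge semantics, where later keys override earlier ones. A minimal standalone sketch (using a trimmed-down argument dict for brevity):

```python
# Minimal sketch of the dict-merge pattern from the suggested refactor:
# with **, later keys override earlier ones, so each variant only states
# the flag it changes. (Args trimmed down for brevity.)
base_test_args = {
    "anomaly_sensitivity": 3,
    "anomaly_direction": "spike",
}
test_args_false = {**base_test_args, "exclude_detection_period_from_training": False}
test_args_true = {**base_test_args, "exclude_detection_period_from_training": True}

# Shared keys are inherited unchanged; only the flag differs between the two.
assert test_args_false["anomaly_direction"] == "spike"
assert test_args_false["exclude_detection_period_from_training"] is False
assert test_args_true["exclude_detection_period_from_training"] is True
```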
486-501: Consider adding a comment explaining the test scenario.

The test uses `training_period: 1 day` but generates 30 days of training data. While this is correct (only the most recent day of training data is used, based on the training_period parameter), a brief comment would clarify the test setup for future maintainers. Consider adding a comment like:

```python
utc_today = datetime.utcnow().date()
test_date, *training_dates = generate_dates(base_date=utc_today - timedelta(1))

# Generate 30 days of training data (Superman/Batman, no nulls)
# With training_period=1 day, only the most recent training day will be used as baseline
data: List[Dict[str, Any]] = [ ... ]

# Add anomalous test data: 10 nulls on the detection day
data += [ ... ]
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
integration_tests/tests/test_column_anomalies.py (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: arbiv
Repo: elementary-data/dbt-data-reliability PR: 877
File: macros/edr/data_monitoring/anomaly_detection/get_anomaly_scores_query.sql:34-38
Timestamp: 2025-10-23T11:01:31.528Z
Learning: In the dbt-data-reliability project, `backfill_days` is a mandatory parameter for anomaly detection tests (volume_anomalies, freshness_anomalies, etc.). It is validated by the `validate_mandatory_configuration` macro in `get_anomalies_test_configuration.sql` and will cause a compiler error if not provided. Therefore, it's always guaranteed to be set when the anomaly detection logic runs.
📚 Learning: 2025-10-23T11:01:31.528Z
Learnt from: arbiv
Repo: elementary-data/dbt-data-reliability PR: 877
File: macros/edr/data_monitoring/anomaly_detection/get_anomaly_scores_query.sql:34-38
Timestamp: 2025-10-23T11:01:31.528Z
Learning: In the dbt-data-reliability project, `backfill_days` is a mandatory parameter for anomaly detection tests (volume_anomalies, freshness_anomalies, etc.). It is validated by the `validate_mandatory_configuration` macro in `get_anomalies_test_configuration.sql` and will cause a compiler error if not provided. Therefore, it's always guaranteed to be set when the anomaly detection logic runs.
Applied to files:
integration_tests/tests/test_column_anomalies.py
🧬 Code graph analysis (1)
integration_tests/tests/test_column_anomalies.py (3)
integration_tests/tests/conftest.py (2)
test_id (180-183)
dbt_project (144-147)
integration_tests/tests/dbt_project.py (4)
DbtProject (46-334)
test (110-128)
test (131-149)
test (151-257)
integration_tests/tests/data_generator.py (1)
generate_dates (6-17)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: test (fusion, snowflake) / test
- GitHub Check: test (latest_official, snowflake) / test
- GitHub Check: test (latest_official, postgres) / test
- GitHub Check: test (latest_official, bigquery) / test
- GitHub Check: test (fusion, bigquery) / test
- GitHub Check: test (latest_pre, postgres) / test
- GitHub Check: test (latest_official, clickhouse) / test
- GitHub Check: test (latest_official, redshift) / test
- GitHub Check: test (latest_official, trino) / test
- GitHub Check: test (latest_official, databricks_catalog) / test
- GitHub Check: test (fusion, redshift) / test
- GitHub Check: test (latest_official, athena) / test
- GitHub Check: test (fusion, databricks_catalog) / test
- GitHub Check: test (latest_official, dremio) / test
🔇 Additional comments (1)
integration_tests/tests/test_column_anomalies.py (1)
481-549: LGTM! Test correctly validates the exclude_detection_period_from_training feature.

The test logic is sound and follows the pattern established by other tests in this file:
- When `exclude_detection_period_from_training=False`, the detection period data (10 nulls) is included in the training baseline, making the nulls appear normal → PASS
- When `exclude_detection_period_from_training=True`, the detection period is excluded from the baseline (only 2 non-null rows from the training period), making the 10 nulls anomalous → FAIL

The use of `force_metrics_backfill=True` ensures proper test isolation when calling `dbt_project.test` twice with the same `test_id`.
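The PASS/FAIL asymmetry described above can be sketched numerically. The snippet below is an illustrative model only — the real scoring lives in the dbt macros (e.g. `get_anomaly_scores_query`) — using a plain z-score as a stand-in for the actual anomaly score:

```python
import statistics

# Illustrative model: a plain z-score stands in for the real anomaly score.
# The numbers mirror the test scenario: training buckets show 0 nulls,
# while the detection bucket has 10.
training_null_counts = [0, 0]
detection_null_count = 10

def anomaly_score(baseline, observed):
    mean = statistics.mean(baseline)
    stdev = statistics.pstdev(baseline)
    if stdev == 0:
        # Any deviation from a flat baseline is maximally anomalous.
        return float("inf") if observed != mean else 0.0
    return (observed - mean) / stdev

# flag=False: the detection value sits inside its own baseline, inflating
# the spread and deflating its score -> looks normal -> test PASSes.
score_included = anomaly_score(training_null_counts + [detection_null_count], detection_null_count)

# flag=True: baseline is training data only; the 10 nulls stand out -> FAIL.
score_excluded = anomaly_score(training_null_counts, detection_null_count)

assert score_excluded > score_included
```

Excluding the detection period always yields an equal-or-higher score for a genuine spike, which is exactly why the flag makes detection more sensitive to recent changes.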
… more substantial dataset
- Use 30 days of normal data with low null count (0-2 nulls/day) instead of 1 day
- Use 7 days of anomalous data with high null count (20 nulls/day) instead of 1 day
- Update training period to 30 days and detection period to 7 days
- Add more data per day to create a clearer anomaly signal
- Use separate test IDs for the two test runs to avoid conflicts
- Pattern matches successful volume and freshness anomalies tests

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>
…data
- Rename test function to fix Postgres 63-char table name limit
- Update test data: 8-12 nulls/day normal, 20 nulls/day anomalous
- Increase sensitivity to 5 to match volume anomalies pattern
- Shorten test ID suffixes to _f/_t
- Test now passes locally with postgres

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>
Add exclude_detection_period_from_training flag to column anomalies tests
Summary
This PR adds the `exclude_detection_period_from_training` parameter to column-level anomaly detection tests (`column_anomalies` and `all_columns_anomalies`), bringing them to feature parity with table anomaly tests. The flag controls whether detection period data is included in the training baseline when calculating anomaly scores.

Changes:
- Added the `exclude_detection_period_from_training=false` parameter to the `test_column_anomalies.sql` macro
- Added the `exclude_detection_period_from_training=false` parameter to the `test_all_columns_anomalies.sql` macro
- Added an integration test, `test_column_anomalies_exclude_detection_period_from_training`, to verify flag behavior

Behavior:
- `False` (default): Detection period data is included in the training baseline, maintaining backward compatibility
- `True`: Detection period data is excluded from the training baseline, making anomaly detection more sensitive to recent changes

Review & Testing Checklist for Human
- The test assumes that with `exclude_detection_period_from_training=False`, including detection data in training will prevent anomaly detection. This assumption should be validated by examining the actual anomaly detection SQL logic in `get_anomaly_scores_query` and related macros.
- The test uses a minimal configuration (`min_training_set_size=1`) that may behave differently across databases. Verify the test passes consistently on Postgres, Snowflake, BigQuery, Redshift, Databricks, etc.
- Confirm that existing tests are unaffected by the default `exclude_detection_period_from_training=false` value.
- Run a test with both `exclude_detection_period_from_training=true` and `false` to verify the behavior matches expectations in a real-world scenario.

Notes
- A typo (`%>` instead of `%}`) was fixed in commit 8c2588a
- Both test runs use `force_metrics_backfill=True` to avoid incremental metric caching issues

Requested by: Yosef Arbiv (yosef@elementary-data.com), @arbiv
Devin session: https://app.devin.ai/sessions/9129463181774101aa708c1e876174e6
Summary by CodeRabbit
Enhancements
Tests