
Conversation


@devin-ai-integration devin-ai-integration bot commented Nov 10, 2025

Add exclude_detection_period_from_training flag to freshness anomaly tests

Summary

Added the exclude_detection_period_from_training parameter to test_freshness_anomalies and test_event_freshness_anomalies macros, matching the functionality already available in volume anomaly tests. This flag allows users to exclude metrics from the detection period when calculating the training baseline, preventing contamination from potentially anomalous recent data.
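For intuition, here is a minimal Python sketch of how the flag shifts the training window relative to the detection window. The window arithmetic below only illustrates the behavior described in this summary; it is not the actual logic inside elementary.test_table_anomalies.

from datetime import datetime, timedelta
from typing import Tuple

def training_window(detection_end: datetime,
                    training_days: int,
                    detection_days: int,
                    exclude_detection_period_from_training: bool) -> Tuple[datetime, datetime]:
    # Illustration only: with the flag on, training ends where detection starts,
    # so recent (possibly anomalous) buckets cannot influence the baseline.
    if exclude_detection_period_from_training:
        training_end = detection_end - timedelta(days=detection_days)
    else:
        training_end = detection_end
    return training_end - timedelta(days=training_days), training_end

# Example: 30-day training, 7-day detection ending on the run date.
start, end = training_window(datetime(2025, 11, 10), 30, 7, True)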

Changes:

  • Added exclude_detection_period_from_training=false parameter to both freshness test macros
  • Passed the parameter through to the underlying test_table_anomalies macro
  • Created integration test test_exclude_detection_from_training that validates the flag works correctly

Integration test approach (sketched in code after this list):

  • Generates 30 days of normal data (updates every 2 hours) + 7 days of anomalous data (1 update per day at noon)
  • Uses 30-day training period overlapping with 7-day detection period
  • Without flag: anomaly included in training → test passes (anomaly missed)
  • With flag: anomaly excluded from training → test fails (anomaly detected)
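A self-contained sketch of the data shape this approach produces is below. The column name and the helper are placeholders here; the actual test builds its rows with the repo's generate_dates and DbtProject helpers, whose exact signatures are not shown in this summary.

from datetime import datetime, timedelta, timezone

utc_now = datetime.now(timezone.utc).replace(hour=0, minute=0, second=0, microsecond=0)

def points(end, days, step_hours):
    # Timestamps walking back `days` days from `end`, one every `step_hours` hours.
    out, t, start = [], end, end - timedelta(days=days)
    while t > start:
        out.append(t)
        t -= timedelta(hours=step_hours)
    return out

# "updated_at" is an illustrative column name, not necessarily the one used in the test.
normal_rows = [{"updated_at": t.isoformat()} for t in points(utc_now - timedelta(days=7), 30, 2)]
anomalous_rows = [{"updated_at": t.isoformat()} for t in points(utc_now.replace(hour=12), 7, 24)]

# The dbt test is then run twice on normal_rows + anomalous_rows:
#   exclude_detection_period_from_training=false -> anomaly absorbed into training -> expected pass
#   exclude_detection_period_from_training=true  -> anomaly excluded from training -> expected fail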

Review & Testing Checklist for Human

  • Verify test scenario validity: The test assumes that with training_period=30 days and detection_period=7 days, the detection data is included in training by default when the flag is off. Confirm this matches the actual behavior of freshness anomaly detection (I based this on the volume anomalies test pattern, but freshness metrics may behave differently).

  • Test with real data: Run a freshness anomaly test on actual data with and without the flag to verify it changes behavior as expected. The integration test uses synthetic data with a specific pattern that may not represent all real-world scenarios.

  • Review test parameters: The test uses sensitivity=3, backfill_days=0, and specific time windows. Consider if these are appropriate defaults or if the test should validate other parameter combinations.

Notes

Summary by CodeRabbit

  • New Features

    • Added an optional exclude_detection_period_from_training parameter (default: false) to freshness and event-freshness anomaly tests to allow excluding the detection period from training data.
  • Tests

    • Added integration test coverage validating behavior with and without the exclusion across representative detection scenarios; the new test is skipped for ClickHouse targets.

…maly tests

- Added exclude_detection_period_from_training parameter to test_freshness_anomalies
- Added exclude_detection_period_from_training parameter to test_event_freshness_anomalies
- Both parameters default to false for backward compatibility
- This allows users to exclude the detection period from training data when detecting freshness anomalies

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

linear bot commented Nov 10, 2025

@devin-ai-integration
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring


coderabbitai bot commented Nov 10, 2025

Walkthrough

The two EDR freshness test macros gain an optional parameter, exclude_detection_period_from_training (default false), which is forwarded to elementary.test_table_anomalies. An integration test, test_exclude_detection_from_training, was added to exercise the flag (skipped for ClickHouse).

Changes

EDR Freshness Test Macros (macros/edr/tests/test_event_freshness_anomalies.sql, macros/edr/tests/test_freshness_anomalies.sql)
  • Added the optional parameter exclude_detection_period_from_training (default false) to both macro signatures and passed it through to elementary.test_table_anomalies as exclude_detection_period_from_training=exclude_detection_period_from_training.
Integration Tests (integration_tests/tests/test_freshness_anomalies.py)
  • Added test_exclude_detection_from_training, which generates normal and anomalous data, runs the freshness test twice (flag disabled and enabled), and asserts the expected pass/fail outcomes; the test is decorated to skip ClickHouse targets.

Sequence Diagram(s)

sequenceDiagram
    participant Tester
    participant PyTest
    participant EDR_Macro as "EDR Macro\n(freshness/event_freshness_anomalies)"
    participant Elementary as "elementary.test_table_anomalies"
    participant Trainer as "Training Logic"

    Tester->>PyTest: run test_exclude_detection_from_training
    PyTest->>EDR_Macro: invoke macro(..., exclude_detection_period_from_training=false)
    EDR_Macro->>Elementary: test_table_anomalies(..., exclude_detection_period_from_training=false)
    Elementary->>Trainer: compute training_period (detection included)
    Trainer-->>Elementary: return model/thresholds
    Elementary-->>EDR_Macro: return test result (expected pass)
    EDR_Macro-->>PyTest: assert pass

    PyTest->>EDR_Macro: invoke macro(..., exclude_detection_period_from_training=true)
    EDR_Macro->>Elementary: test_table_anomalies(..., exclude_detection_period_from_training=true)
    Elementary->>Trainer: compute training_period (detection excluded)
    Trainer-->>Elementary: return model/thresholds
    Elementary-->>EDR_Macro: return test result (expected fail)
    EDR_Macro-->>PyTest: assert fail
    PyTest-->>Tester: report outcomes

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Review focus:
    • integration_tests/tests/test_freshness_anomalies.py — verify date generation, parameters, skip decorator, and assertions.
    • macros/edr/tests/test_event_freshness_anomalies.sql and macros/edr/tests/test_freshness_anomalies.sql — confirm new macro parameter, default value, and correct propagation into elementary.test_table_anomalies.

Possibly related PRs

Suggested reviewers

  • GuyEshdat

Poem

🐰 I tucked a tiny flag inside,
Two runs show whether anomalies hide.
One trains with all the days in sight,
The other keeps the detection light.
Hop, test, and nibble — the checks delight.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00%, which is insufficient; the required threshold is 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (4 passed)
  • Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title Check | ✅ Passed | The title accurately summarizes the main change: adding an exclude_detection_period_from_training flag to freshness anomaly test macros, which aligns with the changeset modifications.
  • Linked Issues Check | ✅ Passed | The PR successfully implements all coding objectives from CORE-81: adds the exclude_detection_period_from_training parameter to both freshness anomaly macros (test_freshness_anomalies, test_event_freshness_anomalies), threads the parameter through to test_table_anomalies, provides a default value of false for backward compatibility, and includes integration tests validating the flag's behavior.
  • Out of Scope Changes Check | ✅ Passed | All changes are directly related to the stated objective of adding the exclude_detection_period_from_training flag to freshness anomaly tests; no unrelated modifications to other components or features were introduced.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 294b6ca and 0473f90.

📒 Files selected for processing (1)
  • integration_tests/tests/test_freshness_anomalies.py (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: arbiv
Repo: elementary-data/dbt-data-reliability PR: 877
File: macros/edr/data_monitoring/anomaly_detection/get_anomaly_scores_query.sql:34-38
Timestamp: 2025-10-23T11:01:31.528Z
Learning: In the dbt-data-reliability project, `backfill_days` is a mandatory parameter for anomaly detection tests (volume_anomalies, freshness_anomalies, etc.). It is validated by the `validate_mandatory_configuration` macro in `get_anomalies_test_configuration.sql` and will cause a compiler error if not provided. Therefore, it's always guaranteed to be set when the anomaly detection logic runs.
📚 Learning: 2025-10-23T11:01:31.528Z
Learnt from: arbiv
Repo: elementary-data/dbt-data-reliability PR: 877
File: macros/edr/data_monitoring/anomaly_detection/get_anomaly_scores_query.sql:34-38
Timestamp: 2025-10-23T11:01:31.528Z
Learning: In the dbt-data-reliability project, `backfill_days` is a mandatory parameter for anomaly detection tests (volume_anomalies, freshness_anomalies, etc.). It is validated by the `validate_mandatory_configuration` macro in `get_anomalies_test_configuration.sql` and will cause a compiler error if not provided. Therefore, it's always guaranteed to be set when the anomaly detection logic runs.

Applied to files:

  • integration_tests/tests/test_freshness_anomalies.py
🧬 Code graph analysis (1)
integration_tests/tests/test_freshness_anomalies.py (3)
integration_tests/tests/conftest.py (2)
  • test_id (180-183)
  • dbt_project (144-147)
integration_tests/tests/dbt_project.py (1)
  • DbtProject (46-334)
integration_tests/tests/data_generator.py (1)
  • generate_dates (6-17)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: test (fusion, snowflake) / test
  • GitHub Check: test (latest_official, snowflake) / test
  • GitHub Check: test (latest_official, dremio) / test
  • GitHub Check: test (fusion, redshift) / test
  • GitHub Check: test (latest_official, trino) / test
  • GitHub Check: test (latest_official, athena) / test
  • GitHub Check: test (latest_pre, postgres) / test
  • GitHub Check: test (latest_official, postgres) / test
  • GitHub Check: test (latest_official, bigquery) / test
  • GitHub Check: test (fusion, databricks_catalog) / test
  • GitHub Check: test (latest_official, databricks_catalog) / test
  • GitHub Check: test (latest_official, redshift) / test
  • GitHub Check: test (latest_official, clickhouse) / test
  • GitHub Check: test (fusion, bigquery) / test
🔇 Additional comments (5)
integration_tests/tests/test_freshness_anomalies.py (5)

238-248: LGTM! Test setup is clear and consistent.

The test function follows the same pattern as other tests in the file with the ClickHouse skip decorator. The docstring clearly describes the test scenario, and normalizing utc_now to midnight ensures consistent date boundaries for the test windows.


249-257: Normal data generation is correct.

The generate_dates helper generates dates backward from base_date (day -8) for 7 days with 2-hour steps, producing ~84 data points covering days -14 to -8 chronologically. This creates the "normal" baseline with frequent updates.
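For readers unfamiliar with the helper, a plausible reconstruction of generate_dates is shown below, based only on how the reviews describe it (a backward walk from base_date over days_back days); the real implementation in data_generator.py may differ in details such as ordering or boundary handling.

from datetime import datetime, timedelta
from typing import List

def generate_dates(base_date: datetime, step: timedelta, days_back: int) -> List[datetime]:
    # Assumed behavior: walk backwards from base_date in `step` increments
    # until `days_back` days have been covered.
    min_date = base_date - timedelta(days=days_back)
    dates, current = [], base_date
    while current > min_date:
        dates.append(current)
        current -= step
    return dates

# With base_date at day -8, step=2h, days_back=7 this yields ~84 points,
# which is the "normal" baseline described above.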


259-267: Anomalous data generation creates the correct pattern.

Generating 7 days with only 1 update per day (at noon) from day -1 backward creates the anomalous pattern that should be detected when excluded from training. The data ranges (days -7 to -1 for anomalous, days -14 to -8 for normal) don't overlap, ensuring clean separation between the two patterns.


271-301: Test configuration and first assertion are correct.

The test parameters are well-chosen for this scenario:

  • 14-day training + 7-day detection periods properly exercise the exclusion flag
  • anomaly_direction: "spike" correctly detects increases in time-between-updates (worse freshness)
  • ignore_small_changes thresholds at 0 ensure the anomaly is detected when appropriate
  • The assertion correctly expects "pass" when the detection window is included in training (anomaly becomes part of the baseline)

303-320: Second test correctly validates the exclusion flag behavior.

By using the same data and only toggling exclude_detection_period_from_training, the test cleanly demonstrates the flag's effect:

  • With exclusion: training uses only normal data (days -8 to -14, frequent updates), so the anomalous pattern in the detection window (days -7 to -1, infrequent updates) is correctly detected as anomalous → test fails ✓

The test design is excellent—it validates that the flag works as intended while maintaining simplicity and clarity.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Contributor

👋 @devin-ai-integration[bot]
Thank you for raising your pull request.
Please make sure to add tests and document all user-facing changes.
You can do this by editing the docs files in the elementary repository.

…freshness anomalies

- Test validates that without the flag, anomalies in detection period are included in training (test passes, anomaly missed)
- Test validates that with the flag, anomalies in detection period are excluded from training (test fails, anomaly detected)
- Follows same pattern as test_volume_anomalies.py::test_exclude_detection_from_training

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between f4b672b and 94002a8.

📒 Files selected for processing (1)
  • integration_tests/tests/test_freshness_anomalies.py (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: arbiv
Repo: elementary-data/dbt-data-reliability PR: 877
File: macros/edr/data_monitoring/anomaly_detection/get_anomaly_scores_query.sql:34-38
Timestamp: 2025-10-23T11:01:31.528Z
Learning: In the dbt-data-reliability project, `backfill_days` is a mandatory parameter for anomaly detection tests (volume_anomalies, freshness_anomalies, etc.). It is validated by the `validate_mandatory_configuration` macro in `get_anomalies_test_configuration.sql` and will cause a compiler error if not provided. Therefore, it's always guaranteed to be set when the anomaly detection logic runs.
📚 Learning: 2025-10-23T11:01:31.528Z
Learnt from: arbiv
Repo: elementary-data/dbt-data-reliability PR: 877
File: macros/edr/data_monitoring/anomaly_detection/get_anomaly_scores_query.sql:34-38
Timestamp: 2025-10-23T11:01:31.528Z
Learning: In the dbt-data-reliability project, `backfill_days` is a mandatory parameter for anomaly detection tests (volume_anomalies, freshness_anomalies, etc.). It is validated by the `validate_mandatory_configuration` macro in `get_anomalies_test_configuration.sql` and will cause a compiler error if not provided. Therefore, it's always guaranteed to be set when the anomaly detection logic runs.

Applied to files:

  • integration_tests/tests/test_freshness_anomalies.py
🧬 Code graph analysis (1)
integration_tests/tests/test_freshness_anomalies.py (3)
integration_tests/tests/test_volume_anomalies.py (1)
  • test_exclude_detection_from_training (545-621)
integration_tests/tests/dbt_project.py (1)
  • DbtProject (46-334)
integration_tests/tests/data_generator.py (1)
  • generate_dates (6-17)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: test (latest_official, clickhouse) / test
  • GitHub Check: test (latest_official, databricks_catalog) / test
  • GitHub Check: test (fusion, bigquery) / test
  • GitHub Check: test (latest_pre, postgres) / test
  • GitHub Check: test (fusion, databricks_catalog) / test
  • GitHub Check: test (latest_official, bigquery) / test
  • GitHub Check: test (fusion, redshift) / test
  • GitHub Check: test (fusion, snowflake) / test
  • GitHub Check: test (latest_official, redshift) / test
  • GitHub Check: test (latest_official, trino) / test
  • GitHub Check: test (latest_official, postgres) / test
  • GitHub Check: test (latest_official, snowflake) / test
  • GitHub Check: test (latest_official, dremio) / test
  • GitHub Check: test (latest_official, athena) / test
🔇 Additional comments (1)
integration_tests/tests/test_freshness_anomalies.py (1)

238-311: Verify the test behaves as intended after fixing the data ranges.

Once the data generation is corrected, confirm:

  1. The anomalous update frequency (8-hour intervals vs. 2-hour baseline) is detected when exclusion is enabled
  2. The sensitivity: 5 value is appropriate for this scenario
  3. The test produces the expected pass/fail outcomes

After applying the data-range fix, run this test locally or in CI to ensure it passes and correctly exercises the exclude_detection_period_from_training flag.

- Reduced training period from 30 to 14 days to amplify effect
- Changed anomalous data from every 8 hours to every 24 hours (only 1 update per day at midnight)
- This creates freshness of ~24 hours vs ~2 hours in normal data
- Reduced sensitivity from 5 to 3 (less strict)
- Aligned time to midnight and added custom_run_started_at for consistent bucket boundaries across warehouses
- This should make the anomaly detectable when excluded from training

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
integration_tests/tests/test_freshness_anomalies.py (1)

251-269: Critical: Data generation still produces wrong time ranges.

This issue was already flagged in a previous review but hasn't been fixed. The generate_dates function generates dates backward from base_date for days_back days. With the current parameters:

  • Normal data (lines 252-259): generates from day -21 back to day -35 (utc_now - 21 days minus 14 days)
  • Anomalous data (lines 262-269): generates from day -7 back to day -14 (utc_now - 7 days minus 7 days)

These ranges don't align with the test's intent. For a 14-day training period and 7-day detection period, you need:

  • Detection period (most recent 7 days): days 0 to -7
  • Training period (14 days before detection): days -7 to -21

Apply this diff to fix the date ranges:

-    # Generate 14 days of normal data with frequent updates (every 2 hours)
+    # Generate 14 days of normal data (training period: days -21 to -7)
     normal_data = [
         {TIMESTAMP_COLUMN: date.strftime(DATE_FORMAT)}
         for date in generate_dates(
-            base_date=utc_now - timedelta(days=21),
+            base_date=utc_now - timedelta(days=7),
             step=timedelta(hours=2),
             days_back=14,
         )
     ]
 
-    # Generate 7 days of anomalous data (only 1 update per day at midnight)
+    # Generate 7 days of anomalous data (detection period: days 0 to -7)
     anomalous_data = [
         {TIMESTAMP_COLUMN: date.strftime(DATE_FORMAT)}
         for date in generate_dates(
-            base_date=utc_now - timedelta(days=7),
+            base_date=utc_now,
             step=timedelta(hours=24),
             days_back=7,
         )
     ]
🧹 Nitpick comments (1)
integration_tests/tests/test_freshness_anomalies.py (1)

249-249: Use datetime.now(timezone.utc) instead of deprecated utcnow().

The datetime.utcnow() method is deprecated in Python 3.12+ and will be removed in future versions.

Apply this diff:

-    utc_now = datetime.utcnow().replace(hour=0, minute=0, second=0, microsecond=0)
+    from datetime import timezone
+    utc_now = datetime.now(timezone.utc).replace(hour=0, minute=0, second=0, microsecond=0)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 94002a8 and 7576f81.

📒 Files selected for processing (1)
  • integration_tests/tests/test_freshness_anomalies.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-23T11:01:31.528Z
Learnt from: arbiv
Repo: elementary-data/dbt-data-reliability PR: 877
File: macros/edr/data_monitoring/anomaly_detection/get_anomaly_scores_query.sql:34-38
Timestamp: 2025-10-23T11:01:31.528Z
Learning: In the dbt-data-reliability project, `backfill_days` is a mandatory parameter for anomaly detection tests (volume_anomalies, freshness_anomalies, etc.). It is validated by the `validate_mandatory_configuration` macro in `get_anomalies_test_configuration.sql` and will cause a compiler error if not provided. Therefore, it's always guaranteed to be set when the anomaly detection logic runs.

Applied to files:

  • integration_tests/tests/test_freshness_anomalies.py
🧬 Code graph analysis (1)
integration_tests/tests/test_freshness_anomalies.py (2)
integration_tests/tests/dbt_project.py (1)
  • DbtProject (46-334)
integration_tests/tests/data_generator.py (1)
  • generate_dates (6-17)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: test (fusion, redshift) / test
  • GitHub Check: test (latest_official, trino) / test
  • GitHub Check: test (latest_official, redshift) / test
  • GitHub Check: test (latest_official, athena) / test
  • GitHub Check: test (fusion, databricks_catalog) / test
  • GitHub Check: test (fusion, snowflake) / test
  • GitHub Check: test (latest_official, bigquery) / test
  • GitHub Check: test (latest_official, clickhouse) / test
  • GitHub Check: test (latest_pre, postgres) / test
  • GitHub Check: test (fusion, bigquery) / test
  • GitHub Check: test (latest_official, snowflake) / test
  • GitHub Check: test (latest_official, dremio) / test
  • GitHub Check: test (latest_official, databricks_catalog) / test
  • GitHub Check: test (latest_official, postgres) / test
🔇 Additional comments (2)
integration_tests/tests/test_freshness_anomalies.py (2)

238-248: LGTM: Clear test structure and documentation.

The test function is properly decorated to skip ClickHouse targets, and the docstring clearly explains the test scenario and expected outcomes.


273-313: Parameter verification successful — test configuration is correct.

The training_period and detection_period parameters are confirmed as valid parameters in the freshness_anomalies test macro signature. The test correctly uses these parameters and the test logic is sound:

  • Without exclusion: Anomalous pattern (1 update/day) included in training becomes baseline normal → test passes ✓
  • With exclusion: Anomalous pattern excluded from training, baseline expects frequent updates (12/day) → test fails when observing 1 update/day ✓

The test properly isolates scenarios using distinct test IDs and assertions.

The integration test harness wasn't passing exclude_detection_period_from_training
through to the test YAML for freshness tests. This fix passes the parameter via
test_vars instead, which works with the config precedence system (test > model > vars).

This ensures the flag is properly applied during the 'with_exclusion' test run.

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 7576f81 and d5e8482.

📒 Files selected for processing (1)
  • integration_tests/tests/test_freshness_anomalies.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-23T11:01:31.528Z
Learnt from: arbiv
Repo: elementary-data/dbt-data-reliability PR: 877
File: macros/edr/data_monitoring/anomaly_detection/get_anomaly_scores_query.sql:34-38
Timestamp: 2025-10-23T11:01:31.528Z
Learning: In the dbt-data-reliability project, `backfill_days` is a mandatory parameter for anomaly detection tests (volume_anomalies, freshness_anomalies, etc.). It is validated by the `validate_mandatory_configuration` macro in `get_anomalies_test_configuration.sql` and will cause a compiler error if not provided. Therefore, it's always guaranteed to be set when the anomaly detection logic runs.

Applied to files:

  • integration_tests/tests/test_freshness_anomalies.py
🧬 Code graph analysis (1)
integration_tests/tests/test_freshness_anomalies.py (3)
integration_tests/tests/test_volume_anomalies.py (1)
  • test_exclude_detection_from_training (545-621)
integration_tests/tests/dbt_project.py (1)
  • DbtProject (46-334)
integration_tests/tests/data_generator.py (1)
  • generate_dates (6-17)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: test (fusion, redshift) / test
  • GitHub Check: test (latest_official, clickhouse) / test
  • GitHub Check: test (fusion, snowflake) / test
  • GitHub Check: test (fusion, bigquery) / test
  • GitHub Check: test (latest_official, redshift) / test
  • GitHub Check: test (latest_official, postgres) / test
  • GitHub Check: test (fusion, databricks_catalog) / test
  • GitHub Check: test (latest_official, dremio) / test
  • GitHub Check: test (latest_official, databricks_catalog) / test
  • GitHub Check: test (latest_official, snowflake) / test
  • GitHub Check: test (latest_official, trino) / test
  • GitHub Check: test (latest_pre, postgres) / test
  • GitHub Check: test (latest_official, bigquery) / test
  • GitHub Check: test (latest_official, athena) / test
🔇 Additional comments (1)
integration_tests/tests/test_freshness_anomalies.py (1)

238-311: Good test structure, but verify after fixing date ranges.

The test logic and structure are sound:

  • Clear two-scenario approach (with/without exclusion)
  • Proper use of test_vars to toggle the flag
  • Descriptive assertions with failure messages
  • Appropriate difference in update frequencies (2h vs 24h) to create detectable anomaly

Once the date range issue is fixed, ensure the test produces the expected behavior (pass without exclusion, fail with exclusion).

devin-ai-integration bot and others added 4 commits November 10, 2025 14:32
Fixed the integration test to pass exclude_detection_period_from_training via
test_args (matching the volume anomalies pattern) instead of test_vars.

The parameter needs to be in test_args so it gets included in the test YAML
and can be retrieved by get_test_argument() in the macro.

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>
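To make the test_args/test_vars distinction in the commits above concrete, here is a hedged sketch. Only the split between the two dictionaries is taken from the commit messages; the DbtProject.test signature and the YAML rendering details are assumptions.

# Arguments placed in test_args are rendered into the test's YAML config,
# which is what get_test_argument() reads inside the macro (per the commit above).
test_args = {
    "backfill_days": 0,
    "sensitivity": 3,
    "exclude_detection_period_from_training": True,
}

# Values placed in test_vars become dbt vars for the run; per the earlier commit,
# the flag passed this way did not end up in the rendered test YAML.
test_vars = {
    "custom_run_started_at": "2025-11-10 00:00:00",
}

# Hypothetical invocation shape (actual helper signature may differ):
# dbt_project.test(test_id, "freshness_anomalies", test_args, data=rows, test_vars=test_vars)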
- Lower sensitivity from 3 to 2 to ensure detection
- Add min_training_set_size: 5 to ensure enough training data after exclusion
- Add anomaly_direction: spike to explicitly detect higher freshness as anomalous
- Remove custom_run_started_at to avoid cross-adapter boundary issues

Investigation confirmed that exclude_detection_period_from_training parameter
is reaching the macros correctly (verified in CI logs). The issue was that
the test scenario needed proper configuration to detect the anomaly.

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>
- Add days_back: 30 to cover entire training + detection span
- Add ignore_small_changes thresholds set to 0 to remove percent-based gating
- Re-add custom_run_started_at with proper alignment to midnight (detection_end)
- Lower sensitivity to 1 for with_exclusion test to force detection and validate wiring

These changes address potential issues with:
- Missing training buckets due to insufficient days_back
- Percent-based thresholds preventing anomaly detection
- Cross-adapter time boundary alignment issues
- Sensitivity threshold preventing detection of the anomaly

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>
Key fixes based on investigation:
- Add backfill_days: 7 to align exclusion window with detection period (critical for exclude logic)
- Weaken anomaly from 24h to 12h freshness (noon updates instead of midnight)
- Keep same sensitivity (2) for both runs to satisfy 'same values' requirement
- Remove different sensitivity for with_exclusion run

This should result in (see the numeric sketch after this commit message):
- Without exclusion: PASS (anomaly diluted into training; z ≈ 1.4 < 2)
- With exclusion: FAIL (anomaly stands out; z >> 2)

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>
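A quick, self-contained numeric illustration of the dilution argument above, using toy values (14 training buckets at roughly 2 hours between updates, 7 detection buckets at roughly 12 hours); the real test measures freshness via the elementary metrics, not these literals.

import statistics

normal = [2.0] * 14        # frequent updates -> small freshness values
anomalous = [12.0] * 7     # one update per day at noon -> ~12h freshness

# Without exclusion the detection buckets leak into training and inflate the spread.
diluted = normal + anomalous
z_diluted = (12.0 - statistics.mean(diluted)) / statistics.pstdev(diluted)

# With exclusion the baseline is built from the normal buckets only.
clean_std = statistics.pstdev(normal) or 1e-9   # zero spread in this toy example
z_clean = (12.0 - statistics.mean(normal)) / clean_std

print(round(z_diluted, 2), z_clean)   # ~1.41 (below sensitivity 2) vs a very large z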
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
integration_tests/tests/test_freshness_anomalies.py (1)

252-269: Critical: Date ranges are misaligned with test configuration.

The past review comment correctly identifies that these date ranges produce data in the wrong time windows:

  • Normal data (lines 252-259): generates days -21 to -35
  • Anomalous data (lines 262-269): generates days -7 to -14

With detection_end = utc_now + timedelta(days=1) and the configured training/detection periods, the test will look for data in days +1 to -6 (detection) and days -6 to -20 (training), but the actual data is in days -7 to -35. This misalignment means the detection period contains no data, and the test cannot work as intended.

Apply the fix suggested in the past review:

     # Generate 14 days of normal data with frequent updates (every 2 hours)
     normal_data = [
         {TIMESTAMP_COLUMN: date.strftime(DATE_FORMAT)}
         for date in generate_dates(
-            base_date=utc_now - timedelta(days=21),
+            base_date=utc_now - timedelta(days=7),
             step=timedelta(hours=2),
             days_back=14,
         )
     ]
 
-    # Generate 7 days of anomalous data (only 1 update per day at noon)
+    # Generate 7 days of anomalous data (only 1 update per day)
     anomalous_data = [
         {TIMESTAMP_COLUMN: date.strftime(DATE_FORMAT)}
         for date in generate_dates(
-            base_date=(utc_now - timedelta(days=7)).replace(hour=12, minute=0),
+            base_date=utc_now.replace(hour=12, minute=0),
             step=timedelta(hours=24),
             days_back=7,
         )
     ]

This positions:

  • Normal data (training): days -7 to -21
  • Anomalous data (detection): days 0 to -7
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 14e1a19 and 2558e32.

📒 Files selected for processing (1)
  • integration_tests/tests/test_freshness_anomalies.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-23T11:01:31.528Z
Learnt from: arbiv
Repo: elementary-data/dbt-data-reliability PR: 877
File: macros/edr/data_monitoring/anomaly_detection/get_anomaly_scores_query.sql:34-38
Timestamp: 2025-10-23T11:01:31.528Z
Learning: In the dbt-data-reliability project, `backfill_days` is a mandatory parameter for anomaly detection tests (volume_anomalies, freshness_anomalies, etc.). It is validated by the `validate_mandatory_configuration` macro in `get_anomalies_test_configuration.sql` and will cause a compiler error if not provided. Therefore, it's always guaranteed to be set when the anomaly detection logic runs.

Applied to files:

  • integration_tests/tests/test_freshness_anomalies.py
🧬 Code graph analysis (1)
integration_tests/tests/test_freshness_anomalies.py (2)
integration_tests/tests/dbt_project.py (1)
  • DbtProject (46-334)
integration_tests/tests/data_generator.py (1)
  • generate_dates (6-17)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: test (latest_official, athena) / test
  • GitHub Check: test (latest_official, clickhouse) / test
  • GitHub Check: test (fusion, bigquery) / test
  • GitHub Check: test (latest_official, dremio) / test
  • GitHub Check: test (fusion, databricks_catalog) / test
  • GitHub Check: test (latest_official, postgres) / test
  • GitHub Check: test (fusion, snowflake) / test
  • GitHub Check: test (latest_official, databricks_catalog) / test
  • GitHub Check: test (fusion, redshift) / test
  • GitHub Check: test (latest_official, bigquery) / test
  • GitHub Check: test (latest_pre, postgres) / test
  • GitHub Check: test (latest_official, trino) / test
  • GitHub Check: test (latest_official, snowflake) / test
  • GitHub Check: test (latest_official, redshift) / test
🔇 Additional comments (2)
integration_tests/tests/test_freshness_anomalies.py (2)

238-248: Test structure and documentation look good.

The test function is properly decorated and documented. The scenario description clearly explains the intended behavior: anomalous data should be missed when included in training (pass) but detected when excluded from training (fail).


292-322: Test invocations and assertions are well-structured.

Both test invocations properly:

  • Use unique test IDs to avoid collisions
  • Pass test configuration and data correctly
  • Set custom_run_started_at via test_vars
  • Assert expected outcomes with explanatory messages

The logic is sound: without exclusion, the anomaly is included in training and should be normalized (pass); with exclusion, only clean training data is used, so the anomaly should be detected (fail).

After applying the date range fixes, please verify:

  1. With exclude_detection_period_from_training=False, the test passes (anomaly missed because included in baseline)
  2. With exclude_detection_period_from_training=True, the test fails (anomaly detected against clean baseline)
  3. The sensitivity setting (2) appropriately distinguishes between the two scenarios

The test's success depends on the anomalous data being significant enough to dilute the baseline when included, but not so strong that it's detected even when diluted. If the test is flaky, consider adjusting sensitivity or the contrast between normal (2-hour intervals) and anomalous (24-hour intervals) data.

devin-ai-integration bot and others added 2 commits November 10, 2025 15:48
- Changed training period from 14 to 30 days to overlap with 7-day detection period
- Changed normal data from 14 to 30 days (days 37-7 ago)
- Changed days_back from 30 to 40 to cover entire data range
- Changed sensitivity from 2 to 3 to allow anomaly absorption when included in training
- Set backfill_days to 0 to avoid independent exclusion of recent data
- This matches the pattern used in test_volume_anomalies.py::test_exclude_detection_from_training

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>
- Change detection_end from utc_now + 1 day to utc_now
- Adjust normal data generation from days -67 to -37 → days -37 to -7
- Adjust anomalous data generation from days -14 to -7 → days -7 to -1
- Update test documentation to reflect the new timeline

This makes the test more intuitive by ensuring anomalous events fall
directly within the detection period (days -7 to -1) rather than
relying on the propagation effect of freshness metrics.

Detection period now clearly covers days -7 to -1, and anomalous
events occur at noon on those same days, making the test behavior
more transparent and maintainable.

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
integration_tests/tests/test_freshness_anomalies.py (1)

238-324: Well-structured test for validating the exclusion flag.

The test correctly validates the exclude_detection_period_from_training flag by demonstrating that:

  • Without exclusion: anomalous data becomes part of the training baseline → test passes (anomaly missed)
  • With exclusion: anomalous data excluded from training → test fails (anomaly detected)

The data generation creates a clear 12× difference in update frequency (2-hour vs 24-hour intervals), which should be reliably detectable.

Minor suggestion: Consider clarifying the comments (lines 244-249) to be more precise:

  • Line 245: Note that anomalous data points occur at noon (12:00) each day
  • Line 246: Detection period is days -7 to 0 (though data exists only to day -1 at 12:00)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between eb7d179 and 38b64db.

📒 Files selected for processing (1)
  • integration_tests/tests/test_freshness_anomalies.py (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: arbiv
Repo: elementary-data/dbt-data-reliability PR: 877
File: macros/edr/data_monitoring/anomaly_detection/get_anomaly_scores_query.sql:34-38
Timestamp: 2025-10-23T11:01:31.528Z
Learning: In the dbt-data-reliability project, `backfill_days` is a mandatory parameter for anomaly detection tests (volume_anomalies, freshness_anomalies, etc.). It is validated by the `validate_mandatory_configuration` macro in `get_anomalies_test_configuration.sql` and will cause a compiler error if not provided. Therefore, it's always guaranteed to be set when the anomaly detection logic runs.
📚 Learning: 2025-10-23T11:01:31.528Z
Learnt from: arbiv
Repo: elementary-data/dbt-data-reliability PR: 877
File: macros/edr/data_monitoring/anomaly_detection/get_anomaly_scores_query.sql:34-38
Timestamp: 2025-10-23T11:01:31.528Z
Learning: In the dbt-data-reliability project, `backfill_days` is a mandatory parameter for anomaly detection tests (volume_anomalies, freshness_anomalies, etc.). It is validated by the `validate_mandatory_configuration` macro in `get_anomalies_test_configuration.sql` and will cause a compiler error if not provided. Therefore, it's always guaranteed to be set when the anomaly detection logic runs.

Applied to files:

  • integration_tests/tests/test_freshness_anomalies.py
🧬 Code graph analysis (1)
integration_tests/tests/test_freshness_anomalies.py (2)
integration_tests/tests/dbt_project.py (1)
  • DbtProject (46-334)
integration_tests/tests/data_generator.py (1)
  • generate_dates (6-17)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
  • GitHub Check: test (latest_official, snowflake) / test
  • GitHub Check: test (fusion, databricks_catalog) / test
  • GitHub Check: test (latest_official, postgres) / test
  • GitHub Check: test (latest_official, bigquery) / test
  • GitHub Check: test (fusion, snowflake) / test
  • GitHub Check: test (latest_official, athena) / test
  • GitHub Check: test (latest_pre, postgres) / test
  • GitHub Check: test (latest_official, databricks_catalog) / test
  • GitHub Check: test (latest_official, clickhouse) / test
  • GitHub Check: test (fusion, bigquery) / test
  • GitHub Check: test (fusion, redshift) / test
  • GitHub Check: test (latest_official, dremio) / test
  • GitHub Check: test (latest_official, redshift) / test
  • GitHub Check: test (latest_official, trino) / test
  • GitHub Check: test (latest_official, databricks_catalog) / check-if-requires-approval

The test was failing because with a 30-day training period, the first
detection bucket was evaluated against a baseline dominated by normal
data, causing it to fail even without exclusion.

Changes:
- Reduce training_period from 30 days to 7 days
- Move normal data to days -14 to -8 (before detection period)
- Keep anomalous data at days -7 to -1 (in detection period)
- Reduce days_back from 40 to 20

This makes the pass/fail behavior hinge purely on the exclude flag:
- Without exclusion: training = detection window (anomalous) → PASS
- With exclusion: training = days -14 to -8 (normal) → FAIL

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
integration_tests/tests/test_freshness_anomalies.py (1)

240-250: Clarify docstring to distinguish time windows from data coverage.

The docstring describes date ranges for data coverage but could be misinterpreted as the exact period boundaries. The detection period technically extends to day 0 (since custom_run_started_at = utc_now), and the training period (when excluded) is days -14 to -7, not days -14 to -8.

Consider updating the docstring to be more precise:

     """
     Test the exclude_detection_period_from_training flag functionality for freshness anomalies.
 
     Scenario:
-    - 7 days of normal data with frequent updates (every 2 hours) from day -14 to day -8
-    - 7 days of anomalous data (only 1 update per day at noon) from day -7 to day -1
-    - Detection period: last 7 days (days -7 to -1)
+    - 7 days of normal data with frequent updates (every 2 hours) covering days -15 to -8
+    - 7 days of anomalous data (only 1 update per day at noon) covering days -7 to -1
+    - Detection period: 7 days ending at day 0 [day -7, day 0)
     - Training period: 7 days
-    - Without exclusion: training = detection window (anomalous pattern) → test passes
-    - With exclusion: training = days -14 to -8 (normal pattern) → test fails (detects anomaly)
+    - Without exclusion: training window = detection window (includes anomalous pattern) → test passes
+    - With exclusion: training window = [day -14, day -7) (uses normal pattern) → test fails (detects anomaly)
     """
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 38b64db and dc423c0.

📒 Files selected for processing (1)
  • integration_tests/tests/test_freshness_anomalies.py (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: arbiv
Repo: elementary-data/dbt-data-reliability PR: 877
File: macros/edr/data_monitoring/anomaly_detection/get_anomaly_scores_query.sql:34-38
Timestamp: 2025-10-23T11:01:31.528Z
Learning: In the dbt-data-reliability project, `backfill_days` is a mandatory parameter for anomaly detection tests (volume_anomalies, freshness_anomalies, etc.). It is validated by the `validate_mandatory_configuration` macro in `get_anomalies_test_configuration.sql` and will cause a compiler error if not provided. Therefore, it's always guaranteed to be set when the anomaly detection logic runs.
📚 Learning: 2025-10-23T11:01:31.528Z
Learnt from: arbiv
Repo: elementary-data/dbt-data-reliability PR: 877
File: macros/edr/data_monitoring/anomaly_detection/get_anomaly_scores_query.sql:34-38
Timestamp: 2025-10-23T11:01:31.528Z
Learning: In the dbt-data-reliability project, `backfill_days` is a mandatory parameter for anomaly detection tests (volume_anomalies, freshness_anomalies, etc.). It is validated by the `validate_mandatory_configuration` macro in `get_anomalies_test_configuration.sql` and will cause a compiler error if not provided. Therefore, it's always guaranteed to be set when the anomaly detection logic runs.

Applied to files:

  • integration_tests/tests/test_freshness_anomalies.py
🧬 Code graph analysis (1)
integration_tests/tests/test_freshness_anomalies.py (3)
integration_tests/tests/conftest.py (2)
  • test_id (180-183)
  • dbt_project (144-147)
integration_tests/tests/dbt_project.py (1)
  • DbtProject (46-334)
integration_tests/tests/data_generator.py (1)
  • generate_dates (6-17)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: test (latest_official, redshift) / test
  • GitHub Check: test (fusion, bigquery) / test
  • GitHub Check: test (latest_official, postgres) / test
  • GitHub Check: test (latest_pre, postgres) / test
  • GitHub Check: test (latest_official, databricks_catalog) / test
  • GitHub Check: test (latest_official, athena) / test
  • GitHub Check: test (latest_official, bigquery) / test
  • GitHub Check: test (fusion, redshift) / test
  • GitHub Check: test (latest_official, trino) / test
  • GitHub Check: test (latest_official, snowflake) / test
  • GitHub Check: test (fusion, snowflake) / test
  • GitHub Check: test (fusion, databricks_catalog) / test
  • GitHub Check: test (latest_official, clickhouse) / test
  • GitHub Check: test (latest_official, dremio) / test
🔇 Additional comments (3)
integration_tests/tests/test_freshness_anomalies.py (3)

253-273: Data generation logic looks correct.

The data ranges are well-designed for testing the exclusion flag:

  • Normal data (days -15 to -8, 12 updates/day) provides a baseline when exclusion is enabled
  • Anomalous data (days -7 to -1, 1 update/day) falls within the detection period
  • The two datasets don't overlap, ensuring clean separation between training and detection windows

276-292: Test configuration parameters are appropriate.

The configuration correctly sets up the scenario:

  • days_back=20 adequately covers the generated data (~15 days)
  • 7-day training and detection periods align with the data structure
  • sensitivity=3 provides a reasonable threshold for detecting the ~12x drop in update frequency
  • anomaly_direction="spike" correctly identifies increases in time between updates (i.e., less frequent updates)

294-324: Test assertions and logic are well-structured.

The test effectively validates the exclude_detection_period_from_training flag through a clear two-phase approach:

  1. Without exclusion (lines 294-305): Training includes the anomalous pattern, so the baseline is "1 update/day" → detection sees "1 update/day" → no anomaly detected (PASS) ✓
  2. With exclusion (lines 307-324): Training uses only normal data, so the baseline is "~12 updates/day" → detection sees "1 update/day" → significant drop detected (FAIL) ✓

The test IDs are properly isolated, and the assertions match the expected behavior described in the PR objectives.

devin-ai-integration bot and others added 2 commits November 12, 2025 13:46
Changes:
- Change detection_end from utc_now + 1 day to utc_now
- Move normal data to days -14 to -8 (7 days with 2-hour updates)
- Move anomalous data to days -7 to -1 (7 days with 1 update/day at noon)
- Change training_period from 30 days to 14 days
- Reduce min_training_set_size from 5 to 3
- Update days_back from 40 to 20

Why This Works:
- training_period = 14 days ensures there are training buckets available when
  exclusion is enabled (days -14 to -8)
- Without exclusion: Training includes both normal and anomalous data, so the
  anomalous pattern becomes part of the baseline → test PASSES
- With exclusion: Training includes only normal data (days -14 to -8), so the
  anomalous pattern in detection (days -7 to -1) stands out → test FAILS

This makes the test more intuitive by aligning anomalous data directly with
the detection period while ensuring both test cases work correctly.

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>
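As a sanity check on the final timeline described in this commit, here is a small sketch that lays out the windows and data spans. Bucket edges and the exact exclusion arithmetic live in the elementary macros, so treat this only as an illustration of the intended layout.

from datetime import datetime, timedelta, timezone

utc_now = datetime.now(timezone.utc).replace(hour=0, minute=0, second=0, microsecond=0)

detection_end = utc_now                                    # custom_run_started_at in the test
detection_start = detection_end - timedelta(days=7)        # day -7
training_days = 14

# Data spans per this commit (the generated normal data may start about a day earlier).
normal_span = (utc_now - timedelta(days=14), utc_now - timedelta(days=8))   # 2-hour updates
anomalous_span = (detection_start, utc_now - timedelta(days=1))             # one noon update per day

# Assumed window arithmetic, mirroring the PR description:
training_end_without_exclusion = detection_end              # covers both spans -> expected PASS
training_end_with_exclusion = detection_start                # covers only normal_span -> expected FAIL
training_start_with_exclusion = training_end_with_exclusion - timedelta(days=training_days)  # day -21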
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
integration_tests/tests/test_freshness_anomalies.py (1)

267-274: Documentation inconsistency: normal data range doesn't match comment.

The comment states normal data spans "from day -14 to day -8" (lines 243-244, 252), but with base_date=utc_now - timedelta(days=8) and days_back=7, the generate_dates function produces data from approximately day -15 to day -8. This is because min_date = day -8 - 7 days = day -15.

While this doesn't break the test logic (the training period with exclusion still captures 7 days of normal data from days -14 to -8), it creates a maintenance hazard.

Apply this diff to align the code with the documentation:

     # Generate 7 days of normal data with frequent updates (every 2 hours) from day -14 to day -8
     normal_data = [
         {TIMESTAMP_COLUMN: date.strftime(DATE_FORMAT)}
         for date in generate_dates(
             base_date=utc_now - timedelta(days=8),
             step=timedelta(hours=2),
-            days_back=7,
+            days_back=6,
         )
     ]

Alternatively, update the comments to reflect the actual range: "from day -15 to day -8 (7 days duration)".

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 5cd80c5 and 294b6ca.

📒 Files selected for processing (1)
  • integration_tests/tests/test_freshness_anomalies.py (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: arbiv
Repo: elementary-data/dbt-data-reliability PR: 877
File: macros/edr/data_monitoring/anomaly_detection/get_anomaly_scores_query.sql:34-38
Timestamp: 2025-10-23T11:01:31.528Z
Learning: In the dbt-data-reliability project, `backfill_days` is a mandatory parameter for anomaly detection tests (volume_anomalies, freshness_anomalies, etc.). It is validated by the `validate_mandatory_configuration` macro in `get_anomalies_test_configuration.sql` and will cause a compiler error if not provided. Therefore, it's always guaranteed to be set when the anomaly detection logic runs.
📚 Learning: 2025-10-23T11:01:31.528Z
Learnt from: arbiv
Repo: elementary-data/dbt-data-reliability PR: 877
File: macros/edr/data_monitoring/anomaly_detection/get_anomaly_scores_query.sql:34-38
Timestamp: 2025-10-23T11:01:31.528Z
Learning: In the dbt-data-reliability project, `backfill_days` is a mandatory parameter for anomaly detection tests (volume_anomalies, freshness_anomalies, etc.). It is validated by the `validate_mandatory_configuration` macro in `get_anomalies_test_configuration.sql` and will cause a compiler error if not provided. Therefore, it's always guaranteed to be set when the anomaly detection logic runs.

Applied to files:

  • integration_tests/tests/test_freshness_anomalies.py
🧬 Code graph analysis (1)
integration_tests/tests/test_freshness_anomalies.py (2)
integration_tests/tests/dbt_project.py (1)
  • DbtProject (46-334)
integration_tests/tests/data_generator.py (1)
  • generate_dates (6-17)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: test (fusion, bigquery) / test
  • GitHub Check: test (fusion, redshift) / test
  • GitHub Check: test (latest_pre, postgres) / test
  • GitHub Check: test (latest_official, snowflake) / test
  • GitHub Check: test (latest_official, databricks_catalog) / test
  • GitHub Check: test (latest_official, bigquery) / test
  • GitHub Check: test (fusion, snowflake) / test
  • GitHub Check: test (latest_official, dremio) / test
  • GitHub Check: test (fusion, databricks_catalog) / test
  • GitHub Check: test (latest_official, postgres) / test
  • GitHub Check: test (latest_official, trino) / test
  • GitHub Check: test (latest_official, athena) / test
  • GitHub Check: test (latest_official, redshift) / test
🔇 Additional comments (1)
integration_tests/tests/test_freshness_anomalies.py (1)

238-338: Well-structured integration test.

The test effectively validates the exclude_detection_period_from_training flag by constructing a scenario where:

  • Training without exclusion normalizes the anomalous pattern (sparse updates) into the baseline → test passes
  • Training with exclusion excludes the anomalous pattern from the baseline → anomaly is detected → test fails

The configuration is sound, assertions are clear, and the test is appropriately skipped on ClickHouse.

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>
@arbiv arbiv merged commit 60e2c7c into master Nov 17, 2025
32 checks passed
@arbiv arbiv deleted the devin/CORE-81-1762781203 branch November 17, 2025 06:35