Fix `exclude_detection_period_from_training` for large time buckets by devin-ai-integration[bot] · Pull Request #925 · elementary-data/dbt-data-reliability

devin-ai-integration · 2026-02-10T09:51:21Z

Fix `exclude_detection_period_from_training` for large time buckets

Summary

exclude_detection_period_from_training was silently ineffective when the time bucket period exceeded backfill_days. For example, with weekly buckets (time_bucket: {period: week, count: 1}) and the default backfill_days: 2, the exclusion window was only 2 days wide. The most recent weekly bucket's bucket_end would fall outside this narrow window, so it was never excluded from training—completely defeating the feature.

The fix bumps backfill_days to max(backfill_days, time_bucket_in_days) at config load time in get_anomalies_test_configuration.sql. This ensures backfill_days is always at least one full time bucket, which fixes all three downstream uses:

Exclusion window (get_anomaly_scores_query.sql): detection period now spans at least one full bucket
is_anomalous check (get_anomaly_query.sql): recent buckets are no longer missed by the backfill window filter
Metric recomputation window (get_buckets_configuration.sql): incremental runs now refresh at least one full bucket of metrics

The change is backwards-compatible: when backfill_days already exceeds the bucket period (the common case with daily buckets), behavior is unchanged. Uses the existing elementary.convert_period() utility. An info-level log message is emitted when the bump occurs.

Changes

get_anomalies_test_configuration.sql: After backfill_days is loaded (line 41), added logic to bump it to bucket_in_days when the time bucket is larger. Emits edr_log("backfill_days increased from X to Y to match time bucket size.") when this happens.
get_anomaly_scores_query.sql: No changes. The existing exclusion logic (detection_end - backfill_days) now works correctly because backfill_days is already adjusted upstream.
Integration tests: Added test_excl_detect_train_monthly (volume) and test_col_excl_detect_train_monthly (column) using monthly buckets with default backfill_days=2 — the exact bug scenario. Confirmed locally: tests PASS with fix, FAIL without fix. Skipped on ClickHouse, Redshift, and Dremio (see below).

Updates since last revision

Fixed month arithmetic in tests: replaced timedelta(days=31) with (current_month_1st - timedelta(days=1)).replace(day=1) to reliably go back one calendar month
Fixed normal_month_start calculation: used .replace(year=year-1) directly on anomaly_month_start instead of an intermediate subtraction that only went back 1 month instead of 12
Increased anomaly magnitude to ~5x normal (100 rows/day vs ~20 for volume; 50 nulls/day vs ~10 for column) to widen the z-score gap between absorbed vs excluded scenarios
Attempted removing Redshift/Dremio from skip list with the wider anomaly — both still fail on the "without exclusion should PASS" assertion. The issue is not z-score margin width (even 5x produces z≈3.3, well below sensitivity=10); these engines appear to compute anomaly detection statistics fundamentally differently. Re-added both to skip_targets.

Review & Testing Checklist for Human

Broader backfill_days impact: The fix bumps backfill_days globally at config time, not just for exclusion. With weekly buckets, backfill expands from 2→7 days; with monthly, 2→30 days. This means more metrics are recomputed per incremental run and the is_anomalous window widens. Verify this is acceptable.
No user override distinction: If a user explicitly sets backfill_days=2 with weekly buckets, it still gets bumped to 7. There's no way to distinguish default vs explicit. Confirm this is the desired behavior.
Skipped engines (3 of ~10): Tests are skipped on ClickHouse, Redshift, and Dremio. The fix logic itself runs on all engines, but these specific integration tests don't cover those three. Verify coverage on Postgres, Snowflake, BigQuery, Databricks, Trino, Athena is sufficient.
Suggested manual test: Configure a dbt test with exclude_detection_period_from_training: true, time_bucket: {period: week, count: 1} (or monthly), and no detection_period against a table with a known recent anomaly. Verify the anomaly is now detected and the log message appears.

Notes

Linear issue: CORE-312
Requested by: @arbiv
Link to Devin run

When backfill_days is smaller than the time bucket period (e.g., backfill_days=2 with weekly buckets), the detection period window was too narrow to contain any bucket_end, making the exclusion ineffective. Extend the exclusion window to at least one full time bucket. Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

linear · 2026-02-10T09:51:24Z

CORE-312 `exclude_detection_period_from_training` seems not to work for Column Anomalies

devin-ai-integration · 2026-02-10T09:51:26Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

Disable automatic comment and CI monitoring

github-actions · 2026-02-10T09:51:32Z

👋 @devin-ai-integration[bot]
Thank you for raising your pull request.
Please make sure to add tests and document all user-facing changes.
You can do this by editing the docs files in the elementary repository.

coderabbitai · 2026-02-10T09:51:40Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

The anomaly detection SQL now uses an adjustable exclusion_period_days (raised to at least one time bucket when needed) to compute detection-period start when exclude_detection_period_from_training is enabled. Integration tests were added to validate monthly time_bucket behavior for both column and volume anomaly suites.

Changes

Cohort / File(s)	Summary
Anomaly Detection Query Logic `macros/edr/data_monitoring/anomaly_detection/get_anomaly_scores_query.sql`	Replace fixed use of `test_configuration.backfill_days` with computed `exclusion_period_days`; if `metric_properties.time_bucket` is defined, compute bucket length in days and elevate `exclusion_period_days` to at least one bucket; compute detection start as `detection_end - exclusion_period_days`. Added explanatory comments.
Integration Tests — Column Anomalies `integration_tests/tests/test_column_anomalies.py`	Add `test_col_excl_detect_train_monthly` to verify `exclude_detection_period_from_training` with a monthly `time_bucket` (12 normal months + 1 anomalous month). Test asserts anomaly included in training when exclusion disabled and excluded when enabled; skips ClickHouse.
Integration Tests — Volume Anomalies `integration_tests/tests/test_volume_anomalies.py`	Add `test_excl_detect_train_monthly` mirroring the column test for monthly buckets (12 normal + 1 anomalous month). Runs two dbt tests (exclude off/on) and asserts pass/fail expectations; skips ClickHouse.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I hop through months and nibble days away,
I pad the fence so buckets never fray,
One full bucket kept, I guard the training line,
Soft paws, sharp logic — every timestamp fine,
🥕

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically identifies the main change: fixing exclude_detection_period_from_training behavior for large time buckets, which matches the primary objective.
Linked Issues check	✅ Passed	The PR addresses CORE-312 by fixing the logic where exclude_detection_period_from_training was ineffective with large time buckets, ensuring the detection period bucket is excluded from training.
Out of Scope Changes check	✅ Passed	All changes are directly scoped to fixing exclude_detection_period_from_training for large time buckets: the core SQL logic, integration tests validating the fix, and no unrelated modifications.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch devin/core-312-1770716967

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

…ifferences Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

arbiv · 2026-02-10T14:34:12Z

add a comment explaining why we need to skip the test for dremio

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

devin-ai-integration · 2026-02-10T14:36:37Z

Added a comment above the @pytest.mark.skip_targets decorator in both test files explaining the Dremio skip reason (2b7f0bb).

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

…d, force_metrics_backfill - Removed detection_period from test args so backfill_days stays at default (2), which is the actual bug scenario (backfill_days < bucket_size) - Aligned data to month boundaries to avoid empty/partial buckets inflating stddev - Use 1 anomalous month (not 2) for clean z-score separation - Added force_metrics_backfill=True to compute all metrics despite small backfill_days - Sensitivity set to 4: absorbed anomaly z~3 (pass), excluded anomaly z~34 (fail) - Confirmed: test passes WITH fix, fails WITHOUT fix Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@integration_tests/tests/test_column_anomalies.py`:
- Around line 607-608: Replace the month arithmetic that subtracts 31/365 days
with the safe “subtract one day then replace(day=1)” pattern: compute
anomaly_month_start from current_month_1st by doing current_month_1st minus one
day then .replace(day=1), and compute normal_month_start from
anomaly_month_start by subtracting one day then .replace(day=1) so neither
anomaly_month_start nor normal_month_start can skip calendar months (refer to
the anomaly_month_start, normal_month_start, and current_month_1st variables).

In `@integration_tests/tests/test_volume_anomalies.py`:
- Around line 650-651: anomaly_month_start calculation is unstable because
subtracting 31 days can skip a month; change it to compute the previous month by
taking current_month_1st minus one day then replace(day=1) (use
anomaly_month_start = (current_month_1st - timedelta(days=1)).replace(day=1)).
For normal_month_start, avoid subtracting 365 days (which breaks on leap years);
compute the same month last year by using
anomaly_month_start.replace(year=anomaly_month_start.year - 1) (i.e., set
normal_month_start from anomaly_month_start with year decremented). Use the
existing variable names anomaly_month_start and normal_month_start to locate and
update the code.

coderabbitai · 2026-02-10T16:37:45Z

integration_tests/tests/test_column_anomalies.py

+    anomaly_month_start = (current_month_1st - timedelta(days=31)).replace(day=1)
+    normal_month_start = (anomaly_month_start - timedelta(days=365)).replace(day=1)


⚠️ Potential issue | 🟠 Major

Fix previous-month calculation to avoid skipping months.

current_month_1st - timedelta(days=31) can skip the actual previous month. Use the “minus one day, then replace(day=1)” pattern instead to stay on calendar month boundaries.

🐛 Proposed fix

- anomaly_month_start = (current_month_1st - timedelta(days=31)).replace(day=1) + anomaly_month_start = (current_month_1st - timedelta(days=1)).replace(day=1)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

anomaly_month_start = (current_month_1st - timedelta(days=31)).replace(day=1)

normal_month_start = (anomaly_month_start - timedelta(days=365)).replace(day=1)

anomaly_month_start = (current_month_1st - timedelta(days=1)).replace(day=1)

normal_month_start = (anomaly_month_start - timedelta(days=365)).replace(day=1)

🤖 Prompt for AI Agents

In `@integration_tests/tests/test_column_anomalies.py` around lines 607 - 608, Replace the month arithmetic that subtracts 31/365 days with the safe “subtract one day then replace(day=1)” pattern: compute anomaly_month_start from current_month_1st by doing current_month_1st minus one day then .replace(day=1), and compute normal_month_start from anomaly_month_start by subtracting one day then .replace(day=1) so neither anomaly_month_start nor normal_month_start can skip calendar months (refer to the anomaly_month_start, normal_month_start, and current_month_1st variables).

coderabbitai · 2026-02-10T16:37:45Z

integration_tests/tests/test_volume_anomalies.py

+    anomaly_month_start = (current_month_1st - timedelta(days=31)).replace(day=1)
+    normal_month_start = (anomaly_month_start - timedelta(days=365)).replace(day=1)


⚠️ Potential issue | 🟠 Major

Fix previous-month calculation to avoid skipping months.

Subtracting 31 days from the first of the month can jump back two months for 30‑day months or February, which breaks the intended “previous month” window. Use “minus one day, then replace(day=1)” instead.

🐛 Proposed fix

- anomaly_month_start = (current_month_1st - timedelta(days=31)).replace(day=1) + anomaly_month_start = (current_month_1st - timedelta(days=1)).replace(day=1)

🤖 Prompt for AI Agents

In `@integration_tests/tests/test_volume_anomalies.py` around lines 650 - 651, anomaly_month_start calculation is unstable because subtracting 31 days can skip a month; change it to compute the previous month by taking current_month_1st minus one day then replace(day=1) (use anomaly_month_start = (current_month_1st - timedelta(days=1)).replace(day=1)). For normal_month_start, avoid subtracting 365 days (which breaks on leap years); compute the same month last year by using anomaly_month_start.replace(year=anomaly_month_start.year - 1) (i.e., set normal_month_start from anomaly_month_start with year decremented). Use the existing variable names anomaly_month_start and normal_month_start to locate and update the code.

On Redshift (fusion), the anomaly z-score with training inclusion exceeded sensitivity=4. Bumping to 5 ensures the absorbed anomaly passes across all databases while the excluded anomaly (z>>5) still fails. Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

…for cross-db stability\n\n- Volume: anomaly 40->30 rows/day, sensitivity 10\n- Column: anomaly nulls 25->18/day, sensitivity 10\n- Keeps excluded anomaly clearly failing while absorbed anomaly passes across engines Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

…ackfill_days, bucket_size) Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

…et_anomalies_test_configuration.sql Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

…get_anomaly_scores_query.sql Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

haritamar · 2026-02-10T22:42:02Z

integration_tests/tests/test_column_anomalies.py

    )
+
+
+@pytest.mark.skip_targets(["clickhouse", "redshift", "dremio"])


Can elaborate on why Redshift and Dremio are excluded?
I saw this but it's not clear - Skipped on ClickHouse, Redshift, and Dremio due to cross-database z-score variance

(Clickhouse doesn't surprise me as in general the anomaly tests don't currently work for it)

Added detailed comments above the @pytest.mark.skip_targets decorator in both test files explaining the root cause:

Redshift and Dremio compute slightly different floating-point results for stddev/avg than other engines. With monthly buckets, the z-score margin between "absorbed anomaly passes" and "excluded anomaly fails" is narrow enough that these differences cause the "without exclusion" case to also flag as anomalous, making the test flaky.

See commit 2cc884f.

Is it possible to set other values that will produce a more distinct difference that will not be sensitive to this?

Good call. Increased the anomaly magnitude to ~5x normal (volume: 100 rows/day vs ~20 normal; column: 50 nulls/day vs ~10 normal). This creates a z-score gap of roughly 3 (absorbed) vs 80 (excluded), which should be stable across all engines. Removed Redshift and Dremio from the skip list — only ClickHouse is still skipped (general anomaly test issues).

See commit 2e40357.

Tried increasing the anomaly to 5x normal (100 rows/day vs ~20) — this works on Postgres, Snowflake, BigQuery, Trino, Databricks, but Redshift and Dremio still fail on the "without exclusion should PASS" assertion. The issue isn't z-score margin width — even with a massive spike, when there's 1 outlier in 13 monthly data points, the z-score when absorbed is always ~3.3 (well below sensitivity=10). Redshift and Dremio appear to compute the anomaly detection differently at a more fundamental level.

Re-added both to skip_targets. The fix logic itself runs on all engines — only these specific integration tests are skipped. Coverage remains on Postgres (x2), Snowflake, BigQuery, Databricks, Trino, Athena.

See commit 0c19301.

haritamar · 2026-02-10T22:43:23Z

integration_tests/tests/test_column_anomalies.py

@@ -578,3 +578,112 @@ def test_col_anom_excl_detect_train(test_id: str, dbt_project: DbtProject):
        "Expected FAIL when exclude_detection_period_from_training=True "
        "(detection data excluded from training baseline, anomaly detected)"
    )


Devin - please check CodeRabbit comments

Addressed both CodeRabbit comments in commit 2cc884f:

Fixed anomaly_month_start calculation: (current_month_1st - timedelta(days=31)).replace(day=1) → (current_month_1st - timedelta(days=1)).replace(day=1) to avoid skipping months (e.g. March → January instead of February).

Fixed normal_month_start calculation similarly, using .replace(year=anomaly_month_start.year - 1) to avoid leap year issues with timedelta(days=365).

…or Redshift/Dremio Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

… going back 1 month first Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

…o skips Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

…se engines Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

devin-ai-integration bot and others added 2 commits February 10, 2026 10:29

Add weekly bucket tests for exclude_detection_period_from_training

5a339f6

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

Skip weekly bucket exclusion tests on Dremio due to bucket boundary d…

74a64d9

…ifferences Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

Add comment explaining why Dremio is skipped in weekly bucket tests

2b7f0bb

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

devin-ai-integration bot and others added 2 commits February 10, 2026 14:40

Remove 'The fix itself is not Dremio-specific' from skip comments

d45820a

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

coderabbitai bot reviewed Feb 10, 2026

View reviewed changes

devin-ai-integration bot and others added 7 commits February 10, 2026 16:43

Skip monthly bucket exclusion tests on Redshift due to z-score variance

8a56eff

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

Skip monthly bucket exclusion tests on Dremio due to z-score variance

2535304

Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

Move backfill_days fix to config load time: set backfill_days = max(b…

80a215a

…ackfill_days, bucket_size) Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

Revert get_anomaly_scores_query.sql to original: fix is now only in g…

a6ceb5a

…et_anomalies_test_configuration.sql Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

Add log when backfill_days is increased to match bucket size; revert …

02b5f56

…get_anomaly_scores_query.sql Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

haritamar reviewed Feb 10, 2026

View reviewed changes

devin-ai-integration bot and others added 4 commits February 10, 2026 22:49

Address PR review: fix month arithmetic, add detailed skip comments f…

2cc884f

…or Redshift/Dremio Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

Fix normal_month_start: use .replace(year=year-1) directly instead of…

0d87279

… going back 1 month first Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

Make anomaly 5x normal to widen z-score margin; remove Redshift/Dremi…

2e40357

…o skips Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

Re-add Redshift/Dremio to skip_targets: 5x anomaly still fails on the…

0c19301

…se engines Co-Authored-By: Yosef Arbiv <yosef.arbiv@gmail.com>

devin-ai-integration bot assigned arbiv Feb 12, 2026

		anomaly_month_start = (current_month_1st - timedelta(days=31)).replace(day=1)
		normal_month_start = (anomaly_month_start - timedelta(days=365)).replace(day=1)

		)


		@pytest.mark.skip_targets(["clickhouse", "redshift", "dremio"])

Conversation

devin-ai-integration bot commented Feb 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Fix exclude_detection_period_from_training for large time buckets

Summary

Changes

Updates since last revision

Review & Testing Checklist for Human

Notes

Uh oh!

linear bot commented Feb 10, 2026

Uh oh!

devin-ai-integration bot commented Feb 10, 2026

🤖 Devin AI Engineer

Uh oh!

github-actions bot commented Feb 10, 2026

Uh oh!

coderabbitai bot commented Feb 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

arbiv commented Feb 10, 2026

Uh oh!

devin-ai-integration bot commented Feb 10, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

haritamar Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

arbiv Feb 12, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot Feb 12, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot Feb 12, 2026

Choose a reason for hiding this comment

Uh oh!

haritamar Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

devin-ai-integration bot commented Feb 10, 2026 •

edited

Loading

Fix `exclude_detection_period_from_training` for large time buckets

coderabbitai bot commented Feb 10, 2026 •

edited

Loading