macros/edr/data_monitoring/anomaly_detection/get_anomaly_scores_query.sql
@@ -231,6 +231,10 @@
  when training_stddev = 0 then 0 -- Stationary data case - valid, all values are identical
  else (metric_value - training_avg) / (training_stddev)
  end as anomaly_score,
+ case
+     when training_stddev is not null and training_set_size > 1 and training_stddev = 0 then TRUE
+     else FALSE
+ end as is_zscore_fallback,
Comment on lines +234 to +237

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's understand the project structure and find dbt configuration
find . -name "dbt_project.yml" -o -name "profiles.yml" | head -5

Repository: elementary-data/dbt-data-reliability

🏁 Script executed:

# Check for boolean literal patterns in SQL files
rg -t sql '\b(TRUE|FALSE|true|false)\b' --no-heading -c | sort | uniq -c | sort -rn | head -20

🏁 Script executed:

# Look at the specific file context around lines 234-237
fd get_anomaly_scores_query.sql -x cat -n {} | sed -n '220,250p'

🏁 Script executed:

# Search for how other SQL files handle boolean values
rg -t sql 'CASE.*WHEN.*THEN\s+(TRUE|FALSE|true|false|1|0)' --no-heading | head -10

🏁 Script executed:

# Check dbt_project.yml for database configuration
cat -n dbt_project.yml | head -50

🏁 Script executed:

# Search for boolean patterns more broadly in macros
rg -t sql --no-heading 'end as.*boolean|end as.*is_' macros/ | head -20

🏁 Script executed:

# Look for type casting patterns in the codebase
rg -t sql 'cast.*boolean|boolean.*cast' -i | head -10

🏁 Script executed:

# Check what database adapters or packages are used
rg -t yaml 'packages:|version:|databases:' . | head -20

🏁 Script executed:

# Look for similar boolean-returning CASE statements in macros
rg -t sql -A2 'then (TRUE|FALSE|true|false|1|0)\s*$' macros/ | head -30

🏁 Script executed:

# Find and examine the edr_type_bool() macro
fd edr_type_bool -o find . -name "*edr_type*" -o -name "*type_bool*" | xargs grep -l "edr_type_bool" 2>/dev/null | head -5

🏁 Script executed:

# Search for the macro definition
rg -t sql 'macro.*edr_type_bool|edr_type_bool\(' macros/ -B2 -A5

🏁 Script executed:

# Check the full context of get_anomaly_query.sql which also uses TRUE/FALSE
fd get_anomaly_query.sql -x cat -n {} | sed -n '1,50p'

🏁 Script executed:

# Look for database-specific boolean handling patterns in macros
rg -t sql 'target\.type|adapter\.|dbt\.config' macros/ | head -15

🏁 Script executed:

# Check if there's a data_types directory with boolean type definitions
fd -t f 'data_type' macros/ | xargs cat -n | head -50

🏁 Script executed:

# Check if there are database-specific implementations for boolean literals
rg -t sql 'postgres.*TRUE|snowflake.*TRUE|bigquery.*TRUE' . | head -10

🏁 Script executed:

# Look for more context in how is_anomalous is used (it uses TRUE/FALSE too)
fd get_anomaly_query.sql -x cat -n {} | grep -A20 "is_anomalous = true"

🏁 Script executed:

# Check test files to see what databases are tested
fd docker-compose -x cat | grep -i 'postgres\|bigquery\|snowflake\|redshift' | head -20

🏁 Script executed:

# Check if packages.yml gives us hints about supported databases
cat packages.yml

🏁 Script executed:

# Look for adapter-specific boolean literal handling
rg -t sql 'true|false|TRUE|FALSE' macros/edr/data_monitoring/schema_changes/get_columns_snapshot_query.sql -B2 -A2

🏁 Script executed:

# Check if there's a pattern of uppercase vs lowercase boolean literals in boolean-returning cases
rg -t sql -B3 'end as is_|end as has_' macros/ | grep -E '(TRUE|FALSE|true|false)' | head -30

🏁 Script executed:

# Let's examine the broader context of the file to see if there's a style preference
cat -n macros/edr/data_monitoring/anomaly_detection/get_anomaly_scores_query.sql | sed -n '200,250p'

🏁 Script executed:

# Check integration tests to see what happens with these boolean values
rg -t sql 'is_zscore_fallback' . | head -10

Use lowercase boolean literals for consistency with the rest of the codebase.

The new CASE expression returns uppercase TRUE and FALSE, while other boolean-returning CASE expressions in the codebase use lowercase true and false (e.g., get_columns_snapshot_query.sql). SQL keywords are case-insensitive, so behavior is unchanged; switching lines 235-236 to lowercase true and false only aligns the new code with the established pattern.

🤖 Prompt for AI Agents
In macros/edr/data_monitoring/anomaly_detection/get_anomaly_scores_query.sql
around lines 234 to 237, the CASE returns uppercase boolean literals TRUE and
FALSE; change them to lowercase true and false to match the codebase convention
and SQL style (i.e., replace TRUE with true and FALSE with false in that CASE
expression).

  {{ test_configuration.anomaly_sensitivity }} as anomaly_score_threshold,
  source_value as anomalous_value,
  {{ elementary.edr_cast_as_timestamp('bucket_start') }} as bucket_start,
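
For readers who want to reason about the two CASE expressions in this hunk outside the warehouse, here is a minimal Python sketch of the same branching. It is an illustration only, not code from the PR: the function name `zscore_with_fallback` is hypothetical, and the handling of a NULL `training_stddev` (returning no score) is an assumption, since that branch lies above the visible part of the hunk.

```python
from typing import Optional, Tuple


def zscore_with_fallback(
    metric_value: float,
    training_avg: float,
    training_stddev: Optional[float],
    training_set_size: int,
) -> Tuple[Optional[float], bool]:
    """Return (anomaly_score, is_zscore_fallback) for one metric row.

    Hypothetical helper that mirrors the CASE expressions in the diff.
    """
    # Assumption: a NULL stddev yields a NULL score; that branch is not
    # visible in the hunk, so this is a guess at the surrounding logic.
    if training_stddev is None:
        return None, False
    if training_stddev == 0:
        # Stationary training data: the diff defines the score as 0 and
        # raises the flag only when there is more than one training point.
        return 0.0, training_set_size > 1
    # Regular z-score; the fallback flag stays off.
    return (metric_value - training_avg) / training_stddev, False


if __name__ == "__main__":
    print(zscore_with_fallback(9.0, 5.0, 0.0, 10))
    print(zscore_with_fallback(7.0, 5.0, 2.0, 10))
```

Note that in the stationary case the score is 0 even when `metric_value` differs from `training_avg`, which appears to be what the new flag lets downstream consumers detect.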
@@ -48,12 +48,21 @@
  {% endif %}
  {% endset %}
  {% set failures = namespace(data=0) %}
+ {% set zscore_fallback_passed = namespace(data=0) %}
+ {% set zscore_fallback_failed = namespace(data=0) %}
  {% set filtered_anomaly_scores_rows = [] %}
  {% for row in anomaly_scores_rows %}
      {% if row.anomaly_score is not none %}
          {% do filtered_anomaly_scores_rows.append(row) %}
          {% if row.is_anomalous %}
              {% set failures.data = failures.data + 1 %}
+             {% if elementary.insensitive_get_dict_value(row, 'is_zscore_fallback') %}
+                 {% set zscore_fallback_failed.data = zscore_fallback_failed.data + 1 %}
+             {% endif %}
+         {% else %}
+             {% if elementary.insensitive_get_dict_value(row, 'is_zscore_fallback') %}
+                 {% set zscore_fallback_passed.data = zscore_fallback_passed.data + 1 %}
+             {% endif %}
          {% endif %}
      {% endif %}
  {% endfor %}
@@ -69,7 +78,9 @@
  'test_results_query': test_results_query,
  'test_params': test_params,
  'result_rows': filtered_anomaly_scores_rows,
- 'failures': failures.data
+ 'failures': failures.data,
+ 'zscore_fallback_passed_count': zscore_fallback_passed.data,
+ 'zscore_fallback_failed_count': zscore_fallback_failed.data
  } %}
  {% set elementary_test_row = elementary.get_dbt_test_result_row(flattened_test) %}
  {% do elementary_test_row.update(test_result_dict) %}
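
The counting loop and the extended result dict in the two hunks above can be modeled in plain Python, which may help when checking the branch logic by hand. This is a sketch under stated assumptions, not the macro itself: `summarize` is a hypothetical name, and `insensitive_get` is only a guess at what `elementary.insensitive_get_dict_value` does (a key lookup that ignores case), since its implementation is not part of this diff.

```python
from typing import Any, Dict, Iterable, List, Tuple


def insensitive_get(row: Dict[str, Any], key: str) -> Any:
    """Hypothetical stand-in for elementary.insensitive_get_dict_value."""
    for candidate in (key, key.lower(), key.upper()):
        if candidate in row:
            return row[candidate]
    return None


def summarize(rows: Iterable[Dict[str, Any]]) -> Tuple[List[Dict[str, Any]], Dict[str, int]]:
    """Filter out rows without a score and count failures and fallbacks."""
    filtered: List[Dict[str, Any]] = []
    counts = {
        "failures": 0,
        "zscore_fallback_passed_count": 0,
        "zscore_fallback_failed_count": 0,
    }
    for row in rows:
        # Rows with no anomaly score are dropped, as in the Jinja loop.
        if row.get("anomaly_score") is None:
            continue
        filtered.append(row)
        is_fallback = bool(insensitive_get(row, "is_zscore_fallback"))
        if row.get("is_anomalous"):
            counts["failures"] += 1
            if is_fallback:
                counts["zscore_fallback_failed_count"] += 1
        elif is_fallback:
            counts["zscore_fallback_passed_count"] += 1
    return filtered, counts


if __name__ == "__main__":
    sample = [
        {"anomaly_score": 0.0, "is_anomalous": False, "IS_ZSCORE_FALLBACK": True},
        {"anomaly_score": 4.2, "is_anomalous": True, "is_zscore_fallback": False},
        {"anomaly_score": None, "is_anomalous": False, "is_zscore_fallback": True},
    ]
    print(summarize(sample)[1])
```

Each fallback row is counted exactly once, as either passed or failed, and rows without a score never contribute to any counter.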
4 changes: 3 additions & 1 deletion macros/edr/system/system_utils/empty_table.sql
@@ -27,7 +27,9 @@
  ('test_short_name', 'string'),
  ('test_alias', 'string'),
  ('result_rows', 'long_string'),
- ('failed_row_count', 'bigint')
+ ('failed_row_count', 'bigint'),
+ ('zscore_fallback_passed_count', 'bigint'),
+ ('zscore_fallback_failed_count', 'bigint')
  ]) }}
  {% endmacro %}

1 change: 1 addition & 0 deletions macros/edr/tests/test_utils/get_anomaly_query.sql
@@ -43,6 +43,7 @@
  column_name,
  metric_name,
  anomaly_score,
+ is_zscore_fallback,
  anomaly_score_threshold,
  anomalous_value,
  bucket_start,