Commit
Merge pull request #15 from avohq/minimum-event-count-parameter
allow users to set minimum event count to detect anomalies
bjornj12 authored Dec 8, 2021
2 parents ba7dd5e + 72fe47b commit 350dce7
Showing 4 changed files with 20 additions and 9 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/integration_tests.yml
@@ -15,7 +15,7 @@ defaults:
on:
# Triggers the workflow on push or pull request events but only for the main branch
pull_request:
branches: [ master ]
branches: [ main ]


# will cancel previous workflows triggered by the same event and for the same ref for PRs or same SHA otherwise
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -1,6 +1,11 @@
# Unreleased
Features and bug fixes that have been applied but not released yet.

# avo-audit v1.0.1

## Fixes and polish
- minimum_avg_event_volume parameter
Allows the user to set the minimum average event volume required before the algorithm tries to detect anomalies, as anomalies can be extremely hard to detect in very low-volume events.

# avo-audit v1.0.0

2 changes: 1 addition & 1 deletion integration_tests/models/experiment_test_data_normal.sql
@@ -4,5 +4,5 @@
{%- set event_source_column = 'client' -%}

{{
avo_audit.test_detect_event_anomaly(ref('avo_audit_normal_data'), event_name_column, event_date_column, event_source_column, "2021-12-01", n_days)
avo_audit.test_detect_event_anomaly(ref('avo_audit_normal_data'), event_name_column, event_date_column, event_source_column, "2021-12-01", n_days, minimum_avg_event_volume=100)
}}
20 changes: 13 additions & 7 deletions macros/detect_event_anomoly.sql
@@ -8,7 +8,7 @@
#
#

{% macro test_detect_event_anomaly(model, event_name_column, event_date_column, event_source_column, end_date=avo_audit.date_yesterday(), n_days=15, threshold=2.5) %}
{% macro test_detect_event_anomaly(model, event_name_column, event_date_column, event_source_column, end_date=avo_audit.date_yesterday(), n_days=15, threshold=2.5, minimum_avg_event_volume=0) %}

{% set dt = "cast('" + end_date +"' as date)" %}

@@ -109,9 +109,9 @@ events_dates_combo as (
),

daily_percentage as (
select event_name, source, day, percentage
select event_name, source, day, event_count, percentage
from union_query
GROUP BY event_name, source, day, percentage
GROUP BY event_name, source, day, event_count, percentage
), avarage as (
-- Get the average and standard deviation of percentages over the time period for all event_name/source combinations.
-- This lets us check whether each percentage is outside its normal bounds.
@@ -120,7 +120,8 @@ daily_percentage as (
event_name,
source,
AVG(percentage) as avg_percentage,
STDDEV(percentage) as std_percentage
STDDEV(percentage) as std_percentage,
AVG(event_count) as avg_event_count
from daily_percentage
group by
event_name,
@@ -146,9 +147,14 @@ daily_percentage as (
MAX(t.percentage) as percentage,
MAX(m.avg_percentage) as avg_percentage,
MAX(m.std_percentage) as std_percentage,
case when MAX(t.percentage) > MAX(m.avg_percentage) + MAX(m.std_percentage) * {{threshold}} then 1
when MAX(t.percentage) < MAX(m.avg_percentage) - MAX(m.std_percentage) * {{threshold}} then -1
else 0
case
when MAX(m.avg_event_count) > {{minimum_avg_event_volume}} then -- anomaly detection does not work for very low volume data.
case
when MAX(t.percentage) > MAX(m.avg_percentage) + MAX(m.std_percentage) * {{threshold}} then 1
when MAX(t.percentage) < MAX(m.avg_percentage) - MAX(m.std_percentage) * {{threshold}} then -1
else 0
end
else 0
end as signal
from union_query t
left join avarage m
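
The nested `case` expression in the last hunk can be hard to read in diff form. The following is a minimal Python sketch of the same decision logic, not code from the package: the function name `anomaly_signal` and its argument names are hypothetical, and it assumes all inputs are non-null numbers (SQL `NULL` handling is not modelled).

```python
def anomaly_signal(percentage, avg_percentage, std_percentage,
                   avg_event_count, threshold=2.5,
                   minimum_avg_event_volume=0):
    """Hypothetical mirror of the macro's nested CASE expression."""
    # Low-volume gate: the SQL uses a strict '>' comparison, so an average
    # equal to the minimum is treated as too low and yields no signal.
    if not avg_event_count > minimum_avg_event_volume:
        return 0

    # Normal bounds: mean +/- threshold * standard deviation.
    upper = avg_percentage + std_percentage * threshold
    lower = avg_percentage - std_percentage * threshold

    if percentage > upper:
        return 1   # unusually high share of events
    if percentage < lower:
        return -1  # unusually low share of events
    return 0       # within normal bounds


# Same deviation, different volumes: only the high-volume event is flagged.
high_volume = anomaly_signal(0.9, 0.5, 0.1, avg_event_count=500,
                             minimum_avg_event_volume=100)
low_volume = anomaly_signal(0.9, 0.5, 0.1, avg_event_count=50,
                            minimum_avg_event_volume=100)
```

Because the comparison is strict, the default `minimum_avg_event_volume=0` still suppresses signals for event/source pairs whose average daily count is exactly zero.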
