Summary
This release adds early-release support for running the models on a Snowflake lake loader Iceberg table. Please note this feature is actively in testing and not all features of the package may work as expected. Please let us know if you find any issues.
Features
- Add early release support for Snowflake Iceberg events tables via lake loader
Optimizations
- Remove unused columns and attempt to improve window functions in user_sessions_this_run
Under the hood
- Alter GH action to remove fail-fast on integration tests
Upgrading
Update the snowplow-unified version in your packages.yml file.
snowplow-unified 0.4.3 (2024-06-24)
Summary
This release brings a couple of optimizations and fixes: it fixes a syntax error when snowplow__use_refr_if_mkt_null is enabled, optimizes the test snowplow_tests_view_in_session_values, restructures the way the user_sessions_this_run table is created to possibly help with query optimization and reduce processing time, and adds further initialization checks, which are now tied to the snowplow__enable_initial_checks variable.
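As a rough illustration, the initialization checks could be toggled from dbt_project.yml along the following lines. This is only a sketch: it assumes the variable is a boolean set under the snowplow_unified scope, and the release notes do not state its default, so check the package documentation before relying on it.

```yaml
# dbt_project.yml (sketch; the scoping and the boolean value are assumptions)
vars:
  snowplow_unified:
    # Variable named in this release; assumed here to switch the initialization checks on or off
    snowplow__enable_initial_checks: true
```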
Features
- Add more robust init tests
Optimizations
- Move tests view in session values to this run (Close snowplow#56)
- Restructure user sessions this run
Fixes
- Fix syntax when snowplow__use_refr_if_mkt_null enabled
Upgrading
Update the snowplow-unified version in your packages.yml file.
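For dbt hub installs, the upgrade might look like the sketch below. The package name follows the installation instructions from the 0.1.0 release; the version range is an assumption chosen to pick up 0.4.3 while staying below the next minor version.

```yaml
# packages.yml (sketch; adjust the range to the release you are targeting)
packages:
  - package: snowplow/snowplow_unified
    version: [">=0.4.3", "<0.5.0"]
```

After editing the file, running dbt deps fetches the updated package.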
snowplow-unified 0.4.2 (2024-06-12)
Summary
This release introduces enhancements to the initialization checks and refinements in data handling, particularly in Databricks/Spark environments. These updates ensure smoother operations and better data integrity checks.
Features
- Introduced a new validation test that halts the process early if any required seed is missing from the data warehouse. This ensures all necessary data components are available before proceeding.
Fixes
- Revised the field casting approach within Databricks/Spark to enhance data handling and compatibility.
- Removed the non-null constraint test on user_identifier to prevent unnecessary validation errors and allow for more flexible data integration.
Upgrading
Update the snowplow-unified version in your packages.yml file to take advantage of these improvements.
snowplow-unified 0.4.1 (2024-05-13)
Summary
This release updates our default_channel_group definition to support the use of refr_ fields when the mkt_ fields are null. This was requested in particular for landing pages with redirects. The functionality is turned off by default.
Features
- New snowplow__use_refr_if_mkt_null variable to use refr_ fields if mkt_ ones are null in default channel group classification
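Because the functionality is off by default, it has to be enabled explicitly. A minimal sketch, assuming the variable is a boolean set under the snowplow_unified scope in dbt_project.yml:

```yaml
# dbt_project.yml (sketch; the scoping is an assumption)
vars:
  snowplow_unified:
    # Fall back to refr_ fields when the mkt_ fields are null
    # in the default channel group classification
    snowplow__use_refr_if_mkt_null: true
```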
Fixes
- Fix an issue in the channel group classification where direct channels were sometimes ignored due to string checks
Upgrading
Bump the snowplow-unified version in your packages.yml file.
snowplow-unified 0.4.0 (2024-03-25)
Summary
This release adds a surrogate key to the conversions table, in case an event is valid against multiple conversion types, and fixes an issue with BigQuery when snowplow was in your project name.
🚨 Breaking Changes 🚨
- Adds a new surrogate key to the optional conversions table, to allow for the same event to be part of multiple conversions
Fixes
- Fix an issue where having snowplow in your project name caused an issue when using a BigQuery target
Upgrading
Bump the snowplow-unified version in your packages.yml file. If you already make use of the conversions table and do not wish to do a full refresh, you can add the new column by following the migration guide.
snowplow-unified 0.3.1 (2024-03-11)
Summary
This release fixes an issue where it was not possible to full refresh a single table using the models_to_remove variable, and removes an invalid test on the conversions table.
Fixes
- Fix missing argument in snowplow_utils.snowplow_delete_from_manifest call
- Remove null test on user_id in conversions table
Upgrading
Bump the snowplow-unified version in your packages.yml file.
snowplow-unified 0.3.0 (2024-02-26)
Summary
This release adds one major new feature: custom aggregations on the views, sessions, and users tables. You can read more about it in our docs. We also added the ability to manage grants to all tables in the package via the snowplow__grant_select_to variable.
Under the hood we made a lot of small tweaks and improvements, including prefixing all macro calls for easier custom models, moving cluster-by fields to macros, ensuring that the manifest tables are only fully refreshed when snowplow__allow_refresh is set to true AND there is a full refresh flag on the run, and adding a few context fields to the derived tables that were previously being discarded.
🚨 Breaking Changes 🚨
- We have changed the behavior of the allow_refresh macro so that now, if snowplow__allow_refresh is set to true, it will only refresh the manifest models if the --full-refresh flag is also set. If you require the old behavior, where it would refresh the manifest models on an incremental run when snowplow__allow_refresh was set to true, please overwrite this macro. See the Overriding Macros guide for more details.
- Renamed snowplow__page_view_passthroughs to snowplow__view_passthroughs to be consistent with the rest of the package
- Minimum snowplow-utils version is now 0.16.2
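For existing projects, the first two breaking changes mostly come down to a variable rename and a change in when a refresh takes effect. The sketch below shows what that might look like in dbt_project.yml; the snowplow_unified scoping and the passthrough field v_collector are illustrative assumptions, not taken from the release notes.

```yaml
# dbt_project.yml (sketch; scoping and field names are assumptions)
vars:
  snowplow_unified:
    # Old name, used before 0.3.0:
    # snowplow__page_view_passthroughs: ["v_collector"]
    # New name from 0.3.0 onwards:
    snowplow__view_passthroughs: ["v_collector"]
    # From 0.3.0, this only refreshes the manifest models when the run
    # is also invoked with the --full-refresh flag
    snowplow__allow_refresh: true
```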
Features
- Add new passthrough aggregations to the views, sessions, and users tables, enabled using snowplow__view/session/user_aggregations
- Reorder and add some additional context fields to derived tables (non-breaking change)
- Add snowplow__custom_sql to allow adding custom SQL to the snowplow_unified_base_events_this_run and snowplow_unified_events_this_run models
- Add macro to define cluster-by for tables to allow users to overwrite this if required
- Add check for --full-refresh flag before allowing refresh of manifest models when snowplow__allow_refresh is set to true.
- Add ability to grant select to a list of users, principals or roles on tables created by the package using snowplow__grant_select_to for all warehouses except BigQuery
- Add auto-grant of usage on schemas to snowplow__grant_select_to, can be disabled using snowplow__grant_schema_usage (see docs at https://docs.snowplow.io/docs/modeling-your-data/modeling-your-data-with-dbt/package-features/table-grants/)
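The grant-related variables could be combined roughly as follows. This is a sketch only: the role and user names are hypothetical, and the snowplow_unified scoping and the boolean form of snowplow__grant_schema_usage are assumptions; the table grants documentation linked above describes the supported values.

```yaml
# dbt_project.yml (sketch; names and scoping are assumptions)
vars:
  snowplow_unified:
    # Users, principals or roles that receive select on the package's tables
    # (not supported on BigQuery, per the release notes)
    snowplow__grant_select_to: ["reporting_role", "analyst_user"]
    # Set to false to skip the automatic usage grant on schemas
    snowplow__grant_schema_usage: false
```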
Fixes
- Fix a bug where if you ran the package in a period with no data, and had list all events enabled, the package would error rather than complete
- Fix incorrect tagging in app errors module tables
Under the hood
- Prefix all macro calls with package name for easier customization
- Use macros for grouped fields (e.g. contexts) where possible
- Bump actions version numbers
Upgrading
Bump the snowplow-unified version in your packages.yml file, paying attention to the breaking changes above.
snowplow-unified 0.2.0 (2024-01-30)
Summary
This release adds the ability to calculate mobile screen engagement using the screen summary context. There is also a new optional module for a conversions table. Other changes are the ability to stitch the users table during session stitching, and headset is now a recognized platform.
🚨 Breaking Changes 🚨
Existing users on Snowflake / Databricks / Redshift will need to make changes to some of their derived tables. For a full SQL script on how to achieve this, check out the relevant migration guide. The other option is to do a complete refresh of the package.
Features
- Add mobile screen engagement calculation using the screen summary context (snowplow#16)
- Adds user stitching to the users table (enabled with snowplow__session_stitching)
- Adds "headset" to the list of recognized platforms
- Add optional conversions module
Fixes
- Consider screen view ID from the screen view context (snowplow#14)
- Fix link to incorrect FAQ in README
- Remove test for not null screen ID and name in app errors table
Upgrading
Bump the snowplow-unified version in your packages.yml file.
snowplow-unified 0.1.2 (2023-11-23)
Summary
This is a patch release to fix the default browser context variable for warehouses other than Redshift/Postgres.
Fixes
- Fix browser context
Upgrading
Bump the snowplow-unified version in your packages.yml file.
snowplow-unified 0.1.1 (2023-11-21)
Summary
This release adds out-of-the-box support for multiple versions of the session context schema for BigQuery (mobile) users, not just contexts_com_snowplowanalytics_snowplow_client_session_1_0_0.
Features
- Support latest session context schema
GitHub
packages:
  - git: "https://github.com/snowplow/dbt-snowplow-unified.git"
    revision: 0.1.1
dbt hub
packages:
  - package: snowplow/snowplow_unified
    version: [">=0.1.0", "<0.2.0"]
snowplow-unified 0.1.0 (2023-11-14)
Summary
This is the first official release of the Snowplow Unified package, which contains a fully incremental model that transforms raw web and mobile event data generated by the Snowplow JavaScript and mobile trackers into a set of derived tables: views, sessions and users.
Features
- dbt Package that processes web and mobile events simultaneously
- Support for Snowflake / BigQuery / Databricks / Redshift / Postgres
- Optional modules such as consent, app errors and web performance (core web vitals)
Please note that this data model is under the Snowplow Personal & Academic License (SPAL). For further details please refer to our documentation site.
Installation
To install the package, add the following to the packages.yml in your project:
GitHub
packages:
  - git: "https://github.com/snowplow/dbt-snowplow-unified.git"
    revision: 0.1.0
dbt hub
Please note that it may be a few days before the package is available on dbt hub after the initial release.
packages:
  - package: snowplow/snowplow_unified
    version: [">=0.1.0", "<0.2.0"]