Skip to content
This repository has been archived by the owner on Oct 2, 2024. It is now read-only.

snowplow-unified v0.4.4

Latest
Compare
Choose a tag to compare
@github-actions github-actions released this 02 Jul 03:56

Summary

This release adds early-release support for running the models on a Snowflake lake loader Iceberg table. Please note this feature is actively in testing and not all features of the package may work as expected. Please let us know if you find any issues.

Features

  • Add early release support for Snowflake Iceberg events tables via lake loader

Optimizations

  • Remove unused columns and attempt to improve window functions in user_sessions_this_run

Under the hood

  • Alter GH action to remove fail-fast on integration tests

Upgrading

Update the snowplow-unified version in your packages.yml file.

snowplow-unified 0.4.3 (2024-06-24)

Summary

This release brings a couple of optimizations and fixes: it fixes a syntax error when snowplow__use_refr_if_mkt_null is enabled, optimizes the test snowplow_tests_view_in_session_values, restructures the way user_sessions_this_run table is created to possibly help with query optimization to reduce processing time, and adds further initialization checks which are now tied to the snowplow__enable_initial_checks variable.

Features

  • Add more robust init tests

Optimizations

  • Move tests view in session values to this run (Close snowplow#56)
  • Restructure user sessions this run

Fixes

  • Fix syntax when snowplow__use_refr_if_mkt_null enabled

Upgrading

Update the snowplow-unified version in your packages.yml file.

snowplow-unified 0.4.2 (2024-06-12)

Summary

This release introduces enhancements to the initialization checks and refinements in data handling, particularly in Databricks/Spark environments. These updates ensure smoother operations and better data integrity checks.

Features

  • Introduced a new validation test to halt the process early if all required seeds are not present in the data warehouse. This ensures all necessary data components are available before proceeding.

Fixes

  • Revised the field casting approach within Databricks/Spark to enhance data handling and compatibility.
  • Removed the non-null constraint test on user_identifier to prevent unnecessary validation errors and allow for more flexible data integration.

Upgrading

Update the snowplow-unified version in your packages.yml file to take advantage of these improvements.

snowplow-unified 0.4.1 (2024-05-13)

Summary

This feature updates our default_channel_group definition to support the use of refr_ fields in the case that the mkt_ fields are null. This is a requested feature in particular for landing pages with redirects. This functionality is turned off by default.

Features

  • New snowplow__use_refr_if_mkt_null variable to use refr_ fields if mkt_ ones are null in default channel group classification

Fixes

  • Fix an issue in the channel group classification where direct channels were sometimes ignored due to string checks

Upgrading

Bump the snowplow-unified version in your packages.yml file.

snowplow-unified 0.4.0 (2024-03-25)

Summary

This release adds a surrogate key to the conversions table, in case of an event being valid against multiple conversion types, and fixes an issue with bigquery if snowplow was in your project name.

🚨 Breaking Changes 🚨

  • Adds a new surrogate key to the optional conversions table, to allow for the same event to be part of multiple conversions

Fixes

  • Fix an issue where having snowplow in your project name caused issue when using a bigquery target

Upgrading

Bump the snowplow-unified version in your packages.yml file. If you already make use of the conversions table and wish to not do a full refresh, you can add the new column by following the migration guide here.

snowplow-unified 0.3.1 (2024-03-11)

Summary

This release fixes an issue where it was not possible to full refresh a single table using the models_to_remove variable, as well as removing a non-valid tests on the conversions table.

Fixes

  • Fix missing argument in snowplow_utils.snowplow_delete_from_manifest call
  • Remove null test on user_id in conversions table

Upgrading

Bump the snowplow-unified version in your packages.yml file.

snowplow-unified 0.3.0 (2024-02-26)

Summary

This release adds one major new feature, which is custom aggregations on the views, sessions, and users tables. You can read more about it in our docs here. We also added the ability to manage grants to all tables in the package via the snowplow__grant_select_to variable.

Under the hood we did a lot of small tweaks and improvements including prefixing all macro calls for easier custom models, moved cluster by fields to macros, ensure that the manifest tables are only full refreshed when snowplow__allow_refresh is set to true AND there is a full refresh flag on the run, and added a few context fields to the derived tables that were being discarded previously.

🚨 Breaking Changes 🚨

  • We have changed the behavior of the allow_refresh macro so now if snowplow__allow_refresh is set to true it will only refresh the manifest models if the --full-refresh flag is also set. If you require the old behavior where it would refresh the manifest models on an incremental run when snowplow__allow_refresh was set to true, please overwrite this macro. See the Overriding Macros guide for more details.
  • Renamed snowplow__page_view_passthroughs to snowplow__view_passthroughs to be consistent with the rest of the package
  • Minimum snowplow-utils version is now 0.16.2

Features

  • Add new passthrough aggregations to the views, sessions, and users table, enabled using snowplow__view/session/user_aggregations
  • Reorder and add some additional context fields to derived tables (non-breaking change)
  • Add snowplow__custom_sql to allow adding custom sql to the snowplow_unified_base_events_this_run and snowplow_unified_events_this_run models
  • Add macro to define cluster-by for tables to allow users to overwrite this if required
  • Add check for --full-refresh flag before allowing refresh of manifest models when snowplow__allow_refresh is set to true.
  • Add ability to grant select to a list of users, principals or roles on tables created by the package using snowplow__grant_select_to for all warehouses except BigQuery
  • Add auto-grant of usage on schemas to snowplow__grant_select_to, can be disabled using snowplow__grant_schema_usage (see docs at https://docs.snowplow.io/docs/modeling-your-data/modeling-your-data-with-dbt/package-features/table-grants/)

Fixes

  • Fix a bug where if you ran the package in a period with no data, and had list all events enabled, the package would error rather than complete
  • Fix incorrect tagging in app errors module tables

Under the hood

  • Prefix all macro calls with package name for easier customization
  • Use macros for grouped fields (e.g. contexts) where possible
  • Bump actions version numbers

Upgrading

Bump the snowplow-unified version in your packages.yml file, paying attention to the breaking changes above..

snowplow-unified 0.2.0 (2024-01-30)

Summary

This release adds the ability to calculate mobile screen engagement using the screen summary context. There is also a new optional module for a conversions table. Other changes are the ability to stitch the users table during session stitching and heatset is a recognised platform now.

🚨 Breaking Changes 🚨

Existing users on Snowflake / Databricks / Redshift will need to make changes to some of their derived tables. For a full sql script on how to achieve this, check out the relevant migration guide. The other option is to do a complete refresh of the package.

Features

  • Add mobile screen engagement calculation using the screen summary context (snowplow#16)
  • Adds user stitching to the users table (enabled with snowplow__session_stitching)
  • Adds "headset" to the list of recognized platforms
  • Add optional conversions module

Fixes

  • Consider screen view ID from the screen view context (snowplow#14)
  • Fix link to incorrect FAQ in README
  • Remove test for not null screen ID and name in app errors table

Upgrading

Bump the snowplow-unified version in your packages.yml file.

snowplow-unified 0.1.2 (2023-11-23)

Summary

This is a patch release to fix the default browser context variable for warehouses other than redshift/postgres.

Fixes

  • Fix browser context

Upgrading

Bump the snowplow-unified version in your packages.yml file.

snowplow-unified 0.1.1 (2023-11-21)

Summary

This release is for supporting multiple versions of the session context schema for Bigquery (mobile) users not just contexts_com_snowplowanalytics_snowplow_client_session_1_0_0 out-of-the-box.

Features

  • Support latest session context schema

Github

packages:
  - git: "https://github.com/snowplow/dbt-snowplow-unified.git"
    revision: 0.1.1

dbt hub

packages:
  - package: snowplow/snowplow_unified
    version: [">=0.1.0", "<0.2.0"]

snowplow-unified 0.1.0 (2023-11-14)

Summary

This is the first official release of the Snowplow Unified package, which contains a fully incremental model that transforms raw web and mobile event data generated by the Snowplow JavaScript and mobile trackers into a set of derived tables: views, sessions and users.

Features

  • dbt Package that processes web and mobile events simultaneously
  • Support for Snowflake / BigQuery / Databricks / Redshift / Postgres
  • optional modules such as consent, app errors and web performance (core web vitals)

Please note that this data model is under the Snowplow Personal & Academic License (SPAL). For further details please refer to our documenation site.

Installation

To install the package, add the following to the packages.yml in your project:

Github

packages:
  - git: "https://github.com/snowplow/dbt-snowplow-unified.git"
    revision: 0.1.0

dbt hub

Please note that it may be a few days before the package is available on dbt hub after the initial release.

packages:
  - package: snowplow/snowplow_unified
    version: [">=0.1.0", "<0.2.0"]