A DBT application that builds the ShopPad Data Warehouse created by @JonCrawford.
The Mesalytics DBT project is designed to take the pieces of data produced by the production application & 3rd-party data sources such as Segment to turn that raw data into usable data objects that help business stakeholders derive insights, diagnose issues and make decisions. The DBT application is automatically run on a schedule (e.g. "every 3 hours") through getdbt.com. The application is currently tailored to Snowflake but may (relatively) easily be changed to Postgres or Redshift syntax should the underlying source database change.
To contribute to the DBT project, you can choose between the browser-based editor at GetDBT.com or using a local development environment. The browser-based version is recommended for developers making infrequent, inconsequential changes whereas the local development environment is strongly encouraged for serious contributors.
To get a local copy up and running follow these steps.
This is an example of how to list things you need to use the software and how to install them.
- python 10.X
- Whitelisted access from your IP address to Snowflake database.
-
Confirm that you can connect directly to the data warehouse from the console or a GUI such as TablePlus.
-
Clone this repo
git clone https://github.com/shoppad/sp-bi-dbt
-
Install Virtualenv
brew install virtualenv
-
cd
into the repo -
virtualenv venv
to create a new virtual env for this folder. -
Activate the virtual env. It is recommended that you use an auto-activating virtualenv activator plugin such as this one for Zsh.
-
Install project python dependencies.
pip install -r requirements.txt
-
Install project DBT lib dependencies
dbt deps
-
Set up your personal development profile in ~/.dbt/profiles.yml. Instructions here
shoppad: outputs: dev: account: <fill> database: mongo password: <fill role: PIPELINE schema: dbt_<pick-a-personal-schema> threads: 4 type: snowflake user: <fill></fill> warehouse: dbt_warehouse query_tag: dbt-dev-<pick-a-personal-schema> prod: account: <fill> database: mongo password: <fill> role: PIPELINE schema: analytics threads: 4 type: snowflake user: <fill> warehouse: dbt_warehouse query_tag: dbt-prod-<pick-a-personal-username> target: dev
-
Test your connection with
dbt seed
. This will create a schema calleddbt_[yourname]_seeds
.
dbt seed
-
Browse the newly-created scheam (dbt_[yourname]_sseds) from the console or TablePlus to ensure it was successfully created.
-
Build the full application into your schema. This will build that data warehouse and also run the test suite to ensure the resulting data is valid.
dbt build
Try running the following commands:
dbt deps
dbt seed
dbt run
dbt test
dbt build
dbt run -m my_model
dbt run -m my_model+
#Add the + at the end.
The application's models are structured into 3 types & folders.
- staging: The initial import of raw source data with filtering and slight formatting.
- intermdiate: intermediary formatting and supporting models that are not business user-facing.
- marts: The finalized production-ready business insight models.
These are list of counter-intuitive production data issues that may surprise developers working on the project.
[List any that arise.]
Some data concessions and compromises have been made during development to bridge gaps or errors in the production data. This is a running list of those compromises intended to be pruned and removed as the production application improves data holes. Code pertaining to a Concession is tagged with #concession
.
- Workflows and Workflow Steps have been previously deleted and can not be reliabliy used as sources or counts for other models.
-
The DBT Style Guide – The application has been developed working as closely as possible to the DBT organization's best practices and recommended code styles.
-
There are several VS Code extensions that are recommended in the workspace. It is recommended to always open the application via the workspace for local development with
code mesa_dbt.workspace
.