chore: call aws secretsmanager #1704

Closed
99 changes: 99 additions & 0 deletions dataeng/profiles/profiles.yml
@@ -0,0 +1,99 @@
# This prevents dbt from rebuilding the DAG every run, saving time on jobs where it is run repeatedly
config:
  partial_parse: True

warehouse_transforms: # This is the name to use for "profile"
  target: prod # The default target.
  outputs: # Each of the keys in this dict is a "target"
    dev:
      type: snowflake
      account: edx.us-east-1
      user: DBT_TRANSFORMER
      role: DBT_TRANSFORMER_ROLE
      password: "{{ env_var('DBT_PASSWORD') }}"
      database: DEV
      warehouse: TRANSFORMING
      # This becomes the prefix for schemas written to by this target, e.g. "123456789_finance" if DEV_SCHEMA_PREFIX=123456789.
      # dbt calls this the "target schema" in their docs
      schema: "{{ env_var('DEV_SCHEMA_PREFIX') }}"
      threads: 10
    prod:
      type: snowflake
      account: edx.us-east-1
      user: DBT_TRANSFORMER
      role: DBT_TRANSFORMER_ROLE
      password: "{{ env_var('DBT_PASSWORD') }}"
      database: PROD
      warehouse: TRANSFORMING
      # This is not a real schema name or schema prefix---the generate_schema_name_for_env macro will take care of
      # removing this prefix string. Resulting schema names end up being just "finance" or whatever is defined in the
      # schema.yml level as a "custom schema".
      # dbt calls this the "target schema" in their docs
      schema: NO_PREFIX
      threads: 10
    edge:
      type: snowflake
      account: edx.us-east-1
      user: DBT_TRANSFORMER
      role: DBT_TRANSFORMER_ROLE
      password: "{{ env_var('DBT_PASSWORD') }}"
      database: EDGE
      warehouse: TRANSFORMING
      # This is not a real schema name or schema prefix---the generate_schema_name_for_env macro will take care of
      # removing this prefix string. Resulting schema names end up being just "finance" or whatever is defined in the
      # schema.yml level as a "custom schema".
      # dbt calls this the "target schema" in their docs
      schema: NO_PREFIX
      threads: 10
    prod_amplitude:
      # This target is specifically for running amplitude models, which are relatively time sensitive.
      type: snowflake
      account: edx.us-east-1
      user: DBT_TRANSFORMER
      role: DBT_TRANSFORMER_ROLE
      password: "{{ env_var('DBT_PASSWORD') }}"
      database: PROD
      warehouse: TRANSFORMING
      schema: NO_PREFIX
      threads: 10
    prod_load_incremental:
      # This environment is for dbt initial loading (or reloading) of large incremental tables in DBT that need a larger warehouse
      type: snowflake
      account: edx.us-east-1
      user: DBT_TRANSFORMER
      role: DBT_TRANSFORMER_ROLE
      password: "{{ env_var('DBT_PASSWORD') }}"
      database: PROD
      warehouse: TRANSFORMING_XL
      # This is not a real schema name or schema prefix---the generate_schema_name_for_env macro will take care of
      # removing this prefix string. Resulting schema names end up being just "finance" or whatever is defined in the
      # schema.yml level as a "custom schema".
      # dbt calls this the "target schema" in their docs
      schema: NO_PREFIX
      threads: 10
    ci_tests:
      type: snowflake
      account: edx.us-east-1
      user: DBT_TRANSFORMER_CI
      role: DBT_TRANSFORMER_CI_ROLE
      password: "{{ env_var('DBT_PASSWORD') }}"
      database: CI_TESTS
      warehouse: TRANSFORMING_CI
      # The following schema name CI_SCHEMA_NAME is actually a keyword that is used to search and replace in a
      # sed command. That is replaced by the actual schema name generated in jenkins job such as PR_1724.
      #schema: CI_SCHEMA_NAME
      schema: "{{ env_var('CI_SCHEMA_NAME') }}"
      threads: 10
    ci_tests_large:
      type: snowflake
      account: edx.us-east-1
      user: DBT_TRANSFORMER_CI
      role: DBT_TRANSFORMER_CI_ROLE
      password: "{{ env_var('DBT_PASSWORD') }}"
      database: CI_TESTS
      warehouse: TRANSFORMING_CI_L
      # The following schema name CI_SCHEMA_NAME is actually a keyword that is used to search and replace in a
      # sed command. That is replaced by the actual schema name generated in jenkins job such as PR_1724.
      #schema: CI_SCHEMA_NAME
      schema: "{{ env_var('CI_SCHEMA_NAME') }}"
      threads: 10
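
Every target in this profile reads its password through `env_var('DBT_PASSWORD')`, and the `dev`, `ci_tests`, and `ci_tests_large` targets also read their schema from the environment. A job may therefore want to fail fast with a clear message before calling dbt. The following is a minimal sketch of such a pre-flight check; the helper name `require_env` is hypothetical and is not part of this PR.

```shell
# Hypothetical helper (not part of this PR): return non-zero if any named
# variable is unset or empty, printing one message per missing variable.
require_env() {
    missing=0
    for name in "$@"; do
        # Look up the variable whose name is stored in $name
        # (POSIX-compatible indirection; names must come from trusted input).
        eval "value=\${${name}:-}"
        if [ -z "${value}" ]; then
            echo "missing required environment variable: ${name}" >&2
            missing=1
        fi
    done
    return "${missing}"
}
```

A job script could call, for example, `require_env DBT_PASSWORD CI_SCHEMA_NAME || exit 1` before a run against the `ci_tests` target.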
4 changes: 3 additions & 1 deletion dataeng/resources/model-transfers.sh
@@ -28,7 +28,9 @@ fi

ARGS="{mart: ${MART_NAME} }"

-dbt deps --profiles-dir $WORKSPACE/analytics-secure/warehouse-transforms/ --profile $DBT_PROFILE --target $DBT_TARGET
+source secrets-manager.sh analytics-secure/warehouse-transforms/profiles DBT_PASSWORD
+
+dbt deps --profiles-dir $WORKSPACE/dataeng/profiles --profile $DBT_PROFILE --target $DBT_TARGET
Contributor review comment:

WORKSPACE is a Jenkins-provided environment variable that points to the Jenkins job workspace. The jenkins-job-dsl repo does not get cloned into that workspace unless we specify it in the DSL, so the $WORKSPACE/dataeng/profiles path may not exist at run time.
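
One way to surface the concern raised in this comment is to check that the profiles directory is actually present before dbt is invoked. This is a minimal sketch under that assumption; the function name `check_profiles_dir` is hypothetical and does not appear in the PR.

```shell
# Hypothetical guard (not part of this PR): succeed only if the given
# directory contains a profiles.yml that dbt could load.
check_profiles_dir() {
    profiles_dir="$1"
    if [ ! -f "${profiles_dir}/profiles.yml" ]; then
        echo "profiles.yml not found in ${profiles_dir}" >&2
        return 1
    fi
}
```

Calling `check_profiles_dir "$WORKSPACE/dataeng/profiles" || exit 1` ahead of `dbt deps` would make a missing checkout fail with an explicit message instead of a dbt profile error.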


# Call DBT to perform all transfers for this mart.
dbt run-operation perform_s3_transfers --args "${ARGS}" --profile $DBT_PROFILE --target $DBT_TARGET --profiles-dir $WORKSPACE/analytics-secure/warehouse-transforms/
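
The contents of `secrets-manager.sh` are not shown in this diff. As a rough illustration of what a script invoked as `source secrets-manager.sh <secret-id> <variable-name>` might do, the sketch below fetches a secret with the AWS CLI and exports it under the requested name. The function name `fetch_secret_into_env` is hypothetical, and the sketch assumes the secret's `SecretString` holds the raw value rather than a JSON document; the real script may differ on both points.

```shell
# Hypothetical sketch (the real secrets-manager.sh is not shown in this PR).
# Usage: fetch_secret_into_env <secret-id> <variable-name>
fetch_secret_into_env() {
    secret_id="$1"
    var_name="$2"
    # get-secret-value returns the secret's metadata and value;
    # --query SecretString --output text prints only the string value.
    # Assumption: SecretString is the raw value, not a JSON object.
    secret_value="$(aws secretsmanager get-secret-value \
        --secret-id "${secret_id}" \
        --query SecretString \
        --output text)" || return 1
    # Export under the requested name so dbt's env_var() can read it.
    export "${var_name}=${secret_value}"
}
```

Sourcing the script, rather than executing it, matters here: an `export` made in a child process would not be visible to the later `dbt` commands in the calling shell.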