Merge pull request #8735 from braze-inc/bd-3496

BD-3496 Apply template standards to Cloud Ingestion

lydia-xie authored Jan 10, 2025
2 parents e0e6e7d + 275ac51 commit 39d7e9e

Showing 8 changed files with 67 additions and 64 deletions.
2 changes: 1 addition & 1 deletion _docs/_user_guide/data_and_analytics/cloud_ingestion.md
@@ -8,7 +8,7 @@ page_order: 0.1
page_type: landing

guide_top_header: "Braze Cloud Data Ingestion"
guide_top_text: "<h2>What is Braze Cloud Data Ingestion?</h2>Braze Cloud Data Ingestion (CDI) allows you to set up a direct connection from your data storage solution to Braze to sync relevant user or catalog data, and delete users. When synced to Braze, this data can be leveraged for use cases such as personalization or segmentation. Cloud Data Ingestion's flexible integration supports complex data structures including nested JSON and arrays of objects. <br><br>**Braze Cloud Data Ingestion capabilities:**<br> - Create a simple integration directly from your data warehouse or file storage solution to Braze in just a few minutes.<br>- Securely sync user data, including attributes, events, and purchases from your data warehouse to Braze.<br>- Close the data loop with Braze by combining Cloud Data Ingestion with Currents or Snowflake Data Sharing.<br><br>**Cloud Data Ingestion can sync data from**:<br> - Amazon Redshift<br> - Databricks<br> - Google BigQuery<br> - Microsoft Fabric<br> - S3<br> - Snowflake"
guide_top_text: "<h2>What is it?</h2>Braze Cloud Data Ingestion (CDI) allows you to set up a direct connection from your data storage solution to Braze to sync relevant user or catalog data, and delete users. When synced to Braze, this data can be leveraged for use cases such as personalization or segmentation. Cloud Data Ingestion's flexible integration supports complex data structures including nested JSON and arrays of objects. <br><br>**Braze Cloud Data Ingestion capabilities:**<br> - Create a simple integration directly from your data warehouse or file storage solution to Braze in just a few minutes.<br>- Securely sync user data, including attributes, events, and purchases from your data warehouse to Braze.<br>- Close the data loop with Braze by combining Cloud Data Ingestion with Currents or Snowflake Data Sharing.<br><br>**Cloud Data Ingestion can sync data from**:<br> - Amazon Redshift<br> - Databricks<br> - Google BigQuery<br> - Microsoft Fabric<br> - S3<br> - Snowflake"

guide_featured_title: "Section articles"
guide_featured_list:
@@ -1,7 +1,7 @@
---
nav_title: Connected Sources
article_title: Connected Sources
description: "This reference article covers how to use Braze Cloud Data Ingestion to sync relevant data with your Snowflake, Redshift, BigQuery, and Databricks integration."
description: "This page covers how to use Braze Cloud Data Ingestion to sync relevant data with your Snowflake, Redshift, BigQuery, and Databricks integration."
page_order: 2
page_type: reference

@@ -13,7 +13,7 @@ page_type: reference
After adding a connected source to your Braze workspace, you can create a CDI segment within Segment Extensions. CDI segments let you write SQL that directly queries your data warehouse (using data there that’s made available through your CDI Connected Source), and creates and maintains a group of users that can be targeted within Braze.

- For more information on creating a segment with this source, view [CDI segments]({{site.baseurl}}/user_guide/engagement_tools/segments/segment_extension/cdi_segments/).
+ For more information on creating a segment with this source, refer to [CDI segments]({{site.baseurl}}/user_guide/engagement_tools/segments/segment_extension/cdi_segments/).
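
As an illustration, a CDI segment query might look like the following sketch. The table and column names here are hypothetical, and the identifier column your query must return depends on how your segment is configured:

```sql
-- Hypothetical source table and columns: return one Braze user
-- identifier per row for the users the segment should contain.
SELECT external_id
FROM analytics.user_lifetime_metrics
WHERE total_spend_usd > 100;
```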

{% alert warning %}
Because connected sources run on your data warehouse directly, you will incur all costs associated with running these queries in your data warehouse. Connected sources don't consume data points, and CDI segments don't consume SQL segment credits.
@@ -141,7 +141,7 @@ When connecting different workspaces to the same Snowflake account, you must cre

#### Step 2.4: Allow Braze IPs in your Snowflake network policy (optional)

- Depending on the configuration of your Snowflake account, you may need to allow the following IP addresses in your Snowflake network policy. For more information on doing this, view the relevant Snowflake documentation on [modifying a network policy](https://docs.snowflake.com/en/user-guide/network-policies.html#modifying-network-policies).
+ Depending on the configuration of your Snowflake account, you may need to allow the following IP addresses in your Snowflake network policy. For more information on doing this, refer to the relevant Snowflake documentation on [modifying a network policy](https://docs.snowflake.com/en/user-guide/network-policies.html#modifying-network-policies).
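
For example, an existing policy can be updated with Snowflake's `ALTER NETWORK POLICY` command. The policy name and IP ranges below are placeholders; substitute the Braze IPs for your region from the lists that follow. Note that `SET ALLOWED_IP_LIST` replaces the existing list, so include any entries you already allow:

```sql
-- Placeholder policy name and IPs: replace with your policy and the
-- Braze IPs for your cluster's region, keeping your existing entries.
ALTER NETWORK POLICY my_network_policy SET
  ALLOWED_IP_LIST = ('192.0.2.15', '198.51.100.22');
```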

{% subtabs %}
{% subtab United States (US) %}
@@ -194,7 +194,7 @@ If you have a firewall or other network policies, you must give Braze network ac

You may also need to change your security groups to allow Braze access to your data in Redshift. Make sure to explicitly allow inbound traffic on the IPs below and on the port used to query your Redshift cluster (default is 5439). You should explicitly allow Redshift TCP connectivity on this port even if the inbound rules are set to "allow all". In addition, it is important that the endpoint for the Redshift cluster be publicly accessible in order for Braze to connect to your cluster.

- If you don't want your Redshift cluster to be publicly accessible, you can set up a VPC and EC2 instance to use an ssh tunnel to access the Redshift data. For more information, see [AWS: How do I access a private Amazon Redshift cluster from my local machine?](https://repost.aws/knowledge-center/private-redshift-cluster-local-machine)
+ If you don't want your Redshift cluster to be publicly accessible, you can set up a VPC and EC2 instance to use an ssh tunnel to access the Redshift data. For more information, refer to [AWS: How do I access a private Amazon Redshift cluster from my local machine?](https://repost.aws/knowledge-center/private-redshift-cluster-local-machine)

{% subtabs %}
{% subtab United States (US) %}
@@ -241,7 +241,7 @@ You may choose to grant access to all tables in a dataset, or grant privileges o

The `create table` permission is required so Braze can create a table with your CDI Segment query results before updating the segment in Braze. Braze will create a temporary table per segment, and the table will only persist while Braze is updating the segment.

- After creating the service account and granting permissions, generate a JSON key. For more information, view [Google Cloud: Create and delete service account keys](https://cloud.google.com/iam/docs/keys-create-delete). You'll upload this to the Braze dashboard later.
+ After creating the service account and granting permissions, generate a JSON key. For more information, refer to [Google Cloud: Create and delete service account keys](https://cloud.google.com/iam/docs/keys-create-delete). You'll upload this to the Braze dashboard later.
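
As a sketch, dataset-level privileges can be granted to the service account with BigQuery SQL. The project, dataset, and service account names below are illustrative:

```sql
-- Illustrative names only: roles/bigquery.dataEditor on the dataset
-- covers reading source tables and creating the temporary per-segment
-- result tables described above.
GRANT `roles/bigquery.dataEditor`
ON SCHEMA `my-project.braze_cdi`
TO "serviceAccount:braze-cdi@my-project.iam.gserviceaccount.com";
```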

#### Step 2.2: Allow access to Braze IPs

@@ -335,29 +335,29 @@ Braze will connect to your Fabric warehouse using a service principal with Entra
* Principal ID (also called application ID) for the service principal
* Client secret for Braze to authenticate

- 1. In the Azure portal, navigate to Microsoft Entra admin center, and then App Registrations
+ 1. In the Azure portal, navigate to the Microsoft Entra admin center, and then **App Registrations**.
2. Select **+ New registration** under **Identity > Applications > App registrations**
3. Enter a name, and select `Accounts in this organizational directory only` as the supported account type. Then, select **Register**.
4. Select the application (service principal) you just created, then navigate to **Certificates & secrets > + New client secret**
- 5. Enter a description for the secret, and set an expiry period for the secret. Then, click add.
+ 5. Enter a description for the secret, and set an expiry period for the secret. Then, select **Add**.
6. Note the client secret created to use in the Braze setup.

{% alert note %}
- Azure does not allow unlimited expiry on service principal secrets. Remember to refresh the credentials before they expire in order to maintain the flow of data to Braze.
+ Azure doesn't allow unlimited expiry on service principal secrets. Remember to refresh the credentials before they expire in order to maintain the flow of data to Braze.
{% endalert %}

#### Step 2.2: Grant access to Fabric resources
- You will provide access for Braze to connect to your Fabric instance. In your Fabric admin portal, navigate to **Settings > Governance and insights > Admin portal > Tenant settings**.
+ You will provide access for Braze to connect to your Fabric instance. In your Fabric admin portal, navigate to **Settings** > **Governance and insights** > **Admin portal** > **Tenant settings**.

* In **Developer settings** enable "Service principals can use Fabric APIs" so Braze can connect using Microsoft Entra ID.
* In **OneLake settings** enable "Users can access data stored in OneLake with apps external to Fabric" so that the service principal can access data from an external app.

#### Step 2.3: Get warehouse connection string

You will need the SQL endpoint for your warehouse in order for Braze to connect. To retrieve the SQL endpoint, go to the **workspace** in Fabric, and in the list of items, hover over the warehouse name and select **Copy SQL connection string**.

![The "Fabric Console" page in Microsoft Azure, where users should retrieve the SQL Connection String.]({% image_buster /assets/img/cloud_ingestion/fabric_1.png %})


#### Step 2.4: Allow Braze IPs in Firewall (Optional)

Depending on the configuration of your Microsoft Fabric account, you may need to allow the following IP addresses in your firewall to allow traffic from Braze. For more information on enabling this, see the relevant documentation on [Entra Conditional Access](https://learn.microsoft.com/en-us/fabric/security/protect-inbound-traffic#entra-conditional-access).
@@ -602,7 +602,7 @@ You may set up multiple sources with Braze, but each source should be configured

## Using the connected source

- After the source is created, it can be used to create one or more CDI segments. For more information on creating a segment with this source, see the [CDI Segments documentation]({{site.baseurl}}/user_guide/engagement_tools/segments/segment_extension/cdi_segments/).
+ After the source is created, you can use it to create one or more CDI segments. For more information on creating a segment with this source, refer to the [CDI Segments documentation]({{site.baseurl}}/user_guide/engagement_tools/segments/segment_extension/cdi_segments/).

{% alert note %}
If queries are consistently timing out and you have set a maximum runtime of 60 minutes, consider trying to optimize your query execution time or dedicating more compute resources (such as a larger warehouse) to the Braze user.
@@ -3,23 +3,25 @@ nav_title: Delete Users with CDI
article_title: Delete Users with Cloud Data Ingestion
page_order: 30
page_type: reference
description: "This reference article provides an overview of the process for deleting users with Cloud Data Ingestion."
description: "This pgae provides an overview of the process for deleting users with Cloud Data Ingestion."

---

# Delete users with Cloud Data Ingestion

+ > This page discusses the process for deleting users with Cloud Data Ingestion.
User delete syncs are supported for all available Cloud Data Ingestion data sources.

- ## Integration configuration
+ ## Configuring the integration

- Follow the standard process to [create a new integration in the Braze dashboard]({{site.baseurl}}/user_guide/data_and_analytics/cloud_ingestion/integrations/#step-1-set-up-tables-or-views) for the data warehouse you want to connect to. Ensure that you include a role that can access the delete table. On the **Create import sync** page, set the **Data Type** to **Delete Users**. This will ensure the proper actions are taken during the integration run to delete users.
+ Follow the standard process to [create a new integration in the Braze dashboard]({{site.baseurl}}/user_guide/data_and_analytics/cloud_ingestion/integrations/#step-1-set-up-tables-or-views) for the data warehouse you want to connect to. Ensure that you include a role that can access the delete table. On the **Create import sync** page, set the **Data Type** to **Delete Users** so that the proper actions are taken during the integration run to delete users.

![]({% image_buster /assets/img/cloud_ingestion/deletion_1.png %})

- ## Source data configuration
+ ## Configuring source data

- Source tables for user deletes should include one or more user identifier types and an `UPDATED_AT` timestamp. Payload columns are not supported for user delete data.
+ Source tables for user deletes should include one or more user identifier types and an `UPDATED_AT` timestamp. Payload columns aren't supported for user delete data.
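
For example, a minimal delete-sync source table might look like the following sketch. The schema, table, and identifier columns are illustrative; include whichever identifier types your workspace uses:

```sql
-- Illustrative schema: one or more user identifier columns plus the
-- UPDATED_AT timestamp used to pick up rows for each sync.
CREATE TABLE braze_cdi.users_to_delete (
    external_id VARCHAR,
    updated_at  TIMESTAMP NOT NULL
);
```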

### `UPDATED_AT`

16 changes: 8 additions & 8 deletions _docs/_user_guide/data_and_analytics/cloud_ingestion/faqs.md
@@ -3,29 +3,29 @@ nav_title: FAQs
article_title: Cloud Data Ingestion FAQs
page_order: 100
page_type: FAQ
description: "This article answers frequently asked questions about Cloud Data Ingestion."
description: "This page answers frequently asked questions about Cloud Data Ingestion."
toc_headers: h2
---

# Frequently Asked Questions (FAQ)

- > These are the answers to some frequently asked questions for Cloud Data Ingestion.
+ > This page contains answers to some frequently asked questions for Cloud Data Ingestion.
## Why was I emailed: "Error in CDI Sync"?

This type of email usually means there's an issue with your CDI setup. Here are some common issues and how to fix them:

### CDI can't access the data warehouse or table using your credentials

- This could mean the credentials in CDI are incorrect or are misconfigured on the data warehouse. For more information, see [Data Warehouse Integrations]({{site.baseurl}}/user_guide/data_and_analytics/cloud_ingestion/integrations/).
+ This could mean the credentials in CDI are incorrect or are misconfigured on the data warehouse. For more information, refer to [Data Warehouse Integrations]({{site.baseurl}}/user_guide/data_and_analytics/cloud_ingestion/integrations/).

### The table cannot be found

Try updating your integration with the correct database configuration or create matching resources on the data warehouse, such as `database/table`.

### The catalog cannot be found

- The catalog set up in the integration does not exist in the Braze catalog. A catalog can be removed after the integration was set up. To resolve the issue, either update the integration to use a different catalog or create a new catalog that matches the catalog name in the integration.
+ The catalog set up in the integration doesn't exist in the Braze catalog. A catalog can be removed after the integration was set up. To resolve the issue, either update the integration to use a different catalog or create a new catalog that matches the catalog name in the integration.

## Why was I emailed: "Row errors in your CDI sync"?

@@ -41,11 +41,11 @@ Test Connection is running on your data warehouse, so increasing warehouse capac

### Error connecting to Snowflake instance: Incoming request with IP is not allowed to access Snowflake

- Try adding the official Braze IPs to your IP allowlist. For more information, see [Data Warehouse Integrations]({{site.baseurl}}/user_guide/data_and_analytics/cloud_ingestion/integrations/).
+ Try adding the official Braze IPs to your IP allowlist. For more information, refer to [Data Warehouse Integrations]({{site.baseurl}}/user_guide/data_and_analytics/cloud_ingestion/integrations/).

### Error executing SQL due to customer config: 002003 (42S02): SQL compilation error: does not exist or not authorized

- If the table does not exist, create the table. If the table does exist, verify that the user and role have permissions to read from the table.
+ If the table doesn't exist, create the table. If the table does exist, verify that the user and role have permissions to read from the table.
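
For example, read access can be granted with statements along these lines (database, schema, table, and role names are placeholders for the ones your integration uses):

```sql
-- Placeholder names: the role used by the Braze CDI user needs usage
-- on the database and schema, and select on the source table.
GRANT USAGE ON DATABASE my_database TO ROLE braze_cdi_role;
GRANT USAGE ON SCHEMA my_database.my_schema TO ROLE braze_cdi_role;
GRANT SELECT ON TABLE my_database.my_schema.my_table TO ROLE braze_cdi_role;
```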

### Could not use schema

@@ -61,7 +61,7 @@ If you receive this error, allow that user access to your Snowflake account.

### Error connecting to Snowflake instance with current and old key

- If you receive this error, make sure the user is using the current public key as seen in your braze dashboard.
+ If you receive this error, make sure the user is using the current public key as displayed in your Braze dashboard.
{% endtab %}

{% tab Redshift %}
@@ -96,7 +96,7 @@ Test Connection is running on your data warehouse, so increasing warehouse capac

### User does not have permission to query table

- If you receive this error, add User permissions to query the table.
+ If you receive this error, add user permissions to query the table.

### Your usage exceeded the custom quota

@@ -1,15 +1,15 @@
---
nav_title: File Storage Integrations
article_title: File Storage Integrations
description: "This reference article covers Braze Cloud Data Ingestion and how to sync relevant data from S3 to Braze"
description: "This page covers Braze Cloud Data Ingestion and how to sync relevant data from S3 to Braze"
page_order: 3
page_type: reference

---

# File storage integrations

- > This article covers how to set up Cloud Data Ingestion support and sync relevant data from S3 to Braze.
+ > This page covers how to set up Cloud Data Ingestion support and sync relevant data from S3 to Braze.
You can use Cloud Data Ingestion (CDI) for S3 to directly integrate one or more S3 buckets in your AWS account with Braze. When new files are published to S3, a message is posted to SQS, and Braze Cloud Data Ingestion takes in those new files.

@@ -23,7 +23,7 @@ The integration requires the following resources:

## AWS definitions

- First, let's just define some of the terms used during this task.
+ First, let's define some of the terms used during this task.

| Word | Definition |
| --- | --- |
@@ -55,7 +55,7 @@ Create an SQS queue to track when objects are added to the bucket you’ve creat
Be sure to create this SQS in the same region you created the bucket in.
{% endalert %}

- Be sure to take note of the ARN and the URL of the SQS as you’ll be using it frequently during this configuration.
+ Make sure you take note of the ARN and the URL of the SQS as you’ll be using it frequently during this configuration.
<br><br>![]({% image_buster /assets/img/cloud_ingestion/s3_ARN.png %})
<br><br>

@@ -194,7 +194,7 @@ Unlike with data warehouse sources, the `UPDATED_AT` column is not required nor
{% endalert %}

{% alert note %}
- Files added to the S3 source bucket should not exceed 512MB. Files larger than 512MB will result in an error and will not be synced to Braze.
+ Files added to the S3 source bucket should not exceed 512MB. Files larger than 512MB will result in an error and won't be synced to Braze.
{% endalert %}

{% tabs %}
