Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add docs on normalization naming rules #2576

Merged
merged 5 commits into from
Mar 23, 2021

Conversation

ChristopheDuong
Copy link
Contributor

What

Describe how normalization choose names for nested tables

Copy link
Contributor

@davinchia davinchia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! Feel free to merge once you address my comments!

docs/architecture/basic-normalization.md Outdated Show resolved Hide resolved
docs/architecture/basic-normalization.md Outdated Show resolved Hide resolved
- `cars` for the original parent table
- `cars_da3_cars` for the expanded nested columns following this naming scheme in 3 parts: `<Parent prefix>_<Hash>_<nested column name>`

1. Parent prefix: The entire json path string with '_' characters used as delimiters to reach the parent table name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should json path = json path to the nested column to be clearer?

docs/architecture/basic-normalization.md Outdated Show resolved Hide resolved
docs/architecture/basic-normalization.md Outdated Show resolved Hide resolved
docs/architecture/basic-normalization.md Outdated Show resolved Hide resolved
docs/architecture/basic-normalization.md Outdated Show resolved Hide resolved

Other modern data warehouses have much higher limits in terms of authorized name lengths so this should not be affecting us that often.
However, in the rare cases where these limits should be reached, Basic Normalization will have to resort to fallback rules as follows:
1. No Truncate if under destination's character limits
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we leave this out since it describes a case where it's under the limit?

| Description | Example 1 | Example 2 |
| :--- | :--- | :--- |
| Original Stream Name | `companies` | `deals` |
| Json path to the nested column | `companies/property_engagements_last_meeting_booked_campaign` | `deals/properties/engagements_last_meeting_booked_medium` |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some suggestions:
*might be nice to show the truncated bits

  • I don't fully understand the prefix truncation, maybe have an example?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The examples do include the prefix truncation

I added some colors in the bigquery results (non-truncated) to highlight the differences from the postgres ones (truncated)

ChristopheDuong and others added 3 commits March 23, 2021 14:40
@ChristopheDuong ChristopheDuong merged commit 8013d43 into master Mar 23, 2021
@ChristopheDuong ChristopheDuong deleted the chris/normalization-docs branch March 23, 2021 14:26
davydov-d added a commit that referenced this pull request Nov 8, 2022
davydov-d added a commit that referenced this pull request Nov 10, 2022
davydov-d added a commit that referenced this pull request Nov 10, 2022
davydov-d added a commit that referenced this pull request Nov 10, 2022
davydov-d added a commit that referenced this pull request Nov 10, 2022
davydov-d added a commit that referenced this pull request Nov 10, 2022
* #2576 oncall. SAT: test spec against exposed secrets

* #2576 upd changelog

* #2576 review fixes

* #2576 more test fixes
akashkulk pushed a commit that referenced this pull request Dec 2, 2022
* #2576 oncall. SAT: test spec against exposed secrets

* #2576 upd changelog

* #2576 review fixes

* #2576 more test fixes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants