Add docs on normalization naming rules #2576
This is great! Feel free to merge once you address my comments!
- `cars` for the original parent table
- `cars_da3_cars` for the expanded nested columns following this naming scheme in 3 parts: `<Parent prefix>_<Hash>_<nested column name>`

1. Parent prefix: The entire json path string with '_' characters used as delimiters to reach the parent table name
Should "json path" read "json path to the nested column" to be clearer?
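To make the three-part scheme concrete, here is a minimal Python sketch of how a name such as `cars_da3_cars` could be assembled. The function name `nested_table_name`, the use of MD5, and the three-character hash length are assumptions for illustration only; the diff does not say how the hash is actually computed, so the digest produced here will not necessarily be `da3`.

```python
import hashlib
from typing import Sequence


def nested_table_name(json_path: Sequence[str], hash_len: int = 3) -> str:
    """Build `<parent prefix>_<hash>_<nested column name>` from a JSON path.

    Hypothetical helper: the hashing details are assumed, not taken from
    the normalization code.
    """
    if len(json_path) < 2:
        # A top-level stream keeps its original name.
        return json_path[0]
    *parents, column = json_path
    # Parent prefix: the path down to the parent table, joined with '_'.
    parent_prefix = "_".join(parents)
    # Assumption: a short hex digest of the full path disambiguates names.
    digest = hashlib.md5("/".join(json_path).encode("utf-8")).hexdigest()
    return f"{parent_prefix}_{digest[:hash_len]}_{column}"


if __name__ == "__main__":
    print(nested_table_name(["cars"]))          # parent table
    print(nested_table_name(["cars", "cars"]))  # nested `cars` column
```

The hash keeps two nested columns with the same leaf name but different paths from colliding once the path is flattened into a single identifier.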
Other modern data warehouses have much higher limits in terms of authorized name lengths so this should not be affecting us that often.
However, in the rare cases where these limits should be reached, Basic Normalization will have to resort to fallback rules as follows:
1. No Truncate if under destination's character limits
Should we leave this out since it describes a case where it's under the limit?
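For reference, the first fallback rule amounts to a simple length check. The sketch below is an illustration only: `fit_table_name` is a hypothetical helper, and the prefix-first truncation order is an assumption, since the quoted snippet shows only rule 1. The 63-character value is the default Postgres identifier limit.

```python
# Default maximum identifier length in Postgres.
POSTGRES_MAX_IDENTIFIER_LENGTH = 63


def fit_table_name(parent_prefix: str, hash_part: str, column: str,
                   max_len: int = POSTGRES_MAX_IDENTIFIER_LENGTH) -> str:
    """Return a table name that fits within `max_len` characters."""
    full = f"{parent_prefix}_{hash_part}_{column}"
    if len(full) <= max_len:
        # Rule 1: under the destination's limit, so no truncation.
        return full
    # Assumed fallback: shorten the parent prefix first, keeping the
    # hash and the nested column name intact.
    suffix = f"_{hash_part}_{column}"
    room = max_len - len(suffix)
    if room > 0:
        return parent_prefix[:room] + suffix
    # Assumed last resort: hard cut of hash plus column name.
    return f"{hash_part}_{column}"[:max_len]


if __name__ == "__main__":
    print(fit_table_name("cars", "da3", "cars"))
    print(fit_table_name("a" * 80, "abc", "col"))
```

Stating rule 1 explicitly, as in the first branch, makes it clear that names are left untouched whenever they already fit.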
| Description | Example 1 | Example 2 |
| :--- | :--- | :--- |
| Original Stream Name | `companies` | `deals` |
| Json path to the nested column | `companies/property_engagements_last_meeting_booked_campaign` | `deals/properties/engagements_last_meeting_booked_medium` |
Some suggestions:
- It might be nice to show the truncated bits.
- I don't fully understand the prefix truncation; maybe add an example?
The examples do include the prefix truncation.
I added some colors to the BigQuery results (non-truncated) to highlight the differences from the Postgres ones (truncated).
Co-authored-by: Davin Chia <davinchia@gmail.com>
What
Describes how normalization chooses names for nested tables.