Document production BigQuery data layout #302

Merged: 7 commits, Aug 23, 2019

jklukas (Contributor) commented Aug 9, 2019

This PR adds a link to the evolving Data Access Continuity Guide and documents the layout and naming of datasets and tables in the reconfigured shared-prod GCP project.

What is written here is not yet strictly true. Merging this PR has to wait on full promotion of the GCP ingestion pipeline to prod and on moving derived tables to shared-prod, both of which will hopefully happen next week.

Viva Doc Days!

jklukas (Contributor Author) commented Aug 19, 2019

Also see draft email to be sent to fx-data-dev

tdsmith (Contributor) left a comment

I had several questions reading the transition docs today, and this page (as amended by this PR) answered them; thanks!

| |`tmp`|Temporary staging area for parquet data loads|
| |`udf` |Persistent user-defined functions defined in SQL|
Contributor

Can this link to https://github.com/mozilla/bigquery-etl/tree/master/udf or a UDF documentation page?

Contributor Author

Will do

| |`tmp`|Temporary staging area for parquet data loads|
| |`udf` |Persistent user-defined functions defined in SQL|
| |`udf_js` |Persistent user-defined functions defined in JavaScript|
Contributor

Can this link to https://github.com/mozilla/bigquery-etl/tree/master/udf_js or a UDF documentation page?

Contributor

I opened #312 for the long term solution but I think these links would be useful in the interim.

jklukas (Contributor Author) commented Aug 20, 2019

@tdsmith Thank you very much for reading and commenting. I've updated to provide additional context on UDFs and to link to the source as you suggested.

src/cookbooks/bigquery.md (outdated, resolved)
src/datasets/ping_intro.md (outdated, resolved)
mreid-moz (Contributor) left a comment

Looks good to me!

jklukas merged commit 86e27c0 into master on Aug 23, 2019
jklukas deleted the bq-data-layout branch on August 23, 2019 at 16:49