docs(samples): uses function (create_job) more appropriate to the described sample intent #1309
There is a known bug that prevents pre-release tests from completing with versions of grpcio higher than 1.49rc0.
samples/create_job.py
# and to set optional job resource properties, if needed.
# The job instance can be a LoadJob, CopyJob, ExtractJob, QueryJob
# Here, we demonstrate a "query" job.
# Reference: https://googleapis.dev/python/bigquery/latest/generated/google.cloud.bigquery.client.Client.html#google.cloud.bigquery.client.Client.create_job
The long line of text is not going to render well on the Sample Browser, or even on GitHub. I like the link to the documentation, though. Could you perhaps add this link on the https://cloud.google.com/bigquery/docs/samples/bigquery-create-job page instead?
I think I am confused: your link takes us to the page where this sample is displayed.
Is that intentional? That page does not currently provide additional information regarding the four types of jobs available.
Relatedly: in our renderings on the Sample Browser, is there a way to create standard hyperlinks within the code blocks displayed on the screen?
It currently does not, but once your PR updates it the page will also be updated.
I tried sifting through https://googlecloudplatform.github.io/samples-style-guide/ and we currently don't provide any guidance on how hyperlinks should be handled when they are too long. Let me get back to you on this after I ask some folks around.
I looked at the style guide a bit. Thanks for linking to it. Good reminder for me that it exists.
This https://googlecloudplatform.github.io/samples-style-guide/#clients item has a Python snippet with a really long URL in the code sample, just like the one I included.
¯_(ツ)_/¯
We could remove the ending anchor (#google.cloud...) to make the URL a bit smaller.
I recall that we have a g.co/cloud URL shortener for cloud.google.com pages (e.g. cloud.google.com/bigquery becomes g.co/cloud/bigquery, which isn't all that much shorter, but we occasionally used it). I wonder if we could get g.co/bqpython or something similar pointing to the latest API reference?
I submitted a request for a short link:
g.co/bqpython > https://googleapis.dev/python/bigquery/latest/
It has to go through an approval process.
I would suggest that we not hold off on issuing this PR, especially since even the style guide's own code samples include extremely long URLs, as noted in the comment above.
Thanks for looking into the shortlink process! I wasn't aware of such features. Hope it works out :)
I've asked the samples team for guidance. However, it will likely take a long time for us to come up with a feasible solution, and it will likely involve multiple teams. For now, this change adds more benefit than not, so I'm happy to move forward as is.
samples/create_job.py
# Here, we demonstrate a "query" job.
# Reference: https://googleapis.dev/python/bigquery/latest/generated/google.cloud.bigquery.client.Client.html#google.cloud.bigquery.client.Client.create_job
#
# NOTE: the preferred approach is to use one of the dedicated API calls:
Could you elaborate on what the "preferred approach" is preferred for?
If it's for executing the query, it seems counterproductive to have a sample for create_job if this is not the preferred method.
@tswast can you elaborate for us on this comment you made in the issue, as it relates to @dandhlee's question above?
"That section is about any kind of job, not just queries. As such, it should use the create_job method instead of the more specific query method. There should be comments that it is recommended to use the corresponding method for query/copy/load/extract."
One use case where this sample would be preferred over one of the more specific examples is retrying failed jobs.
There are customers who use this method by iterating through a list of recent jobs and retrying any that have failed as a way to make their data pipelines a bit more robust.
One could also use this method as a way to create a job with an experimental API property that hasn't been added to the client library's manually written job configuration classes yet.
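The retry use case described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the jobs are given as dictionaries shaped like the REST Job resource (with `configuration` and `status` fields), the helper names `failed_job_configurations` and `retry_failed_jobs` are hypothetical, and `client` is anything exposing `create_job(job_config=...)`, such as a `google.cloud.bigquery.Client`. How the resource dictionaries are obtained from a list of recent jobs is left out.

```python
from typing import Any, Dict, Iterable, List


def failed_job_configurations(
    job_resources: Iterable[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Collect the `configuration` of every finished job that reported an error.

    Each item is assumed to follow the REST Job resource layout:
    {"configuration": {...}, "status": {"state": ..., "errorResult": {...}}}
    """
    configs = []
    for resource in job_resources:
        status = resource.get("status", {})
        # A job that is DONE and carries an errorResult has failed.
        if status.get("state") == "DONE" and "errorResult" in status:
            configs.append(resource["configuration"])
    return configs


def retry_failed_jobs(client: Any, job_resources: Iterable[Dict[str, Any]]) -> list:
    """Resubmit each failed job's configuration through client.create_job().

    `client` is duck-typed here (hypothetical helper); with the real library
    it would be a google.cloud.bigquery.Client instance.
    """
    return [
        client.create_job(job_config=config)
        for config in failed_job_configurations(job_resources)
    ]
```

Because the configuration is passed through as a plain dictionary, the same helper works for query, load, copy, and extract jobs without branching on the job type, which is the reason `create_job` fits this case better than the dedicated methods.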
See specific comments and questions at various places in the code.
Co-authored-by: Dan Lee <71398022+dandhlee@users.noreply.github.com>
I added some justification for when/why to use create_job. Can you take a look at the changes I made and see if we are closer to the mark? Thanks.
# client.extract_table()
# client.copy_table()
# client.load_table_from_file(), client.load_table_from_dataframe(), etc
job_config={
I think a link to https://cloud.google.com/bigquery/docs/reference/rest/v2/Job would be quite helpful here.
I added the link.
LGTM. Please see Tim's comment!
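For readers following the REST Job link discussed in this thread, here is a minimal sketch of how a `job_config` dictionary for a query job might be assembled. It assumes the dictionary mirrors the REST `JobConfiguration` layout (a `query` section plus optional top-level `labels`). The helper name `build_query_job_config` and its parameters are hypothetical, not part of the client library.

```python
from typing import Any, Dict, Optional


def build_query_job_config(
    sql: str,
    labels: Optional[Dict[str, str]] = None,
    extra_query_properties: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a REST-shaped job configuration for a query job.

    `extra_query_properties` is a pass-through for fields of the REST
    `query` section that the caller wants to set by name, for example a
    property the client library's config classes do not expose yet.
    """
    query_section: Dict[str, Any] = {"query": sql, "useLegacySql": False}
    query_section.update(extra_query_properties or {})

    config: Dict[str, Any] = {"query": query_section}
    if labels:
        # Copy so later changes to the caller's dict do not leak in.
        config["labels"] = dict(labels)
    return config


config = build_query_job_config(
    "SELECT 1", labels={"example-label": "example-value"}
)
# With a real client, the dictionary would then be submitted as:
#     client.create_job(job_config=config)
```

Unless a raw dictionary is actually needed, the dedicated calls listed in the sample's comments remain the approach the sample itself recommends.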
…ple intent (googleapis#1309)
* fix: uses function more appropriate to the described title
* adds additional explanation for the end users
* adds REST API URL for reference
* corrects flake 8 linter errors
* blackens file
* adds type hints
* avoids unreliable version of grpcio
* updates imports to fix linting error
* better method to avoid grpcio 1.49.0rc1
* Update samples/create_job.py
  Co-authored-by: Dan Lee <71398022+dandhlee@users.noreply.github.com>
* adds further explanation on when/why to use create_jobs
* 🦉 Updates from OwlBot post-processor
  See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
* updates references
Co-authored-by: Dan Lee <71398022+dandhlee@users.noreply.github.com>
Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
This is a work in progress.
The previous version simply used the .query() method, but the description and intent were to display the use of the create_job() method. This migrates the code to using the intended method.
Fixes #1085 🦕
BEGIN_COMMIT_OVERRIDE
docs(samples): uses function (create_job) more appropriate to the described sample intent
END_COMMIT_OVERRIDE