
Added collection of Managing Dataset samples #6

Merged — 13 commits merged into tswast:bq-snippets on Apr 8, 2019

Conversation

lbristol88

# TODO(developer): Set dataset_id to the ID of the dataset to delete.
# dataset_id = 'your-project.your_dataset'

client.delete_dataset(dataset_id)
Owner

The original sample also had an example showing the delete_contents=True option. I think you could actually do the same, but also show off not_found_ok=True.

client.delete_dataset(dataset_id)

# Comment about deleting a dataset that contains tables.
# Comment about datasets that might not exist.
client.delete_dataset(dataset_id, delete_contents=True, not_found_ok=True)

Author

Ok! Added the parameters and notes about what they do.
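A sketch of what the finished sample might look like, following the function-with-client shape the other samples in this PR use (the exact comment wording is illustrative):

```python
def delete_dataset(client, dataset_id):
    # delete_contents=True also removes any tables the dataset still
    # contains; not_found_ok=True suppresses the error if the dataset
    # has already been deleted.
    client.delete_dataset(dataset_id, delete_contents=True, not_found_ok=True)

    print("Deleted dataset '{}'.".format(dataset_id))
```

In production code, `client` would be a `google.cloud.bigquery.Client`; both keyword arguments are the ones discussed in this thread.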

)

# View dataset properties
print("Dataset ID: {}".format(dataset_id))
Owner

We can delete this line since you're printing the ID in the same sample in the print statement above.

Author

Removed!

create_dataset.create_dataset(client, random_dataset_id)
out, err = capsys.readouterr()
assert "Created dataset {}".format(random_dataset_id) in out

# get dataset
get_dataset.get_dataset(client, random_dataset_id)
Owner

These should all be separate tests. That way we can more easily tell when an individual sample is broken. Models API is an exception because it's so slow to create one.

Author

All separated out into their own tests. Had to include the create_dataset function for some of the tests or else they wouldn't pass (they need a dataset to make changes to!).

bigquery/samples/update_dataset_access.py (resolved)

def update_dataset_default_table_expiration(client, dataset_id):

# [START bigquery_update_dataset_default_table_expiration]
Owner

Needs to be removed from docs/snippets.py.

Owner

Also, please keep the "region tag" the same. That way we can track changes over time better. It was called bigquery_update_dataset_expiration in snippets.

Author

Removed, but forgot to push /facepalm. Updated now!

Updated the region tag as directed.

) # API request

full_dataset_id = "{}.{}".format(dataset.project, dataset.dataset_id)
print("Updated dataset {}".format(full_dataset_id))
Owner

For testing purposes, it would be good to print the new dataset.default_table_expiration_ms also.

Author

Added the expiration to the print statement.
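With the expiration added, the print statement might read as follows (a stand-in object is used here so the snippet runs without a live API call; the attribute names match the BigQuery Dataset properties used elsewhere in this PR):

```python
from types import SimpleNamespace

# Stand-in for the Dataset returned by client.update_dataset(...).
dataset = SimpleNamespace(
    project="my-project",
    dataset_id="my_dataset",
    default_table_expiration_ms=24 * 60 * 60 * 1000,
)

full_dataset_id = "{}.{}".format(dataset.project, dataset.dataset_id)
print(
    "Updated dataset {} with new expiration {}".format(
        full_dataset_id, dataset.default_table_expiration_ms
    )
)
```

Printing the new value lets the test confirm the update actually took effect, not just that the call returned.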

client, random_dataset_id
)
out, err = capsys.readouterr()
assert "Updated dataset {}".format(random_dataset_id) in out
Owner

Please look for the expected expiration str(24 * 60 * 60 * 1000) in the output too. That way we can be more certain that the expiration was actually updated as expected.

Author

Moved the time into its own fixture so it's easier to use in the test.
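The check the reviewer asks for boils down to asserting that the expected millisecond value appears in the captured output; for example (the output string below is illustrative, not taken from a real run):

```python
one_day_ms = 24 * 60 * 60 * 1000  # 86400000, the expected expiration

# Example of the line the sample prints:
out = "Updated dataset my-project.my_dataset with new expiration 86400000"

assert "Updated dataset my-project.my_dataset" in out
assert str(one_day_ms) in out
```

Checking the value as well as the dataset ID catches the case where the update call silently failed to change anything.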

@@ -25,6 +25,12 @@ def client():
return bigquery.Client()


@pytest.fixture
def one_day_ms(client):
Owner

This doesn't need to be a fixture. Just a normal variable/constant is fine. Fixtures are more for setup & clean-up code.

Owner

Actually, now that I see where this is used in the sample code, I'd prefer this code stay in the sample. That way it's clear to someone reading the sample what expected values for expiration look like.

Author

Took this out of the fixture and put it back into the sample and test.

from .. import delete_dataset


def test_delete_dataset(capsys, client, random_dataset_id):
Owner

I recommend you use the dataset_id fixture instead of random_dataset_id, since the dataset_id fixture will create a dataset for you (to then delete in the sample).

Author

This is now updated.
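A sketch of what such a dataset_id fixture's body might look like (in conftest.py it would be decorated with @pytest.fixture; it is shown here as a plain generator, and the naming scheme is an assumption):

```python
import uuid


def dataset_id(client):
    # Create a dataset for the sample under test to operate on. A unique
    # suffix keeps concurrent test runs from colliding.
    dataset = client.create_dataset(
        "python_dataset_sample_{}".format(uuid.uuid4().hex)
    )
    yield "{}.{}".format(dataset.project, dataset.dataset_id)
    # Clean up after the test, even if the sample already deleted the
    # dataset (hence not_found_ok=True).
    client.delete_dataset(dataset, delete_contents=True, not_found_ok=True)
```

Fixtures like this keep setup and teardown out of the individual tests, which is exactly the role the reviewer suggests fixtures are for.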

from .. import get_dataset


def test_get_dataset(capsys, client, random_dataset_id):
Owner

Ditto. I recommend you use the dataset_id fixture instead of random_dataset_id, since the dataset_id fixture will create a dataset for you (to then get in the sample).

Author

This is now updated.

from .. import list_datasets


def test_list_datasets(capsys, client, random_dataset_id):
Owner

The random_dataset_id fixture isn't necessary for this test.

Author

Fixed!

from .. import update_dataset_access


def test_update_dataset_access(capsys, client, random_dataset_id):
Owner

Ditto. I recommend you use the dataset_id fixture instead of random_dataset_id, since the dataset_id fixture will create a dataset for you (to then update in the sample).

Author

Replaced with dataset_id as directed.



def test_update_dataset_default_table_expiration(
capsys, client, random_dataset_id, one_day_ms
Owner

Ditto. I recommend you use the dataset_id fixture instead of random_dataset_id, since the dataset_id fixture will create a dataset for you (to then update in the sample).

one_day_ms should just be a constant in this file.

Author

Updated dataset_id and added the constant for one_day_ms from the fixture.

# dataset_id = 'your-project.your_dataset'

dataset = client.get_dataset(dataset_id)
dataset.default_table_expiration_ms = one_day_ms
Owner

I know it's against DRY principles, but in sample code it's better to show the actual value rather than create a constant. That way people can see typical values and understand that an integer is expected.

Author

Took out the extra variable as directed.
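Putting the thread's feedback together, the sample might end up along these lines (the region tag restored to its original name per the earlier comment, and the expiration value inlined so readers see that an integer in milliseconds is expected; a sketch, not the merged code):

```python
def update_dataset_default_table_expiration(client, dataset_id):
    # [START bigquery_update_dataset_expiration]
    dataset = client.get_dataset(dataset_id)

    # Set the default lifetime of new tables to one day, in milliseconds.
    dataset.default_table_expiration_ms = 24 * 60 * 60 * 1000

    dataset = client.update_dataset(
        dataset, ["default_table_expiration_ms"]
    )  # API request

    full_dataset_id = "{}.{}".format(dataset.project, dataset.dataset_id)
    print(
        "Updated dataset {} with new expiration {}".format(
            full_dataset_id, dataset.default_table_expiration_ms
        )
    )
    # [END bigquery_update_dataset_expiration]
```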

Owner @tswast left a comment

Looks good, thanks!

@tswast tswast merged commit e1bb09c into tswast:bq-snippets Apr 8, 2019
tswast pushed a commit that referenced this pull request Apr 9, 2019
* Added delete dataset function

* Added get dataset function

* Added list dataset function

* Added update dataset description sample

* Added update dataset access sample

* Added update dataset table expiration sample

* Added tests for dataset samples and updated docs

* Removing original update dataset access from snippets file.

* Moved all dataset tests into own file. Made changes based on feedback.

* Made changes based on feedback

* Removed unnecessary use of random_dataset_id in tests and removed one_day_ms fixture

* Removed unnecessary constant

* Stored the math as a constant to make it look cleaner.
tswast pushed a commit that referenced this pull request Apr 10, 2019