CI: Fix Flakey GBQ Tests #30630

Merged: 14 commits, Jan 3, 2020


alimcmaster1
Member

alimcmaster1 commented Jan 2, 2020

Ref #30478 (comment)
We see the below in the logs:

google.api_core.exceptions.Conflict: 409 POST https://bigquery.googleapis.com/bigquery/v2/projects/pandas-travis/datasets: Already Exists: Dataset pandas-travis:pydata_pandas_bq_testing_py31

https://travis-ci.org/pandas-dev/pandas/jobs/631599036

This happens despite the attempt to delete the dataset on the preceding line:
self.client.delete_dataset(self.dataset, delete_contents=True)

Since we run with --dist=loadfile, pytest sends all tests in a file to the same worker, so these tests run sequentially within a build. But they could potentially clash across builds?
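To make the suspected cross-build clash concrete, here is a toy model of one possible interleaving. Everything in it is hypothetical: `FakeProject` and the local `Conflict` class are in-memory stand-ins, not the BigQuery client, and the thread does not confirm that this is the actual cause of the 409.

```python
class Conflict(Exception):
    """Stand-in for google.api_core.exceptions.Conflict (HTTP 409)."""


class FakeProject:
    """Hypothetical in-memory model of one project shared by all builds."""

    def __init__(self):
        self.datasets = set()

    def create_dataset(self, dataset_id):
        # Creating a dataset that already exists is rejected.
        if dataset_id in self.datasets:
            raise Conflict(f"409 Already Exists: Dataset {dataset_id}")
        self.datasets.add(dataset_id)

    def delete_dataset(self, dataset_id):
        # Deleting a missing dataset is treated as a no-op in this model.
        self.datasets.discard(dataset_id)


project = FakeProject()
shared_name = "pydata_pandas_bq_testing_py31"  # same name in every build

# Two builds each run "delete, then create", but their steps interleave.
project.delete_dataset(shared_name)  # build A deletes
project.delete_dataset(shared_name)  # build B deletes
project.create_dataset(shared_name)  # build A creates
try:
    project.create_dataset(shared_name)  # build B creates and is rejected
    clashed = False
except Conflict:
    clashed = True
```

In this model the delete on the preceding line does not protect build B, because build A recreates the dataset between B's delete and B's create.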

We now create a unique dataset name per test function and tear the dataset down when the test completes.
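A minimal sketch of how a unique dataset name could be generated. The helper name `unique_dataset_id` and the prefix are illustrative assumptions, not the code merged in this PR.

```python
import uuid


def unique_dataset_id(prefix="pydata_pandas_bq_testing"):
    """Return a dataset ID that is very unlikely to repeat across tests or builds.

    The hex form of a random UUID contains only digits and lowercase
    letters, so the result stays within letters, digits and underscores.
    """
    return f"{prefix}_{uuid.uuid4().hex}"
```

Because each call draws a fresh random suffix, two builds running at the same time would not be expected to pick the same name.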

The GBQ tests will run against my fork; I will post the results here.

cc. @jreback, @tswast

@WillAyd
Member

WillAyd commented Jan 2, 2020

Hmm, wouldn't this keep the dataset lingering around? Perhaps an alternative is to split this into two fixtures and scope the dataset fixture at, say, the module level or even higher.

@WillAyd added the IO Google and Unreliable Test (Unit tests that occasionally fail) labels Jan 2, 2020
@alimcmaster1
Member Author

> Hmm, wouldn't this keep the dataset lingering around? Perhaps an alternative is to split this into two fixtures and scope the dataset fixture at, say, the module level or even higher.

Sure. If we don't want to keep the dataset, an easy way is to give it a random name and create and delete it within this fixture. I've pushed the update.
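A rough sketch of the create-then-delete lifecycle described here, written as a context manager so the cleanup runs even if a test fails. The name `temporary_dataset` and the prefix are hypothetical; the `client.dataset(...)` and `delete_dataset(..., delete_contents=True)` calls mirror the snippets quoted in this thread, and newer versions of the BigQuery client may prefer a different way to build the dataset reference.

```python
import uuid
from contextlib import contextmanager


@contextmanager
def temporary_dataset(client, prefix="pydata_pandas_bq_testing"):
    """Create a randomly named dataset, yield its ID, then delete it.

    `client` is assumed to behave like a google.cloud.bigquery.Client.
    """
    dataset_id = f"{prefix}_{uuid.uuid4().hex}"  # random, per-use name
    dataset = client.dataset(dataset_id)
    client.create_dataset(dataset)
    try:
        yield dataset_id
    finally:
        # Runs on success and on failure, so nothing lingers afterwards.
        client.delete_dataset(dataset, delete_contents=True)
```

A pytest fixture could wrap the same steps by yielding between the create and the delete, which keeps setup and teardown in one place.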

@jreback added this to the 1.0 milestone Jan 3, 2020

self.client = _get_client()
self.dataset = self.client.dataset(dataset_id)
try:
Contributor

hmm, are we supposed to clean up these datasets? @tswast

Member Author

We now just do this here:

self.client.delete_dataset(self.dataset, delete_contents=True)

@jreback
Contributor

jreback commented Jan 3, 2020

lgtm. very minor comment. ping on green.

@alimcmaster1
Member Author

> nitpick: can you make this a module-level function?

Sure, done.

Link to my Travis fork, where these tests have passed (they otherwise only run on our master branch):

@jreback
Contributor

jreback commented Jan 3, 2020

great thanks @alimcmaster1

@alimcmaster1
Member Author

alimcmaster1 commented Jan 3, 2020

Let me know if you see any more issues - thanks!

@jreback merged commit 8105a7e into pandas-dev:master Jan 3, 2020
@jreback
Contributor

jreback commented Jan 3, 2020

thanks @alimcmaster1
