BigQuery: resource leaks in client and %%bigquery magics #9790

Closed
tswast opened this issue Nov 13, 2019 · 2 comments · Fixed by #9894
Labels: api: bigquery, priority: p1, type: bug

Comments

tswast (Contributor) commented Nov 13, 2019

As pointed out in #9457, creating a GAPIC client and letting it be garbage collected without closing its transport channel leaks sockets / file descriptors.

Steps to reproduce

  1. Start a Jupyter Notebook (or launch the code example with ipython).
  2. Load the magics with %load_ext google.cloud.bigquery
  3. Run a %%bigquery magic command.
  4. Observe with psutil that open connections are not closed.

Code example

Notebook as Markdown:

```python
import psutil

from google.cloud import bigquery

current_process = psutil.Process()
num_conns = len(current_process.connections())
print("connections before loading magics: {}".format(num_conns))
```

```
connections before loading magics: 12
```

```python
%load_ext google.cloud.bigquery

num_conns = len(current_process.connections())
print("connections after loading magics: {}".format(num_conns))
```

```
connections after loading magics: 12
```

```python
%%bigquery --use_bqstorage_api
SELECT
    source_year AS year,
    COUNT(is_male) AS birth_count
FROM `bigquery-public-data.samples.natality`
GROUP BY year
ORDER BY year DESC
LIMIT 15
```

```
    year  birth_count
0   2008      4255156
1   2007      4324008
2   2006      4273225
3   2005      4145619
4   2004      4118907
5   2003      4096092
6   2002      4027376
7   2001      4031531
8   2000      4063823
9   1999      3963465
10  1998      3945192
11  1997      3884329
12  1996      3894874
13  1995      3903012
14  1994      3956925
```

```python
num_conns = len(current_process.connections())
print("connections after running magics: {}".format(num_conns))
```

```
connections after running magics: 16
```

```python
%%bigquery --use_bqstorage_api
SELECT
    source_year AS year,
    COUNT(is_male) AS birth_count
FROM `bigquery-public-data.samples.natality`
GROUP BY year
ORDER BY year DESC
LIMIT 15
```

(The same 15-row result table is returned.)

```python
num_conns = len(current_process.connections())
print("connections after running magics: {}".format(num_conns))
```

```
connections after running magics: 20
```

Full example:

```python
import psutil

from google.cloud import bigquery

current_process = psutil.Process()
num_conns = len(current_process.connections())
print("connections before loading magics: {}".format(num_conns))

get_ipython().run_line_magic('load_ext', 'google.cloud.bigquery')

num_conns = len(current_process.connections())
print("connections after loading magics: {}".format(num_conns))

get_ipython().run_cell_magic('bigquery', '--use_bqstorage_api', 'SELECT\n    source_year AS year,\n    COUNT(is_male) AS birth_count\nFROM `bigquery-public-data.samples.natality`\nGROUP BY year\nORDER BY year DESC\nLIMIT 15')

num_conns = len(current_process.connections())
print("connections after running magics: {}".format(num_conns))

get_ipython().run_cell_magic('bigquery', '--use_bqstorage_api', 'SELECT\n    source_year AS year,\n    COUNT(is_male) AS birth_count\nFROM `bigquery-public-data.samples.natality`\nGROUP BY year\nORDER BY year DESC\nLIMIT 15')

num_conns = len(current_process.connections())
print("connections after running magics: {}".format(num_conns))
```

Stack trace

N/A

Suggested fix

As identified in #9457, we need to close the bqstorage_client.transport.channel, since we create a new BQ Storage client each time.
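
For illustration, a minimal sketch of that cleanup, assuming the v1beta1 GAPIC client that the magics create (the surrounding usage is elided):

```python
from google.cloud import bigquery_storage_v1beta1

bqstorage_client = bigquery_storage_v1beta1.BigQueryStorageClient()
try:
    pass  # ... download query results with the client ...
finally:
    # Close the gRPC channel explicitly so the socket is released
    # instead of leaking when the client is garbage collected.
    bqstorage_client.transport.channel.close()
```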

I suggest we also add psutil as a test-only dependency and verify in a system test of google.cloud.bigquery.magics._cell_magic that there are no additional open connections after running the cell magic.
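
A rough sketch of such a system test (the ipython_interactive fixture is hypothetical, and connection counts can fluctuate, so the real test may need to be more tolerant):

```python
import psutil

def test_bigquery_magic_closes_connections(ipython_interactive):
    # Hypothetical fixture: a live IPython shell in which
    # `%load_ext google.cloud.bigquery` has already been run.
    ip = ipython_interactive
    current_process = psutil.Process()
    conns_before = len(current_process.connections())

    ip.run_cell_magic("bigquery", "", "SELECT 17 AS num")

    # The magic should not leave extra sockets open once it finishes.
    conns_after = len(current_process.connections())
    assert conns_after <= conns_before
```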

tswast added the api: bigquery, priority: p1, and type: bug labels on Nov 13, 2019
tswast (Contributor, Author) commented Nov 13, 2019

I observe 4 leaked connections when the BigQuery Storage API is used, but also 2 leaked connections without it. I verified that no GAPIC client is created in that case, so I'm not sure where those come from.

Edit: It seems that this resource leak also applies to our handwritten BigQuery client.

```python
import psutil
from google.cloud import bigquery

current_process = psutil.Process()

num_conns = len(current_process.connections())
print("connections before creating client {}".format(num_conns))
```

Output:

```
connections before creating client 12
```

```python
client = bigquery.Client()

num_conns = len(current_process.connections())
print("connections after creating client: {}".format(num_conns))
```

Output:

```
connections after creating client: 12
```

```python
table = client.get_table("bigquery-public-data.samples.natality")

num_conns = len(current_process.connections())
print("connections after getting table: {}".format(num_conns))
```

Output:

```
connections after getting table: 14
```

```python
job = client.query(
"""
SELECT
    source_year AS year,
    COUNT(is_male) AS birth_count
FROM `bigquery-public-data.samples.natality`
GROUP BY year
ORDER BY year DESC
LIMIT 15
""")

num_conns = len(current_process.connections())
print("connections after starting query: {}".format(num_conns))
```

Output:

```
connections after starting query: 14
```

```python
row_iter = job.result()

num_conns = len(current_process.connections())
print("connections after waiting for query: {}".format(num_conns))
```

Output:

```
connections after waiting for query: 14
```

```python
rows = list(row_iter)

num_conns = len(current_process.connections())
print("connections after downloading query results: {}".format(num_conns))
```

Output:

```
connections after downloading query results: 14
```

```python
del client

num_conns = len(current_process.connections())
print("connections after deleting client: {}".format(num_conns))
```

Output:

```
connections after deleting client: 14
```

yoshi-automation added and then removed the 🚨 This issue needs some love. label on Nov 18, 2019
plamut (Contributor) commented Nov 25, 2019

For the second case: it seems that one socket is opened by the BigQuery client's internal transport object (client._http), and the other by that transport's own internal google.auth.transport.requests.AuthorizedSession instance, which refreshes the auth token (if necessary) before sending the actual API request.

The following closes both sockets:

```python
client._http._auth_request.session.close()
client._http.close()
```

That is not user-friendly, of course, so we should add a convenience method for it.
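
A minimal sketch of what such a convenience method could look like (the name close() is my assumption here, not a settled API):

```python
from google.cloud.client import ClientWithProject

class Client(ClientWithProject):
    # ... existing client code ...

    def close(self):
        """Close sockets opened by this client's HTTP transport."""
        # Session used by the auth layer for token refreshes.
        self._http._auth_request.session.close()
        # The authorized session used for actual API requests.
        self._http.close()
```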

tswast changed the title from "BigQuery: resource leak when --use_bqstorage_api is used in %%bigquery magics" to "BigQuery: resource leaks in %%bigquery magics" on Nov 26, 2019
plamut changed the title from "BigQuery: resource leaks in %%bigquery magics" to "BigQuery: resource leaks in client and %%bigquery magics" on Nov 26, 2019