Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(api): add disconnect method #8341

Merged
merged 4 commits into from
Feb 14, 2024
Merged

Conversation

gforsyth
Copy link
Member

Description of changes

This adds a disconnect method to all backends. Previously we didn't do
this since the actual connection was often wrapped in SQLAlchemy and our
ability to terminate the connection was unclear.

The DB-API spec states that there should be a close method, and that
any subsequent operations on a given Connection should raise an error
after close is called.

This mostly works.

Trino, Clickhouse, Impala, and BigQuery do not conform to the DB-API in
this way. They have the close method but don't raise when you make a
subsequent call.

Spark is not a DB-API, but the spark.sql.session.stop method does the right thing.

For the in-process backends I've chosen to raise, since I don't think
there's a clear meaning on what closing that connection would mean, but
happy to take any suggestions there.

Issues closed

Resolves #5940

@gforsyth gforsyth added feature Features or general enhancements backends Issues related to all backends ci-run-cloud Add this label to trigger a run of BigQuery, Snowflake, and Databricks backends in CI labels Feb 13, 2024
@ibis-docs-bot ibis-docs-bot bot removed the ci-run-cloud Add this label to trigger a run of BigQuery, Snowflake, and Databricks backends in CI label Feb 13, 2024
@jcrist
Copy link
Member

jcrist commented Feb 13, 2024

For the in-process backends I've chosen to raise

Presuming by "in-process" you mean pandas/polars here and not duckdb/sqlite, I think the proper behavior here is a no-op rather than a raise. This lets users write more backend-agnostic code. We might also clear the stored tables mapping that the pandas/polars backends have to release memory, but 🤷 if that's worth it.

@gforsyth gforsyth force-pushed the disconnect branch 2 times, most recently from da85415 to c48d719 Compare February 13, 2024 22:21
Comment on lines 1491 to 1504
with pytest.raises(
(
DuckDBConnectionException,
ExaRuntimeError,
MySQLInterfaceError,
OracleInterfaceError,
PsycoPg2InterfaceError,
Py4JJavaError,
PyDruidError,
PyODBCProgrammingError,
SnowflakeDatabaseError,
sqlite3ProgrammingError,
)
):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know pytest.raises(Exception) is considered evil, but writing all of these out is running afoul of our (good) test environment isolation. Hmmmmm

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using pytest.raises(Exception) seems fine to me honestly. We only care that things don't work, not necessarily why.

This adds a `disconnect` method to all backends.  Previously we didn't do
this since the actual connection was often wrapped in SQLAlchemy and our
ability to terminate the connection was unclear.

The DB-API spec states that there should be a `close` method, and that
any subsequent operations on a given `Connection` should raise an error
after `close` is called.

This _mostly_ works.

Trino, Clickhouse, Impala, and BigQuery do not conform to the DB-API in
this way.  They have the `close` method but don't raise when you make a
subsequent call.

For the in-process backends I've chosen to raise, since I don't think
there's a clear meaning on what closing that connection would mean, but
happy to take any suggestions there.
Copy link
Member

@cpcloud cpcloud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

We may want to follow up with the non-SQL backends to clear their table mappings for consistency with the notion that disconnecting cleans up ephemeral resources created during the session. That said, we can probably wait until someone asks for that!

@gforsyth gforsyth enabled auto-merge (squash) February 14, 2024 15:42
@gforsyth gforsyth merged commit 32665af into ibis-project:main Feb 14, 2024
74 checks passed
@gforsyth gforsyth deleted the disconnect branch February 15, 2024 14:14
ncclementi pushed a commit to ncclementi/ibis that referenced this pull request Feb 21, 2024
## Description of changes

This adds a `disconnect` method to all backends. Previously we didn't do
this since the actual connection was often wrapped in SQLAlchemy and our
ability to terminate the connection was unclear.

The DB-API spec states that there should be a `close` method, and that
any subsequent operations on a given `Connection` should raise an error
after `close` is called.

This _mostly_ works.

Trino, Clickhouse, Impala, and BigQuery do not conform to the DB-API in
this way.  They have the `close` method but don't raise when you make a
subsequent call.

Spark is not a DB-API, but the `spark.sql.session.stop` method does the
right thing.

For the in-memory backends and Flink, this is a no-op as there is either 
nothing to disconnect or no exposed method to disconnect.



## Issues closed

Resolves ibis-project#5940
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backends Issues related to all backends feature Features or general enhancements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feat: implement disconnect method for backend connections
3 participants