-
Notifications
You must be signed in to change notification settings - Fork 609
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(duckdb): fix create_table() in databases with spaces in the name #9817
Conversation
7188806
to
630e4c0
Compare
There's always things to fix involving quoting, but I think this should be scoped as per the title, to connecting to database files that have spaces in the name. I don't think the clickhouse change is relevant here. This is going to have to be a backend-specific test, I think, since creating database files is going to be inherently backend specific. |
@gforsyth thanks for the reply. Sorry, I think I need a little more handholding :) Can you point me to which files the test should be added to? I'm thinking something like @pytest.mark.paramaterize(
"database",
[
"with spaces",
("spaced catalog", "spaced database"),
("spaced catalog.spaced database"),
]
)
@pytest.mark.never(
["clickhouse", ], reason="doesn't have single database files"
)
def test_create_table_quoting(conn, database):
t = conn.create_table("foo", {"a": [0,1,2]}, database=database)
result = set(conn.execute(t.foo))
assert result == {1,2,3} ? Your clickhouse comment went over my head, I'm not that familiar with SQL engines so the distinction between a "database file", "catalog", and "database" are fuzzy for me. |
Hey @NickCrews -- sure, it's fuzzy for me, too, because there's limited consensus on what any of these words mean. Here goes:
Now, for the fix you're proposing [ins] In [14]: con = ibis.duckdb.connect("test.db")
[ins] In [15]: con.list_catalogs()
Out[15]: ['system', 'temp', 'test']
[ins] In [16]: con = ibis.duckdb.connect("test.ddb.db")
[ins] In [17]: con.list_catalogs()
Out[17]: ['system', 'temp', 'test'] In some backends (uhh, But you can't do that with DuckDB because you can't create a So you can't use the Instead, I recommend putting a test in So something like @pytest.mark.paramaterize(
"database_file",
[
"with spaces.ddb",
"space catalog.duckdb.db",
]
)
def test_create_table_quoting(database_file):
conn = ibis.duckdb.connect(database_file)
t = conn.create_table("foo", {"a": [0,1,2]})
result = set(conn.execute(t.foo))
assert result == {1,2,3} That is going to leave a file on-disk, though, so you should probably use the pytest And to my point about Clickhouse, this test can only run on DuckDB, because it's the only one that uses the file-name to generate the catalog name. If this is a problem with other backends, there are few-enough backends that allow creating catalogs at all that they are probably better handled individually, with backend-specific tests. |
…ces in the name If you tried ```python import ibis conn = ibis.duckdb.connect("with space.duckdb") conn.create_table("foo", {"a": [0,1,2]}) ``` then you would get the DDL `CREATE TABLE with space.main."foo" ("a" BIGINT)`. Now, the database name is quoted.
630e4c0
to
f1a46d7
Compare
got it, thanks @gforsyth , that makes sense. The duckdb test was quite easy to implement. Ibis's terminology for the namespace hierarchy is unoverloaded and quite clear, I think it is good. I left in the clickhouse implementation fix, even though I didn't add a test for it. Let me know if you want me to remove it (though that seems silly to me, this new version looks strictly better and I highly doubt is hiding anything) Side note: I don't like how I only added this test to duckdb, but I wouldn't want each test_client.py in each backend to implement it separately either. Too easy for the semantics to drift between them. Ideally there was some central test_catalog_can_have_spaces(), where you could see all the backends that we had marked as notimpl, never, etc, and then this dispatched out to individual backends to actually implement the test. |
It has to live in the duckdb-specific test file to test DuckDB -- the others could maybe be handled by creating a catalog in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you tried
then you would get the DDL
CREATE TABLE with space.main."foo" ("a" BIGINT)
. Now, the database name is quoted.I went through the create_table() method in all the backends and looked for where we forgot to use "quoted", but I almost definitely missed some cases.
I need to add a test for this, but I would love your help with this:
con
fixture to sometimes be a database with spaces in the name?