-
Notifications
You must be signed in to change notification settings - Fork 608
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(mssql): support case-sensitive collations #9700
Conversation
Brain is hurting at the prospect of debugging why the 3.10 job passed but not the 3.12 one 🧠 |
Well, locally 3.12.4 (the same version as in CI) is all green for me with this PR 🤔 |
Deleting my sqlserver image, because ms is known to alter the image under that tag without notice. |
Re-running CI, to see if that was a fluke run. Not sure if a successful run is more concerning, but we'll have some additional information! |
I'm also having some weird behavior locally. Initially things were green on my local, now they're not. I might add in a few checks to ensure that the collation is being respected |
well... it succeeded this time... |
I'm marking this as a draft until I can figure out how to make our testing setup behave deterministically w.r.t. collation |
Wow I hate everything. We use a docker health-check in the mssql doesn't set collation until it's done restoring/creating various backups, so if the health check fires before the collation is set, then it gets set to the default collation, and you can't change that collation without removing the volume. |
Thanks for tracking that down. @ncclementi and I observed this behavior but struggled to get to the bottom of it in a reasonable amount of time. What is the MS recommended sequence of steps for creating a database with a specific collation such that it isn't defeated by some other MS SQL process? |
I really don't know. Right now my only plan is to delay the health check... |
This is a semi-brittle fix to the collation race condition, but it seems to work consistently on my machine and CI? |
The default collation in MSSQL is SQL_Latin1_General_CP1_CI_AS where the CI stands for Case Insensitive. We use a case-sensitive collation for testing so that we don't break users with case-sensitive collations.
This updates a few internal table and view calls to use the case-sensitive names which will allows us to (hopefully) support both case-sensitive and case-insensitive collations. The correct casing for the tables and views we use often (and the corresponding columns): Info schema tables: - INFORMATION_SCHEMA.COLUMNS - INFORMATION_SCHEMA.SCHEMATA - INFORMATION_SCHEMA.TABLES Temp table location: tempdb.dbo Catalogs: sys.databases Databases: sys.schemas I also had to swap out a stored procedure that _really_ didn't like being used with the case-sensitive collation, but I think the replacement is nicer anyway (we can push down some column selection to the engine).
Yes this is incredibly gross, but even specifying the collation in the `CREATE DATABASE` call doesn't work reliably.
aead83b
to
a4ca276
Compare
I have an idea: let's try to create database as part of the |
Ok, I think I might have something working! |
def _load_data(self, *, database: str = IBIS_TEST_MSSQL_DB, **_): | ||
with self.connection._safe_raw_sql( | ||
"IF DB_ID('ibis_testing') is NULL BEGIN CREATE DATABASE [ibis_testing] END" | ||
): | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I was getting cryptic user errors when I tried this, but I must've made a syntax error somewhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably was the other thing I had to change (if the issue was connecting), database=None
doesn't do what you might expect when connecting.
Locally all passing with
|
@@ -35,11 +36,23 @@ | |||
("DATETIME", dt.Timestamp(scale=3)), | |||
# Characters strings | |||
("CHAR", dt.string), | |||
("TEXT", dt.string), | |||
param( | |||
"TEXT", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know that the TEXT
type doesn't support modern text.
This seems like a bug fix or a feature given that a thing that was previously not working is now working, also 🥳 for that! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, ship it!
This should enable support for users working with Microsoft Fabric Warehouse
Description of changes
This updates a few internal table and view calls to use the
case-sensitive names which will allows us to (hopefully) support both
case-sensitive and case-insensitive collations.
The correct casing for the tables and views we use often (and the
corresponding columns):
Info schema tables:
Temp table location: tempdb.dbo
Catalogs: sys.databases
Databases: sys.schemas
I also had to swap out a stored procedure that really didn't like
being used with the case-sensitive collation, but I think the
replacement is nicer anyway (we can push down some column selection to
the engine).
Issues closed