Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

table.upsert/upsert_all fails to write row when not_null is absent and the schema definition includes not_null #610

Open
alexwlchan opened this issue Jan 9, 2024 · 3 comments

Comments

@alexwlchan
Copy link

alexwlchan commented Jan 9, 2024

I found a bug where calls to upsert() and upsert_all() don't write rows if:

  1. There's a not_null kwarg on the table definition, and
  2. You don't pass not_null in the upsert() call

This doesn't affect insert() or insert_all().

Repro example

from sqlite_utils import Database

db = Database(":memory:")

db["birds"].create(
    {"id": int, "name": str},
    pk="id",
    not_null={"name"},
)

db["birds"].insert({"id": 1, "name": "flamingo"})
print("insert:               ", len(list(db["birds"].rows)))

db["birds"].insert({"id": 2, "name": "goldfinch"}, not_null={"name"})
print("insert, not_null:     ", len(list(db["birds"].rows)))

db["birds"].upsert({"id": 3, "name": "loon"}, pk="id")
print("upsert:               ", len(list(db["birds"].rows)))

db["birds"].upsert({"id": 4, "name": "stork"}, pk="id", not_null={"name"})
print("upsert, not_null:     ", len(list(db["birds"].rows)))

db["birds"].insert_all([{"id": 5, "name": "magpie"}])
print("insert_all:           ", len(list(db["birds"].rows)))

db["birds"].insert_all([{"id": 6, "name": "gull"}], not_null={"name"})
print("insert_all, not_null: ", len(list(db["birds"].rows)))

db["birds"].upsert_all([{"id": 7, "name": "dodo"}], pk="id")
print("upsert_all:           ", len(list(db["birds"].rows)))

db["birds"].upsert_all([{"id": 8, "name": "crow"}], pk="id", not_null={"name"})
print("upsert_all, not_null: ", len(list(db["birds"].rows)))

Here's the output (with my comments on the right)

sqlite-utils, version 3.36
insert:                1
insert, not_null:      2
upsert:                2    <-- no write
upsert, not_null:      3
insert_all:            4
insert_all, not_null:  5
upsert_all:            5    <-- no write
upsert_all, not_null:  6

So calling upsert() and upsert_all() without passing not_null isn't actually inserting a record.

This problem goes away if:

  • I add the not_null argument to the upsert() calls, or
  • Remove the not_null from the table definition

Version info

  • Python: 3.12.0
  • sqlite-utils: 3.36
  • sqlite: 3.39.5 2022-10-14

Related issues

It seems very likely that this is related to #538, which was also about upsert() and the not_null flag.

@alexwlchan
Copy link
Author

I don't know if this is a clue, but while testing this I noticed that insert() seems to be completely ignoring the not_null parameter if it doesn't have to create the table, whereas upsert() doesn't. Here's another example:

from sqlite_utils import Database

db = Database(":memory:")

db["birds"].create(
    {"id": int, "name": str},
    pk="id",
    not_null={"name"},
)

db["birds"].insert({"id": 1, "name": "flamingo"}, not_null={"wingspan"})
print("insert complete!")

db["birds"].upsert({"id": 2, "name": "goldfinch"}, pk="id", not_null={"wingspan"})
print("upsert complete!")

In this case the insert() completes, but the upsert() fails with sqlite3.OperationalError: table birds has no column named wingspan.

The only way to make the insert() fail is to let it auto-create the table:

from sqlite_utils import Database

db = Database(":memory:")

db["birds"].insert({"id": 1, "name": "flamingo"}, not_null={"wingspan"})
print("insert complete!")

db["birds"].upsert({"id": 2, "name": "goldfinch"}, pk="id", not_null={"wingspan"})
print("upsert complete!")

at which point I get an error AssertionError: not_null set {'wingspan'} includes items not in columns {'id', 'name'}.

@alexwlchan
Copy link
Author

A further, potentially interesting observation: this only seems to apply when you're inserting new rows. If your upsert is modifying an existing row, it works without supplying the not_null list.

Another example:

from sqlite_utils import Database

db = Database(":memory:")

db["birds"].create(
    {"id": int, "name": str, "color": str},
    pk="id",
    not_null={"name"},
)

db["birds"].insert({"id": 1, "name": "flamingo"})
print(next(db["birds"].rows))
# initial insert
# {'id': 1, 'name': 'flamingo', 'color': None}

db["birds"].upsert({"id": 1, "name": "goldfinch"}, pk="id")
print(next(db["birds"].rows))
# modifying the existing row
# {'id': 1, 'name': 'goldfinch', 'color': None}

db["birds"].upsert({"id": 1, "color": "blue"}, pk="id")
print(next(db["birds"].rows))
# modifying a column which is allowed to be non-null
# {'id': 1, 'name': 'goldfinch', 'color': 'blue'}

(That second case is how I spotted it – I was upserting into a table with not-null columns, but modifying a nullable column on an existing row.)

@petergaultney
Copy link

i can confirm all of the above.

upserts do not work on tables where there are not-null columns, but only the inserts do not work - updates work fine.

i am not sure if there would be a better fix, but one possible fix would be to query sqlite_master if not_null is DEFAULT, and if the table already exists, use its own reckoning of which columns are not_null. I am going to implement this fix in our fork.

One other thing worth noting - the documentation asserts that the pk argument is not necessary for upserts if you are certain that the table has already been created. However, that is not the case - upsert will raise an exception if you do not provide pk. My proposed fix of using sqlite_master to get the actual 'values' for not_null would also work as a way of filling in pk for existing tables, to match the documentation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants