Update violates set semantics #2134

Closed
kim opened this issue Jan 17, 2025 · 11 comments · Fixed by #2156

@kim
Contributor

kim commented Jan 17, 2025

Using the update method on a #[primary_key]'ed table inserts a new row even if the row is unchanged.
This prevents the database from restarting from the commitlog:

internal error: failed to get or launch module host: Error inserting row ProductValue { elements: [U64(1), U64(1)] } during transaction 6 playback: TableError: Duplicate insertion of row RowPointer(r: 0, pi: 0, po: 0, so: 1) violates set semantics: Duplicate insertion of row RowPointer(r: 0, pi: 0, po: 0, so: 1) violates set semantics    

To reproduce, create a module like so:

use spacetimedb::{ReducerContext, Table};

#[spacetimedb::table(name = boeuf)]
pub struct Boeuf {
    #[primary_key]
    pub id: u64,
    pub n: u64,
}

#[spacetimedb::reducer]
pub fn add(ctx: &ReducerContext, id: u64, n: u64) {
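    // Upsert by primary key: update the existing row, or insert a new one if absent.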
    match ctx.db.boeuf().id().find(id) {
        Some(row) => ctx.db.boeuf().id().update(Boeuf { n, ..row }),
        None => ctx.db.boeuf().insert(Boeuf { id, n }),
    };
}
  • Publish the module, then call the add reducer twice with the same arguments.
  • Restart the server
  • Access the database, e.g. by running an SQL statement or calling the reducer again
  • Observe the error mentioned above
@gefjon
Contributor

gefjon commented Jan 17, 2025

Reduced repro to:

use spacetimedb::{ReducerContext, Table};

#[spacetimedb::table(name = boeuf)]
pub struct Boeuf {
    #[primary_key]
    pub id: u64,
    pub n: u64,
}

#[spacetimedb::reducer]
pub fn add(ctx: &ReducerContext, id: u64, n: u64) {
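    // Manual delete-then-insert by primary key: effectively what `update` does.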
    ctx.db.boeuf().id().delete(id);
    ctx.db.boeuf().insert(Boeuf { id, n });
}
spacetime start &
spacetime publish kim-test
spacetime call kim-test add 1 1
spacetime call kim-test add 1 1
fg
^C
spacetime start &
spacetime sql kim-test 'select * from boeuf'

Unsurprising, since update is delete followed by insert, but still potentially interesting.

@Shubham8287
Contributor

+1, this is failing the replication cluster restart smoketest.

@kim
Contributor Author

kim commented Jan 20, 2025

I'm marking this P0.

@cloutiertyler
Contributor

@kim could you post the result of your bisect here even if you don't believe it was correct?

@kim
Contributor Author

kim commented Jan 21, 2025

f136670

@kim
Contributor Author

kim commented Jan 22, 2025

The issue is not fully fixed, potentially due to special handling of connect/disconnect transactions.

imho, replaying should only warn, but never error on set semantic violations.

@kim kim reopened this Jan 22, 2025
@kim
Contributor Author

kim commented Jan 22, 2025

Steps to reproduce:

  • run a standalone server
  • publish the quickstart-chat example
  • run the quickstart-chat example and exit it
    This will only create connect/disconnect transactions.
  • stop the server
  • remove the snapshot of the quickstart-chat replica
  • start the server

Observe:

2025-01-22T12:16:13.955109Z  INFO try_init: crates/core/src/db/relational_db.rs:311: [c2004ab29bcd85ef4bb8d9a5f974b40a5261ba4d5b568c5c940c592eaea38a67] DATABASE: durable_tx_offset is Some(3)    
2025-01-22T12:16:13.955199Z  INFO try_init: crates/core/src/db/relational_db.rs:462: [c2004ab29bcd85ef4bb8d9a5f974b40a5261ba4d5b568c5c940c592eaea38a67] DATABASE: no snapshot on disk    
2025-01-22T12:16:13.958685Z  INFO try_init: crates/core/src/db/relational_db.rs:1207: [c2004ab29bcd85ef4bb8d9a5f974b40a5261ba4d5b568c5c940c592eaea38a67] DATABASE: applying transaction history...    
2025-01-22T12:16:13.958800Z  WARN try_init: /tank/src/clockworklabs/spacetimedbprivate/public/crates/commitlog/src/commitlog.rs:649: commitlog offset index is not used at segment 0: Asked key is smaller than the first entry in the index    
2025-01-22T12:16:13.960646Z  INFO try_init: crates/core/src/db/relational_db.rs:1221: [c2004ab29bcd85ef4bb8d9a5f974b40a5261ba4d5b568c5c940c592eaea38a67] Loaded 100% (3/3)    
2025-01-22T12:16:13.960801Z ERROR try_init: crates/core/src/host/host_controller.rs:695: error=Error deleting row ProductValue { elements: [U256(87750066309256334258040029134888392372784772464136912353719315292725071624392), U128(Packed(334798706132543748226464558050192239161))] } during transaction 3 playback
2025-01-22T12:16:13.960865Z ERROR /tank/src/clockworklabs/spacetimedbprivate/public/crates/client-api/src/lib.rs:342: internal error: failed to get or launch module host: Error deleting row ProductValue { elements: [U256(87750066309256334258040029134888392372784772464136912353719315292725071624392), U128(Packed(334798706132543748226464558050192239161))] } during transaction 3 playback: Delete for non-existent row when replaying transaction   

The offending transaction 3 is another identity_disconnected call.

This is explicitly allowed because we cannot guarantee the connect/disconnect pairing transactionally.

Replaying must support this, either by special-casing connect/disconnect, or generally not erroring due to set semantic violations.
I would prefer the latter.

@gefjon
Contributor

gefjon commented Jan 22, 2025

Replaying must support this, either by special-casing connect/disconnect, or generally not erroring due to set semantic violations. I would prefer the latter.

To be clear, what should the behavior be in the case of an insert of an already-present row in a commitlog? Should the second insert:

  1. Be processed, resulting in two copies of the row in the in-memory state?
  2. Be ignored, resulting in a single copy of the row in the in-memory state?

Should we similarly ignore deletes of not-present rows?

I will also point out that, due to our rejecting duplicate rows during commitlog replay, we identified and fixed an actual bug in the datastore which we might otherwise not have noticed. If removing that protection is the only or best option, so be it, but there is a cost to it.

@kim
Contributor Author

kim commented Jan 22, 2025

due to our rejecting duplicate rows during commitlog replay, we identified and fixed an actual bug in the datastore

That’s nice, but it would have resulted in prolonged unavailability had it surfaced in production. Since no automated testing caught it, this could very well have happened.

I would prefer the replay to be as lenient as possible, ignoring duplicate inserts (leaving the first row in memory) and deletes (the row is already gone).

I would also accept a solution which either special-cases connect/disconnect transactions, or revises the implementation and requirements to not require a potentially-duplicate disconnect.
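
A minimal sketch of that lenient behavior, assuming hypothetical ReplayTable / apply_insert / apply_delete helpers and a simplified row type (none of this is the actual datastore API):

use std::collections::HashSet;

/// Stand-in for ProductValue; the real row type is richer.
type Row = Vec<u64>;

#[derive(Default)]
struct ReplayTable {
    rows: HashSet<Row>,
}

impl ReplayTable {
    /// Insert during replay: on a duplicate, keep the first copy and only warn.
    fn apply_insert(&mut self, row: Row) {
        if !self.rows.insert(row.clone()) {
            eprintln!("warning: duplicate insert of {row:?} during replay; ignoring");
        }
    }

    /// Delete during replay: if the row is absent, warn instead of erroring.
    fn apply_delete(&mut self, row: &Row) {
        if !self.rows.remove(row) {
            eprintln!("warning: delete of non-existent row {row:?} during replay; ignoring");
        }
    }
}

fn main() {
    let mut table = ReplayTable::default();
    table.apply_insert(vec![1, 1]);
    table.apply_insert(vec![1, 1]);  // duplicate: warns, a single copy remains
    table.apply_delete(&vec![2, 2]); // missing row: warns, nothing to remove
    assert_eq!(table.rows.len(), 1);
}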

@gefjon
Contributor

gefjon commented Jan 23, 2025

An update on this: there was no duplicate insert; there was a correct and consistent commitlog containing insert/delete pairs, as it should have. We were unable to replay due to a bug in the datastore, described in #2161. Kim's assertion that it is safe to ignore a failed delete because "the row is already gone" did not hold in this case; the row would still be present after the failed delete. If we had been "as lenient as possible," we would have reconstructed a state that was objectively incorrect, in that it would contain rows which, according to the commitlog, should have been deleted. I believe quite strongly that bringing up a database in the wrong state in this way would be worse than erroring loudly.

@gefjon
Contributor

gefjon commented Jan 24, 2025

Closed by #2161 and #2156.

@gefjon gefjon closed this as completed Jan 24, 2025