Currently, if IMPORT INTO fails, it leaves whatever data was ingested in the table, making it unclear what was and wasn't written. Combined with the fact that the initial version will not allow overwriting existing keys, this makes retrying impossible if anything was written.
On failure, IMPORT INTO should roll back whatever it wrote.
How this will work in practice is a little harder. It looks like we'll need a time-bound clear of all the keys it wrote, which we can get by adding a time-bound option to ClearRange. (Some fancy O(1) metadata update that is then checked during MVCC reads could also work, but it has been judged too complicated to get right when proposed in the past.) We can then use this to roll back: we know that as of time X, the timestamp we'll use on all the keys in the IMPORT INTO SSTs, the table was offline and no other writes could have happened, so a delete of all keys with timestamps >= X will roll back the IMPORT and just the IMPORT.
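To make the time-bound clear concrete, here is a minimal sketch against a toy in-memory MVCC store. It is not the real ClearRange code: the `Store` type, the integer `Timestamp`, and the `ClearRangeSince` name are all hypothetical, made up for illustration.

```go
package main

import "sort"

// Timestamp is a simplified stand-in for an MVCC timestamp (hypothetical).
type Timestamp int64

// version is one MVCC version of a key.
type version struct {
	ts    Timestamp
	value string
}

// Store is a toy in-memory MVCC store mapping each key to its
// versions, kept sorted newest first. It is illustration only.
type Store struct {
	data map[string][]version
}

// NewStore returns an empty toy store.
func NewStore() *Store { return &Store{data: map[string][]version{}} }

// Put writes a version of key at timestamp ts.
func (s *Store) Put(key string, ts Timestamp, value string) {
	vs := append(s.data[key], version{ts, value})
	sort.Slice(vs, func(i, j int) bool { return vs[i].ts > vs[j].ts })
	s.data[key] = vs
}

// Get returns the newest version of key, if any.
func (s *Store) Get(key string) (string, bool) {
	vs := s.data[key]
	if len(vs) == 0 {
		return "", false
	}
	return vs[0].value, true
}

// ClearRangeSince removes every version with ts >= since for keys in
// [start, end) and returns how many versions it removed. Versions
// older than since are left alone, so data that predates the import
// survives the rollback.
func (s *Store) ClearRangeSince(start, end string, since Timestamp) int {
	removed := 0
	for key, vs := range s.data {
		if key < start || key >= end {
			continue
		}
		kept := vs[:0] // filter in place
		for _, v := range vs {
			if v.ts < since {
				kept = append(kept, v)
			} else {
				removed++
			}
		}
		if len(kept) == 0 {
			delete(s.data, key) // deleting during range is safe in Go
		} else {
			s.data[key] = kept
		}
	}
	return removed
}
```

If the import wrote all of its keys at X = 10 into the span `["t/", "t0")`, then `s.ClearRangeSince("t/", "t0", 10)` drops exactly those versions and leaves older rows and other tables alone. Note that this still scans every key in the span, unlike the O(1) metadata approach mentioned above.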
If or when IMPORT INTO is allowed to overwrite keys, we'll additionally need to prevent GC of the overwritten key until we're sure we will not roll back, because if we shadowed e.g. a year-old key, it could be GC'ed more or less immediately, making it impossible to roll back to it by deleting the key that shadowed it. To do this we'll likely want to add some flag to the ingestion of such "provisional" data that prevents GC on that range until a later RPC says that we will not try to roll back. For now though, with the initial version disallowing any key overwrites, this is a non-issue.
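The "provisional" flag idea can be sketched the same way. This is a toy model under loud assumptions: the `Range` type, its `provisional` field, and the `Ingest`/`Commit`/`Rollback`/`GC` methods are hypothetical, and the GC rule is simplified to "a shadowed version is collectable once its own timestamp is below the threshold".

```go
package main

// Timestamp is a simplified stand-in for an MVCC timestamp (hypothetical).
type Timestamp int64

type version struct {
	ts    Timestamp
	value string
}

// Range is a toy key range holding MVCC versions, newest first,
// plus a flag that blocks GC while provisional data is present.
type Range struct {
	versions    map[string][]version
	provisional bool
}

// NewRange returns an empty toy range.
func NewRange() *Range { return &Range{versions: map[string][]version{}} }

// Ingest adds a version of key. It assumes ts is newer than any
// existing version of that key. Provisional ingestion marks the
// whole range as protected from GC.
func (r *Range) Ingest(key string, ts Timestamp, value string, provisional bool) {
	r.versions[key] = append([]version{{ts, value}}, r.versions[key]...)
	if provisional {
		r.provisional = true
	}
}

// Get returns the newest version of key, if any.
func (r *Range) Get(key string) (string, bool) {
	vs := r.versions[key]
	if len(vs) == 0 {
		return "", false
	}
	return vs[0].value, true
}

// GC removes shadowed versions older than threshold and returns how
// many it removed. It does nothing while the range is provisional.
func (r *Range) GC(threshold Timestamp) int {
	if r.provisional {
		return 0
	}
	removed := 0
	for key, vs := range r.versions {
		kept := vs[:1] // the newest version always survives
		for _, v := range vs[1:] {
			if v.ts >= threshold {
				kept = append(kept, v)
			} else {
				removed++
			}
		}
		r.versions[key] = kept
	}
	return removed
}

// Commit models the later RPC saying we will not roll back.
func (r *Range) Commit() { r.provisional = false }

// Rollback deletes all versions with ts >= since, then lifts the
// GC protection.
func (r *Range) Rollback(since Timestamp) {
	for key, vs := range r.versions {
		kept := vs[:0]
		for _, v := range vs {
			if v.ts < since {
				kept = append(kept, v)
			}
		}
		if len(kept) == 0 {
			delete(r.versions, key)
		} else {
			r.versions[key] = kept
		}
	}
	r.provisional = false
}
```

With the flag set, a GC pass between the ingestion and the rollback leaves the shadowed year-old value in place, so `Rollback` can expose it again. After `Commit`, the same GC pass is free to collect it.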
(part of the larger effort documented in #26834)