Better idempotency definition
oggy-dfin committed Sep 19, 2024
1 parent 451626d commit 1f14dc6
Showing 1 changed file with 6 additions and 4 deletions.
@@ -22,7 +22,7 @@ Thus, it is important to design and/or use canister APIs such that it is possibl

## Idempotent canister APIs

-We say that a canister endpoint is idempotent if it can be called multiple times without changing the result or canister state beyond the first call. Whenever an endpoint is idempotent or can be made idempotent by the developer, this provides an easy way to implement safe retries.
+We say that a canister endpoint is idempotent if executing it multiple times is equivalent to executing it once.[^1] Whenever an endpoint is idempotent or can be made idempotent by the developer, this provides an easy way to implement safe retries.

Given an idempotent endpoint, you can implement retries by retrying the call until you observe a certified response (either a replied or rejected status), see the illustration below. If such a response is ever observed, it's sure that the transaction has been executed at least once which, thanks to idempotency, has the same result as executing it exactly once. However, the application may not be willing to wait for a response indefinitely and a timeout could be implemented. Upon timeout, an error should be displayed to the user instructing them to wait until the latest message that has been sent has expired (as defined by the request's `ingress_expiry`) and then manually check the status of the transaction. Ideally, timeouts should be rare and not occur during normal operation.
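The retry loop described above can be sketched as follows. This is a minimal illustration, not an agent API: `send_call` is a hypothetical helper that submits the idempotent call and returns `"replied"` or `"rejected"` when a certified response is observed, or `None` when the outcome is unknown. The parameter names and the `"timeout"` return value are also assumptions made for the example.

```python
import time
from typing import Callable, Optional


def call_with_retries(
    send_call: Callable[[], Optional[str]],
    timeout_s: float,
    retry_delay_s: float = 0.0,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Retry an idempotent call until a certified response is seen or time runs out.

    `send_call` is a hypothetical helper (an assumption of this sketch): it
    returns "replied" or "rejected" for a certified response, or None when
    the outcome of the attempt is unknown (for example, a network error).
    """
    deadline = clock() + timeout_s
    while True:
        status = send_call()
        if status in ("replied", "rejected"):
            # Executed at least once; idempotency makes that equivalent to once.
            return status
        if clock() >= deadline:
            # The caller should tell the user to wait for ingress_expiry
            # and then check the transaction status manually.
            return "timeout"
        time.sleep(retry_delay_s)
```

On `"timeout"`, the application would show the error described above rather than assume that the call failed.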

@@ -52,7 +52,7 @@ For example, the ICRC ledger standard provides deduplication in this way. Using
However, a direct implementation of this approach can exhaust the canister memory, as all successfully executed IDs need to be kept around forever.
Thus, the deduplication is usually time limited to a certain time window. For example, the ICP ledger uses a 24 hour window, and the ICRC standard defines a configuration parameter `TX_WINDOW` that determines the window length.

-Moreover, the ICP/ICRC ledgers use the `created_at_time` parameter to limit the validity period of a call. Roughly, the call is only considered valid if its `created_at_time` is not in the future and at most 24 hours in the past.[^1] This avoids the problem where the deduplication window expiring would allow a retried call to succeed again.
+Moreover, the ICP/ICRC ledgers use the `created_at_time` parameter to limit the validity period of a call. Roughly, the call is only considered valid if its `created_at_time` is not in the future and at most 24 hours in the past.[^2] This avoids the problem where the deduplication window expiring would allow a retried call to succeed again.
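To make the interplay between the deduplication window and `created_at_time` concrete, here is a simplified in-memory model. It is a sketch under stated assumptions, not the ledgers' actual code: the class `DedupLedger`, the caller-supplied `tx_id`, the `permitted_drift_ns` parameter, and the returned status strings are all hypothetical names chosen for the example.

```python
from typing import Dict


class DedupLedger:
    """Toy model of time-limited deduplication (hypothetical, not the real ledger)."""

    def __init__(self, tx_window_ns: int, permitted_drift_ns: int = 0) -> None:
        self.tx_window_ns = tx_window_ns
        self.permitted_drift_ns = permitted_drift_ns
        self.seen: Dict[str, int] = {}  # tx_id -> created_at_time
        self.applied = 0

    def transfer(self, tx_id: str, created_at_time: int, now: int) -> str:
        # Reject calls created in the future (beyond the permitted drift).
        if created_at_time > now + self.permitted_drift_ns:
            return "created_in_future"
        # Reject calls older than the window, so a retry cannot succeed
        # again after its deduplication entry has been dropped.
        if created_at_time + self.tx_window_ns < now:
            return "too_old"
        # Drop entries that the check above would now reject anyway;
        # this keeps memory bounded by the window length.
        self.seen = {
            t: c for t, c in self.seen.items() if c + self.tx_window_ns >= now
        }
        if tx_id in self.seen:
            return "duplicate"
        self.seen[tx_id] = created_at_time
        self.applied += 1
        return "ok"
```

In this model a retry inside the window is reported as a duplicate, and a retry after the window is rejected as too old, so the transfer is applied at most once.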

But even with this improvement used in the ledgers, the time window approach implicitly assumes that the client will be able to get a definite answer to their call within the time window. For example, after the 24 hours expire, the user cannot easily tell if their ledger transfer happened; their only option is to analyze the ledger blocks, which is somewhat tedious, and has to be done carefully to avoid asynchrony issues; see the section on [queryable call results](#queryable-call-results).

@@ -76,7 +76,7 @@ In absence of idempotent endpoints, or even in addition to them, clients may be
If the canister, in addition to the update endpoint, also exposes a query that can inform the user of the result of the update, the client can also use this for safe retries as follows:

1. Attempt to perform the update.
-1. If the result of the update is unknown (e.g., not present in the ingress history anymore), query the call result endpoint to determine whether the update was applied or not. Moreover, one needs to ensure that the previously sent call cannot be applied in the future. If both of these are true, the call might be retried, or safely reported as failed.
+1. If the result of the update is unknown (e.g., not present in the ingress history anymore), query the call result endpoint to determine whether the update was applied or not. Moreover, one needs to ensure that the previously sent call cannot be applied in the future. If both of these are true, the call might be retried or safely reported as failed.
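The two steps above might look roughly like this on the client side. The three callbacks are hypothetical placeholders, not functions of any real agent library: `attempt_update` returns `True` or `False` when the result is known and `None` when it is not, `call_has_expired` reports whether the earlier call can no longer be applied (for example, its `ingress_expiry` has passed), and `query_was_applied` wraps the canister's result query. Checking expiry before querying is a choice made in this sketch so that the query result cannot be invalidated by a late execution.

```python
from typing import Callable, Optional


def update_with_result_query(
    attempt_update: Callable[[], Optional[bool]],
    call_has_expired: Callable[[], bool],
    query_was_applied: Callable[[], bool],
) -> str:
    """Decide what to do after an update whose result may be unknown.

    All three callbacks are hypothetical helpers assumed for this sketch.
    """
    outcome = attempt_update()
    if outcome is not None:
        # The update's own result is known; no query is needed.
        return "applied" if outcome else "failed"
    if not call_has_expired():
        # The earlier call could still be executed; no decision is safe yet.
        return "unknown"
    # The earlier call can no longer be applied, so the query result is final.
    if query_was_applied():
        return "applied"
    return "safe_to_retry_or_fail"
```

A caller would keep polling while the function returns `"unknown"` and only retry or report failure on `"safe_to_retry_or_fail"`.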

In practice, this pattern may be more complicated. For example, the ICP ledger exposes a `query_blocks` method that can be used to implement the above pattern for transfers initiated as ingress messages:

@@ -94,4 +94,6 @@ Another approach applicable to ledgers (such as ICRC-1 or ICP) is to perform tra
1. If the transfer to the transaction-specific subaccount succeeded (as determined either by the transfer result or by the balance query above), the client sends another transfer from the transaction-specific subaccount to the desired target account. This can be repeated as many times as necessary until a result of the call is known. Once a result is known, the overall transfer can be declared as succeeded, even if this step fails with an error, as this signifies that some previous attempt to transfer the money to the target succeeded.
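The final forwarding step can be illustrated with a toy in-memory ledger. Everything here is an assumption made for the example: `ToyLedger`, `forward_from_subaccount`, the account names, and the `lost_responses` parameter (which simulates attempts that execute but whose responses never reach the client) are hypothetical, and fees are ignored. The sketch also assumes the subaccount was funded with exactly `amount`.

```python
from typing import Dict


class ToyLedger:
    """In-memory stand-in for a ledger (hypothetical, fees ignored)."""

    def __init__(self, balances: Dict[str, int]) -> None:
        self.balances = dict(balances)

    def transfer(self, src: str, dst: str, amount: int) -> bool:
        if self.balances.get(src, 0) < amount:
            return False  # insufficient funds
        self.balances[src] -= amount
        self.balances[dst] = self.balances.get(dst, 0) + amount
        return True


def forward_from_subaccount(
    ledger: ToyLedger,
    subaccount: str,
    target: str,
    amount: int,
    lost_responses: int = 0,
    max_attempts: int = 5,
) -> bool:
    """Repeat the subaccount -> target transfer until a result is known."""
    for attempt in range(max_attempts):
        ledger.transfer(subaccount, target, amount)  # always executes here
        if attempt < lost_responses:
            continue  # simulated lost response: outcome unknown, so retry
        # A result is known. Success means this attempt moved the funds; an
        # error means an earlier attempt already emptied the subaccount.
        return True
    return False  # no result observed within max_attempts
```

Because the subaccount holds only the funds for this one transaction, repeating the forwarding transfer cannot move the money twice.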


-[^1]: More precisely, the ledger also allows for a small time drift of `created_at_time` into the future, which has to be taken into account when clearing the deduplication window.
+[^1]: "Equivalent" is meant from the user perspective here. Multiple executions may trigger changes such as those in the canister's cycle balance, but they are not relevant for the user.
+
+[^2]: More precisely, the ledger also allows for a small time drift of `created_at_time` into the future, which has to be taken into account when clearing the deduplication window.
