Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spanner: how should transactional promises look? #2152

Closed
stephenplusplus opened this issue Mar 29, 2017 · 22 comments
Closed

spanner: how should transactional promises look? #2152

stephenplusplus opened this issue Mar 29, 2017 · 22 comments
Assignees
Labels
api: spanner Issues related to the Spanner API. priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. type: question Request for information or clarification. Not an issue.

Comments

@stephenplusplus
Copy link
Contributor

stephenplusplus commented Mar 29, 2017

Originally posted by @callmehiphop

@vkedia @stephenplusplus @lukesneeringer @jgeewax so our typical callback vs. promise convention is pretty simple - omit the callback and you shall receive a promise. I think this is the first API that really complicates this approach.

Consider the following example

database.runTransaction()
  .then(transaction => {
    return Promise.all([
      transaction,
      transaction.getRows('Singers')
    ]);
  })
  .then([transaction, rows] => {
    // do some kind of mutation based on the rows
    return transaction.commit();
  })
  .then(() => {
    // do something async/unrelated to the transaction
  });

My main issue is that in order to properly retry our transaction here, we would need to capture all functions passed to the promise before transaction.commit() and retry them in the correct order before allowing any chaining that occurs after to continue. Technically I don't think this is even feasible, but even if it were I think the overall behavior itself is totally unpredictable for our users. Based on this opinion, I think that our traditional approach will not fly here.

That being said, if we were to continue to try and let both callbacks and promises live together within a single method, we need a different way of letting the user tell us that they want promises.

My current suggestion is to let the user return a promise within the callback (like so)

database.runTransaction((err, transaction) => {
  if (err) return Promise.reject(err);

  // do some transactional based stuff
  return transaction.commit();
})
.then(() => {
  // transaction is committed, so some other stuff now
});

However after some discussion in my PR, this also may be confusing for the users. So I would like to revisit this specific API since we don't have a precedent for this specific issue.

An idea that we've been kicking around would basically be to not directly support promises in and out of this method, but only in.. This means the code would resemble the following

database.runTransaction((err, transaction) => {
  if (err) {
    // an error occurred while trying to get/create a session + transaction
  }

  // do reads/writes/etc. here
  
  // now the user has a choice between using a promise or a callback
  // and while this function will be ran multiple times in the event of an aborted error
  // commit will not resolve until either a success or final error occurs
  transaction.commit()
    .then(() => console.log('committed successfully!'))
    .catch((err) => console.error(err));
});

At which point if a user doesn't feel that this is the way they want to use Promises, it would be fairly easy to wrap that into something like

function runTransactions() {
  return new Promise((resolve, reject) => {
    database.runTransaction((err, transaction) => {
      if (err) return reject(err);
      // do transaction stuff
      transaction.commit().then(resolve, reject);
    });
  });
}
@stephenplusplus
Copy link
Contributor Author

A quick "A or B" is:

// A)
datastore
  .runTransaction(function(err, transaction) {
    if (err) {
      // Could not start transaction 
      return Promise.reject(err)
    }

    // ...transaction stuff...
    return transaction.commit()
  })
  .then(success)
  .then(doSomethingElse)
  .catch(handleErrors)

// B)
datastore.runTransaction(function(err, tranaction) {
  if (err) {
    // Could not start transaction  
  }
  
  // ...transaction stuff...
  transaction.commit()
    .then(success)
    .then(doSomethingElse)
    .catch(handleErrors)
})

Gist: https://gist.github.com/stephenplusplus/3cb49529ee537c640a4670edf900e294

@stephenplusplus stephenplusplus added api: spanner Issues related to the Spanner API. type: question Request for information or clarification. Not an issue. labels Mar 29, 2017
@lukesneeringer
Copy link
Contributor

lukesneeringer commented Mar 29, 2017

I am opposed to the current solution being discussed.

A question for @vkedia:

  • If you have a series of transaction.doSomething calls that succeed, and then transaction.commit fails with a retryable error, is there a reason why we can not just retry transaction.commit? Does Spanner auto-rollback the transaction if a commit call returns 503, for example? These are separate GRPC calls; from the client perspective, a failed commit call should not necessitate retrying every previous call in the transaction.

A comment:

  • An uncommitted transaction is just a batch. It is totally valid to have methods like insert and update all save to a list of calls on the transaction so that they can be replayed in a retry situation. The transaction object is being passed down the Promise, and it is fine for it to maintain state.

@vkedia
Copy link
Contributor

vkedia commented Mar 29, 2017

@lukesneeringer There are two levels of retries here. In case of a retryable error like UNAVAILABLE, gapic would take care of retries and it would retry just the commit call. The retry we are discussing is when commit fails with ABORTED which usually means that there was some other conflicting transaction but there might be other reasons for that. In that case the correct thing to do is to create a new transaction again and do all the operations on that new transactions again. If you just replay the commit, it would again fail with ABORTED.
This page talks more about this:
https://cloud.google.com/spanner/docs/transactions#rw_transaction_interface

@lukesneeringer
Copy link
Contributor

Thanks, that answers my question. So, the correct solution becomes to save the operations that were made on the transaction and replay them.

@stephenplusplus
Copy link
Contributor Author

Yes, and we also have to replay arbitrary user code that may have been executed before, after, and in-between the calls to our API, e.g. reading a file from disk, making a remote request, querying Datastore, etc. That's why I'm drawn towards design B, which creates a sandbox that a transaction lives in. Our docs can explain that everything inside of it has the chance of being re-run completely.

I also think it's less prone to user error. They won't have to remember to return a promise from the runTransaction callback, which is a pattern that you might have to think about twice before understanding. It also creates the awkward dual-handling of an error-- first, in the runTransaction callback, then in the catch block.

I dislike B because the user cannot easily connect datastore.runTransaction() into another promise pipeline. However, given the sandbox philosophy of a transaction, I think this is acceptable for the benefits of less potential for confusion seen in A.

@callmehiphop
Copy link
Contributor

callmehiphop commented Mar 30, 2017

Something that @vkedia has asked about a few times is why we make the user manually call commit and I think that by abstracting that away, we might be able to solve this. In an earlier version of Datastore we never used to make the user call commit directly, instead we had them pass in multiple callbacks, one of which we supplied a done function to let us know that the transaction was finished.

Pretending we took a similar approach here, here's how our callback version would look

database.runTransaction((transaction, done) => {
  transaction.run('SELECT * FROM Singers', (err, rows) => {
    if (err) {
      done(err);
      return;
    }

    transaction.insert('Singers', { ... });
    done();
  });
}, (err) => {
  // the transaction is finished
});

The introduction of the second callback would mean we can now guess whether or not the user wants a Promise in the same fashion as our other APIs.

database.runTransaction((transaction) => {
  return transaction.run('SELECT * FROM Singers').then([rows, resp] => {
    transaction.insert('Singers', {});
  });
}).then(() => {
  // the transaction is finished
});

I think the downside here is that we still have to pass in a callback that returns a promise, but I don't think there's any avoiding that due to the retry requirement for aborted transactions.

Thoughts?

@stephenplusplus
Copy link
Contributor Author

I think the downside here is that we still have to pass in a callback that returns a promise, but I don't think there's any avoiding that due to the retry requirement for aborted transactions.

Doesn't B avoid this? The user wouldn't have to return the promise, they would just use whatever pattern they wanted (promise or not) inside the sandbox.

Taking away the commit() action makes it more confusing to me, since then the user is returning a transaction.run() to the runTransaction() callback and handling the hidden commit() outside of runTransaction().

Just to re-state, users did not like when we had a done() for Datastore transactions. Related:

@callmehiphop
Copy link
Contributor

Doesn't B avoid this? The user wouldn't have to return the promise, they would just use whatever pattern they wanted (promise or not) inside the sandbox.

Sorry, what I meant was that there is no avoiding providing a callback (not the returning the promise part)

Taking away the commit() action makes it more confusing to me, since then the user is returning a transaction.run() to the runTransaction() callback..

That's a fair point, but it seems like maybe coming up with a better name for runTransaction could cut back on confusion here

..and handling the hidden commit() outside of runTransaction().

I think you're not really handling commit per se, but when the transaction is finished in general - like if an error were to occur at any time during or if everything went fine. To me this is very reminiscent of async.

Just to re-state, users did not like when we had a done() for Datastore transactions.

That is definitely something to consider but I think given the current options for our problem, if we end up supporting a promise mode for this, there will likely be some aspect of it the users don't like and/or find confusing.

@lukesneeringer
Copy link
Contributor

Yes, and we also have to replay arbitrary user code that may have been executed before, after, and in-between the calls to our API

Do we? What if we only replayed the SQL, regardless of how that SQL came to be?

That is less powerful than replaying arbitrary user code, and there are reasons to consider against doing it, but it is a far easier solution.

That said, we can replay arbitrary code too, as long as the code is passed as a function. That actually more or less becomes the "atomic transaction" code I originally posted.

(Editor's note: I do not have throughput to follow this thread right now, but it is critically important. Keep discussing it, but please do not make any final decisions without a meeting. Thanks.)

@bjwatson bjwatson added priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. status: release blocking labels Mar 31, 2017
@ace-n
Copy link
Contributor

ace-n commented Apr 5, 2017

So, the question here is which of these two is going to be the bigger gripe for our users:

  • breaking the idiomaticity of our sample code by using callbacks (as @vkedia proposed here)
  • forcing users who want to use dynamic queries to use static ones (as @lukesneeringer suggested)

Personally, I think that someone will be unhappy regardless of which one we choose. I'm leaning slightly towards @vkedia's proposal, but I'm not sold on anything yet.

@lukesneeringer
Copy link
Contributor

What if we passed in a function, and then the entire function could just be re-run?

@vkedia
Copy link
Contributor

vkedia commented Apr 5, 2017

@lukesneeringer So how is that different from the current callback based API? Are you saying that the passed in function would return a Promise?

@ace-n
Copy link
Contributor

ace-n commented Apr 5, 2017

I'm merging @vkedia's linked PR to our samples repo now, as it fixes #2118 (which is arguably a more urgent issue) - but please do continue discussing this. 😄

@lukesneeringer
Copy link
Contributor

Maybe it is not different. Sorry, I derped.

@stephenplusplus
Copy link
Contributor Author

@callmehiphop please correct me if I'm wrong, but as far as how these methods technically behave, I believe we have that part working-- with the exception of a bug that didn't include the transaction ID on the outgoing API requests. With that fixed, we handle the retry logic correctly.

This conversation is to help us find the most obvious API for the user who wants to use promises for transactions. Flip back to the A or B post: #2152 (comment) -- A is what we previously had, until we removed it, and B is a refactor.

@callmehiphop
Copy link
Contributor

@stephenplusplus that is correct!

@stephenplusplus
Copy link
Contributor Author

Unless we hear other opinions, I think it's up to us, @callmehiphop! From a code-only standpoint (not factoring in anything other than the UX of the API), which one would you like to move forward with, A or B?

@callmehiphop
Copy link
Contributor

@stephenplusplus going to go with B on this one - essentially we just won't support Promises on this method and if the user wants to use them they'd have to wrap it themselves.

@lukesneeringer
Copy link
Contributor

Yeah, I agree with that plan.

@bjwatson
Copy link

I discussed this with @vkedia and @jgeewax a few weeks ago. The problem with removing promises from certain methods (IIUC, runTransaction() and the streaming methods) is that users might use a third-party promises implementation like Bluebird, thus re-introducing the problem we want to avoid. A more robust solution would be to return promises for these methods (if the user does not provide a callback), but don't automatically retry ABORTED transactions in a promise.

What do you think @stephenplusplus, @callmehiphop, and @swcloud?

@callmehiphop
Copy link
Contributor

@bjwatson I think not retrying for promises makes sense to me. As it is runTransaction will retry all the things and we can create a new public method like getTransaction or beginTransaction etc. that does not have that logic baked in and can be used via promises.

@bjwatson
Copy link

@callmehiphop SGTM. As long as we provide some kind of promise mechanism for that method, then users are unlikely to seek out their own.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: spanner Issues related to the Spanner API. priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. type: question Request for information or clarification. Not an issue.
Projects
None yet
Development

No branches or pull requests

6 participants