
Note about underlyingSink close() method is misleading #617

Closed · ricea opened this issue Nov 21, 2016 · 19 comments

@ricea (Collaborator) commented Nov 21, 2016

https://streams.spec.whatwg.org/#ref-for-underlying-sink-10

The stream implementation guarantees that this method will be called only after all queued-up writes have succeeded.

Except when abort() is not defined on the underlyingSink and user code calls writer.abort(). In that case close() can be called concurrently with write(), possibly wreaking havoc with the assumptions of the author of the underlyingSink.

I would prefer if this note included a qualification like "except when abort() is called, as mentioned below".

close() can also be called while an existing async call to close() is already in progress, but that might be a spec bug.
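A minimal sketch of the hazard (slowWrite() and releaseResources() are hypothetical placeholders, not part of any API):

    const ws = new WritableStream({
      async write(chunk) {
        await slowWrite(chunk); // some slow asynchronous operation
      },
      async close() {
        await releaseResources(); // written assuming all writes have finished
      }
      // no abort() defined, so close() doubles as the abort fallback
    });
    const writer = ws.getWriter();
    writer.write('a'); // us.write() starts
    writer.abort();    // falls back to us.close(), which can now run
                       // concurrently with the still-pending us.write()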

@tyoshino (Member)

This made me think it might make more sense to move the "fallback" from WritableStreamDefaultControllerAbort() to WritableStreamDefaultControllerProcessClose(), and change the "fallback" to invoke underlyingSink.abort() in the absence of underlyingSink.close().

As ricea said, underlyingSink.close()'s semantics change depending on the presence of underlyingSink.abort(). If we change this to be what I suggested above:

  • developers would implement underlyingSink.close() if they want to be notified only when all write()s have completed successfully,
  • or implement only underlyingSink.abort() if they're not interested in the distinction. Handling writer.close() with underlyingSink.abort() looks weird, but it's also weird that we're handling writer.abort() with underlyingSink.close() in the absence of underlyingSink.abort().

close() can also be called while an existing async call to close() is already in progress, but that might be a spec bug.

Agreed. When only close() is implemented, the implementation should do something common for both writer.close() and writer.abort(). It doesn't make sense to call it twice. They can still switch the behavior for the first and second invocations, but they should then just implement close() and abort() instead.

@ricea (Collaborator, Author) commented Nov 22, 2016

I think what @tyoshino said makes sense, but an API where writer.close() called underlyingSink.abort() would make people think we were insane.

I had another idea, which is to replace close() and abort() on the underlyingSink with shutdown() and destroy(why, e). The key difference is that destroy() would always be called if it was defined.

  • writer.close() would wait for all queued writes to complete, then call underlyingSink.shutdown(), then underlyingSink.destroy('close', undefined).
  • writer.abort(e) would call underlyingSink.destroy('abort', e).
  • writer.close(); writer.abort(e); would call underlyingSink.shutdown() and underlyingSink.destroy('abort', e), possibly concurrently, if there were no queued writes.

shutdown() is named by analogy with BSD sockets shutdown(2), i.e. it cleanly terminates the connection without freeing local resources. It would typically be used for flushing buffers.
destroy(why, e) is named like the common pattern in garbage-collected or reference-counted systems when some resources need to be explicitly freed. This would often map to a close() call internally. In systems where there are two exclusive ways to free the resource, for example commit() and rollback(), this would have to switch on why.
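To illustrate that last case, here is a rough sketch of a sink over a transactional resource under this hypothetical API (tx, commit(), and rollback() are assumed names, not real API):

    const underlyingSink = {
      async write(chunk) {
        await tx.write(chunk); // hypothetical transactional write
      },
      shutdown() {
        // nothing to flush in this example
      },
      destroy(why, e) {
        // two exclusive ways to free the resource; switch on why
        return why === 'close' ? tx.commit() : tx.rollback();
      }
    };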

This would be a big API change and I am not sure of the benefits.

@domenic (Member) commented Nov 22, 2016

I think my preferred solution is that we don't call us.abort() until us.write() settles. Thus, writer.abort()'s primary goal is to flush away any queued up writes, not to abort any ongoing writes.

Right now no part of the streams API has the concept of interrupting an ongoing operation, really. If we were to introduce that, it would probably be based on cancelable promises, and apply more broadly. For example, us.write() and writer.write() would accept a cancel token. us.write()'s cancel token would become canceled if either the token passed to writer.write() is canceled, or if writer.abort() is called. Then us.write() can decide whether it wants to cancel the ongoing write (or at least the returned promise) and thus allow us.abort() to execute.

But for now, without cancel tokens and in general without a concept of canceling an ongoing "atomic" operation, I think it's best to treat each write as atomic, and not something that can be interrupted by abort.

What do you think?

@domenic (Member) commented Nov 22, 2016

I just realized something else related... I'll keep it in this thread because it reminds me of @ricea's suggestion.

Right now, if us.write() fails, the stream will go into the errored state---but no cleanup code will be called. That is, authors of underlying sinks need to take care to catch any errors writing to their underlying sink and then perform cleanup if so. This is not obvious: for example, https://streams.spec.whatwg.org/#example-ws-backpressure seems to miss that. It should instead be e.g.

    async write(chunk) {
      try {
        await fs.write(fd, chunk, 0, chunk.length);
      } catch (e) {
        // clean up on failure, then re-throw so the stream still errors
        await this.close();
        throw e;
      }
    }

(with the code even more complex than this if you don't use async functions and try/catch.)
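For comparison, a rough equivalent without async functions (assuming, as in the spec example, a promise-returning fs wrapper) might look like:

    write(chunk) {
      return fs.write(fd, chunk, 0, chunk.length).catch(e => {
        // clean up first, then re-throw the original write error
        return this.close().then(() => { throw e; });
      });
    }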

This seems pretty bad and I'm sad we hadn't realized it before.

The guiding goal behind the current setup is roughly that you should only have to implement cleanup logic in one place. That's what guides the fallback in abort. I've never been super-happy with the implementation of that goal as-is, since it leads to strange situations like allowing close + abort, or apparently a bunch of interesting edge cases we're just now discovering.

I think @ricea's destroy + shutdown might be too much, because it seems geared toward allowing the interruption of ongoing writes, which as in my previous comment I think we should not really support. But I think it's indeed worth rethinking how to accomplish this goal.

One idea would be just shutdown(why, reason) { } which is always called any time the stream is transitioning from "writable" to "closed" or "errored". So, for: failed writes; writer.abort() (after the ongoing write finishes); writer.close() (after all queued writes finish).
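A sketch of a sink under that idea (the why values and the releaseResources() helper are assumptions, not settled API):

    const underlyingSink = {
      async write(chunk) { /* ... */ },
      shutdown(why, reason) {
        // invoked once for every "writable" -> "closed"/"errored" transition,
        // whether from a failed write, writer.abort(), or writer.close()
        return releaseResources(why, reason); // hypothetical cleanup helper
      }
    };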

@tyoshino (Member)

@domenic

For example, us.write() and writer.write() would accept a cancel token. us.write()'s cancel token would become canceled if either the token passed to writer.write() is canceled, or if writer.abort() is called. Then us.write() can decide whether it wants to cancel the ongoing write (or at least the returned promise) and thus allow us.abort() to execute.

Since the WritableStreamDefaultController has only one active us.write() at a time, I agree it makes sense to use a cancel token to communicate an abort signal from the controller to the sink, purely as an interface between the two.

Having writer.write() take a cancel token in addition to the value argument, and handle/pass it in the Streams API is a separate design decision. This affects not only the controller(s), but also the WritableStream's semantics.

Suppose that we have issued 3 write()s on a writer, w0, w1 and w2. We cancel the token for w1. I think what's expected is that:

  • if the sink hasn't started processing w1,
    • finish w0
    • throw away w1, w2
    • invoke sink.abort()
  • if the sink is processing w1,
    • cancel the token passed to the sink with w1 and wait for completion
    • (regardless of the result of w1 processing, continue to do the following)
    • throw away w2
    • invoke sink.abort()
  • if the sink has already finished processing w1, and is now processing w2,
    • cancel the token passed to the sink with w2 and wait for completion
    • (ditto)
    • invoke sink.abort()

Is this the same as what you had in your mind?

@tyoshino (Member) commented Nov 24, 2016

With this, we gain more precise control over the timing of abort. I'm not sure if this is something we really want. It looks like overkill at a glance.


Let me go back to the discussion about the need for interrupting an ongoing us.write(). I'm okay with making us.write() atomic. The primary plan is to introduce the cancel token later. The backup plan, in case cancel token standardization is delayed or runs into trouble, is perhaps to invoke a separate callback, say us.abortWrite(), when there's an ongoing us.write(), while keeping us.abort() happening only after the ongoing us.write() finishes.

@ricea (Collaborator, Author) commented Nov 24, 2016

I agree with making us.write() and us.close() non-interruptible. It gains us the guarantee that no two methods on the underlying sink are ever called simultaneously, which is very easy to understand and work with.

The temporary loss of functionality is unfortunate, but I think it's acceptable given that we have a clear path towards making write() interruptible again in future.
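A minimal sketch of what that guarantee gives a sink author (method bodies are placeholders):

    const underlyingSink = {
      async write(chunk) {
        // no other sink method will be invoked until this promise settles
      },
      async close() {
        // runs only after every queued write() has settled
      },
      async abort(reason) {
        // likewise never overlaps an in-flight write() or close()
      }
    };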

@tyoshino (Member)

The guiding goal behind the current setup is roughly that you should only have to implement cleanup logic in one place.

I agree this is an important point, and the analogy with the finally clause makes me rethink it carefully as well.


@ricea Does your proposal #617 (comment) imply removal of the fallback? Is it replaced by allowing developers to make shutdown() a no-op, ignore the first argument why, and do one common thing in destroy()?

@ricea (Collaborator, Author) commented Nov 24, 2016

Does your proposal #617 (comment) imply removal of the fallback? Is it replaced by allowing developers to make shutdown() a no-op, ignore the first argument why, and do one common thing in destroy()?

Yes. I expect that implementing only destroy() and ignoring both arguments would be the most common pattern. You'd only implement shutdown() if you had data to flush or your sink had some concept of clean shutdown. For example, with WebSockets, shutdown() would send the Close frame and wait for the response. destroy() would then close the socket.
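A rough sketch of that WebSocket sink under the hypothetical shutdown()/destroy() API (sendCloseFrame() and closeFrameReceived() are assumed helpers):

    const webSocketSink = {
      async write(chunk) {
        socket.send(chunk);
      },
      async shutdown() {
        // clean shutdown: send a Close frame and wait for the peer's response
        sendCloseFrame(socket);
        await closeFrameReceived(socket);
      },
      destroy(why, e) {
        // always called: actually release the underlying socket
        socket.close();
      }
    };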

I agree this is an important point, and the analogy with the finally clause makes me rethink it carefully as well.

Actually, finally() might be a better name than destroy().

Having said that, I'm not seriously pushing my proposal. I think @domenic has provided a good path forward with the existing API, and I don't think it's worth giving it up in exchange for something of doubtful benefit.

@ricea (Collaborator, Author) commented Nov 25, 2016

I spoke with @tyoshino in person and here is the plan we came up with:

  • Calling writer.abort() during us.write() will result in all processing being delayed until the write completes. If the write succeeds, abort() or close() will be called as normal. If the write fails, behaviour will match the resolution of "us.abort() is not called if us.write() fails" (#620).
  • Calling writer.abort() during us.close() will return a promise rejected with a TypeError exception.
  • Calling writer.abort() with state === 'closing' will still be permitted as long as us.close() has not started yet.
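To make the first two points concrete, a sketch of the intended observable behaviour (slowWrite() is a hypothetical slow operation):

    const ws = new WritableStream({
      async write(chunk) { await slowWrite(chunk); },
      async close() { /* ... */ },
      async abort(reason) { /* ... */ }
    });
    const writer = ws.getWriter();
    writer.write('a');    // us.write() starts
    writer.abort('oops'); // deferred until the ongoing write settles;
                          // if the write succeeds, us.abort('oops') runs
    // calling writer.abort() while us.close() is running would instead
    // return a promise rejected with a TypeError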

@domenic (Member) commented Nov 26, 2016

Is this the same as what you had in your mind?

That wasn't quite what I had in mind. Cancel tokens would only be used to control a single write() instance, and wouldn't change the state of the stream. You'd do that separately by calling writer.abort(). So:

  • If the sink hasn't started processing w1
    • Finish w0
    • Throw away w1; return a "canceled promise" from that write() (rejected with a Cancel instance; more details)
    • Start w2
  • If the sink is processing w1
    • Cancel the token passed to the sink with w1 and wait for completion. Return a promise corresponding to what the sink returns (so it might be fulfilled or rejected, if the cancelation didn't work, or it might be canceled, if it did).
    • Start w2
  • If the sink has already finished processing w1, and is now processing w2,
    • Start w2; ignore the cancelation request

I'm glad everyone seems on board with making us.write() and us.close() non-interruptible. I think that is a good simplification, despite the temporary loss in functionality.

I spoke with @tyoshino in person and here is the plan we came up with:

This plan sounds good to me, but I am interested in exploring a shutdown/destroy/finally underlying sink API. Maybe we should wait until the dust settles and see how bad things are after resolving this issue and #620 before deciding if that is necessary.

@ricea (Collaborator, Author) commented Nov 28, 2016

A minor exception I discovered:

  • writer.abort() will cause writer.ready to reject immediately, i.e. without waiting for a pending us.write() to complete. This is for consistency with writer.close() and existing behaviour.
  • The same applies to controller.error().
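A sketch of the resulting timing:

    const writer = ws.getWriter();
    writer.write(chunk);  // us.write() is now in flight
    writer.abort(reason);
    writer.ready.catch(() => {
      // rejects immediately, without waiting for the pending us.write()
    });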

@tyoshino (Member)

That wasn't quite what I had in mind. Cancel tokens would only be used to control a single write() instance, and wouldn't change the state of the stream. You'd do that separately by calling writer.abort(). So:

Thanks Domenic. I'm fine with decoupling cancellation functionality from the Streams semantics. But does this mean that the WritableStream will take another argument but pass it through to the underlying sink without doing anything with it? Or, is it supposed that the value argument will include the cancel token in some way? Just wanna clarify.

@domenic (Member) commented Nov 28, 2016

A minor exception I discovered:

I think this is fine, given the semantics of writer.ready as an indicator of what the producer should do next. We're going for non-interruptible, not un-observable. Do you all agree?

But does this mean that the WritableStream will take another argument but pass it through to the underlying sink without doing anything with it?

Yeah. The thing about cancel tokens is that they represent cooperative cancelation for a given operation. So the stream implementation needs to know if the underlying sink cooperated with the cancelation or not, before it changes any of its state. It does that by looking at the return values of the underlying sink's methods---did they return a canceled promise, or no? If no, then they did not cooperate with the cancelation request, so we should ignore it.

It would be a different story if we were passing the cancel token to some operation controlled mostly or entirely by the stream implementation. For example, if we were passing a cancel token to pipeTo(), we could pass that through to any read or write calls and look at how they cooperated, but we would also interpret a cancelation request on that token to mean "stop piping", so even if an individual write or read does not cooperate because of an uncooperative underlying sink, we could stop the pipe loop as soon as possible.

@ricea (Collaborator, Author) commented Nov 29, 2016

I think this is fine, given the semantics of writer.ready as an indicator of what the producer should do next. We're going for non-interruptible, not un-observable. Do you all agree?

This wasn't what I initially thought the behaviour should be. But when I saw that close() worked that way, I decided it made sense. .ready is "can I write now without blocking?". .ready being unresolved implies that you may be able to write without blocking at some time in the future. .ready being rejected means you'll never be able to use this Writer to write again. New calls to write() will also reject at this point, so it is consistent.

Getting into fine details, .closed will reject before .ready if there is no write pending, but will reject after .ready otherwise. Maybe we should change it to always reject after .ready?

Yeah. The thing about cancel tokens is that they represent cooperative cancelation for a given operation. So the stream implementation needs to know if the underlying sink cooperated with the cancelation or not, before it changes any of its state. It does that by looking at the return values of the underlying sink's methods---did they return a canceled promise, or no? If no, then they did not cooperate with the cancelation request, so we should ignore it.

I've become confused now. Is writer.abort() going to cancel an ongoing write? Or if you want "abort ASAP" semantics will you have to keep cancel tokens for all unfulfilled write() calls and loop through them after calling writer.abort()?

@domenic (Member) commented Nov 29, 2016

Getting into fine details, .closed will reject before .ready if there is no write pending, but will reject after .ready otherwise. Maybe we should change it to always reject after .ready?

That sounds like a good idea to me.

I've become confused now. Is writer.abort() going to cancel an ongoing write? Or if you want "abort ASAP" semantics will you have to keep cancel tokens for all unfulfilled write() calls and loop through them after calling writer.abort()?

The latter was my thinking initially, but when you put it that way, it sounds rather unattractive. I guess we can hand us.write() a cancel token that is the result of "racing" an internal cancel token, which is tied to writer.abort(), plus the cancel token passed to writer.write(). That way calling writer.abort() will send a cancelation request to any ongoing us.write() even if nobody sends one to the cancel token passed to writer.write().
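Roughly, in the spirit of the then-proposed (and since withdrawn) cancelable promises API, where a CancelToken.race()-style combinator merges tokens (all names here are hypothetical):

    // internalToken is canceled when writer.abort() is called;
    // perWriteToken is the token the producer passed to writer.write()
    const token = CancelToken.race([internalToken, perWriteToken]);
    const writePromise = underlyingSink.write(chunk, token);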

@tyoshino (Member)

It's the same as what I had in my mind. OK.

writer.write() taking a cancel token would make the WritableStream/WritableStreamDefaultWriter a bit weird, since the interfaces exposed to controllers do nothing with promises, but it should be fine.

@ricea (Collaborator, Author) commented Dec 1, 2016

This is now implemented in my pull request #619 (feel free to clone that into a whatwg branch if it is easier). Sorry for the delay. I haven't written the standard changes yet, but that should be the easy part.

During implementation, I decided that one of the choices I made in my earlier comment #617 (comment) was not that great and changed it. Specifically:

  • Calling writer.abort() during us.close() waits for the close to complete. It's natural to expect that once abort() returns, the stream is no longer active, which wouldn't be the case if us.close() was still in progress.
  • When us.close() is in progress, the return from writer.abort() reflects the result of the close. If it succeeded then we didn't really abort, but the stream was cleaned up which is probably what the caller wanted.

In short, I made abort() succeed in more cases, because success is more useful than failure.
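In other words, a sketch of the revised behaviour:

    const writer = ws.getWriter();
    const closed = writer.close();  // us.close() starts
    const aborted = writer.abort(); // waits for us.close() to finish and
                                    // settles with the result of that close
    // if close() succeeds, abort() fulfills even though nothing was aborted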

@domenic (Member) commented Jan 6, 2017

This was fixed in #619.
