Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CosmosStore: Cleanup Cosmos calls #303

Merged
merged 7 commits into from
Jan 18, 2022
Merged

CosmosStore: Cleanup Cosmos calls #303

merged 7 commits into from
Jan 18, 2022

Conversation

bartelink
Copy link
Collaborator

Long overdue cleanup in CosmosStore - initial logic was intentionally matching V2 init logic, but it can be clarified now based on:

  • support for Autoscale from @belcher-rok
  • the V3 SDK has Database and Container objects where V2 did not

@bartelink
Copy link
Collaborator Author

I couldn't help pulling at the thread ;)
Needs testing before merge and there's no rush

@bartelink bartelink changed the title Cosmos: Refactor init logic Cosmos: Refactor Cosmos calls Nov 19, 2021
@bartelink
Copy link
Collaborator Author

Ping @ealsur I added some answers to your questions above - I still think there should be some way for the API to provide a ramp between a call that forces deserialization regardless of response code versus requiring the consumer to take on all the burden the minute they are interested in handling non-200 cases

@bartelink bartelink changed the title Cosmos: Refactor Cosmos calls CosmosStore: Refactor Cosmos calls Jan 6, 2022
@bartelink
Copy link
Collaborator Author

@ealsur bump - would be nice to conclude this discussion; it feels like this is a common requirement that the SDK does not address cleanly?

@ealsur
Copy link

ealsur commented Jan 6, 2022

@bartelink Isn't the ResponseMessage.IsSuccessStatusCode property and ResponseMessage.EnsureSuccessStatusCode() method if you want an exception to be thrown enough to fill that gap currently?

@bartelink
Copy link
Collaborator Author

bartelink commented Jan 6, 2022

@ealsur Its about avoiding the exception if you are handling a non-200 case. Let me try to explain betrer...

The ask is for something like my DeserializeResponseBody helper in:
https://github.com/jet/equinox/blob/tidy-cosmosinit/src/Equinox.CosmosStore/CosmosStore.fs#L389-L391

This enables the conditional reading of the body by first looking at the status code in https://github.com/jet/equinox/blob/tidy-cosmosinit/src/Equinox.CosmosStore/CosmosStore.fs#L392-L398

A pidgin C# translation would be

Result TryReadItem<T>(this container, string id, PartitionKey pk) {
    var rm = await container.ReadItemStreamAsync<T>(id, partitionKey, options, ct)
    rm.StatusCode switch
        case System.Net.HttpStatusCode.NotFound: return Result.NotFound
        case System.Net.HttpStatusCode.NotModified: return Result.NotModified
        case default: Result.Found(container.DeserializeResponseBody<T>(rm))

The point is you want something pithy that lets you deserialize to a specific type after you have checked the StatusCode on the ResponseMessage ( container.DeserializeResponseBody<T>(rm) in my code)

... figuring out that what ReadItemAsync does to achieve this is "simply" container.Database.Client.ClientOptions.Serializer.FromStream<'T>(rm.EnsureSuccessStatusCode().Content) was not obvious to me.

How exactly to surface it give the serializer is on the container/client and not known to the ResponseMessage is the fun bit. Things that spring to mind for me are all ugly, i.e. the response including a Func<T> and/or a Lazy<T> doesn't seem to resolve the forces well.

@ealsur
Copy link

ealsur commented Jan 6, 2022

So the ask is how to achieve typed APIs (for example, ReadItemAsync) but not throw exceptions? In your code, it would really be:

Result TryReadItem<T>(this container, string id, PartitionKey pk) {
    var rm = await container.ReadItemStreamAsync<T>(id, partitionKey, options, ct)
    if (rm.IsSuccessStatusCode)
         return Result.Found(container.DeserializeResponseBody<T>(rm))
    else
        result Result.Failed(rm.StatusCode)

This is completely missing the Diagnostics information though, so that needs to be taken in consideration.

I don't see the SDK changing the behavior (type APIs not throwing), so the ask is, what is the official snippet or recommendation?

@bartelink
Copy link
Collaborator Author

bartelink commented Jan 6, 2022

Yes and no. The ask is:

  • do the call, throw on failure etc
  • expose the StatusCode etc without triggering the reading/decoding/waiting for the content to get it
  • (if I'm still interested after that check) let me await and decode the body

The problem as I see it now is that any "snippet" / advice involves me having code my side doing container.Database.Client.ClientOptions.Serializer.FromStream<'T>(rm.EnsureSuccessStatusCode().Content), which works now but is not exactly clean or future proof

Update: Yes, your snippet as above would be acceptable; effectively the ask is to provide some way to extract and deserialize the body as exactly as ReadItemAsync does (but not pay the price/trigger exceptions if I opt not to do that)

One thing to call out is that I definitely want to see the full ResponseMessage, to extract the RequestCharge in my case, but obv others will have similar needs

@ealsur
Copy link

ealsur commented Jan 6, 2022

I'm truly sorry but there is a part I don't understand in the sequence, the only scenario where the shared Tweet makes sense is from a performance perspective (throwing causes an overhead). Assuming the operation was a failure and an exception was thrown, there is no body to deserialize. The body is only deserialized on the success path (on the failure really the backend did not send any body).

Let's run a scenario through the proposed steps, for example, a ReadItem returning 404.

  1. The server returns a 404, there is no body in the response.
  2. The SDK throws a CosmosException, the exception contains the 404 StatusCode.
  3. What body would you decode if the serializing is deferred? There is nothing to read.

All the error code situations contain a server response whose body is either empty or might contain some extra information of the error itself (case of TransactionalBatch), but nothing that it's useful for the original operation.

If your intent is simply to defer serialization only on success cases, then that's what the SDK does already.
If your intent is to not have exceptions (what the Tweet was asking) then I don't think that's a behavior that can change in the SDK and the only "workaround" would be something like the proposed snippet.

@bartelink
Copy link
Collaborator Author

bartelink commented Jan 6, 2022

The primary ask is to avoid exceptions - I fully expect and am ready to handle 304. It's not Exceptional. Its like Parse vs TryParse except we want diagnostics for both success and fail cases, not just a bool

As it happens, for my use cases, I want/need the body in all success cases.

If one cannot expect the body to be present, then I don't want to read it.

Dropping to the Stream API happens to make it possible to do what I need, but in the most abstract sense, the need is some variant of ReadItemAsync that allows one to decouple the two steps that are bundled in it at present:
a) do the work; expose the result, BUT don't throw if it is not "Successful"; I want to do my own interpretation, esp for 304
b) let me deserialize the body (feel free throw if I neglected to guard that call with a check for a StatusCode that would imply one should be available)

and the only "workaround" would be something like the proposed snippet.

I am not arguing with this - if the SDK can add a container.DeserializeResponseBody<T>(rm) then I personally am happy and see the issue as resolved.

I'm just trying to surface the whole requirement in case there is a better way to have this facility manifest in the API which e.g. covers more use cases and/or is more discoverable.

I guess a spike reimplementing ReadItemAsync in terms of the proposed explicitly exposed DeserializeResponseBody API might yield ideas...

@ealsur
Copy link

ealsur commented Jan 6, 2022

I see, so basically the ask is an API to invoke the internal serializer with a ResponseMessage response? For point operations it would be simple, but the real issue is for query responses or other feedtype responses.

For example, if you have a ResponseMessage that represents a ReadItem, you could call whatever.DeserializeBody<T>(ResponseMessage) where T is your type intended for the ReadItem. But if the ResponseMessage is coming from a ReadNextAsync from a query, then a generic API like whatever.DeserializeBody<T>(ResponseMessage) would be prompt to failure. The problem is the input can be anything.

@kirankumarkolli / @j82w - Do you folks have any thoughts on this?

@bartelink
Copy link
Collaborator Author

Yes, that represents the ask well.

You raise good points about this only applying for individual [Try]ReadItemAsync Response Messages, and I would definitely agree that putting it in the wrong place could make a mess that no amount of remarks in xmldoc can save people from.

Perhaps a "V4"-timescale revised API design can expose it in a way that is discoverable and covers all the cases. ... an, in the interim the container.Database.Client.ClientOptions.Serializer.FromStream<'T>(rm.EnsureSuccessStatusCode().Content) code snippet can live in the issue that documents the requirement until it's all resolved ;)

@j82w
Copy link

j82w commented Jan 6, 2022

Would adding an operation type to the ResponseMessage allow it to be handled correctly?

@kirankumarkolli
Copy link

ReadItemAsync is a wrapper around stream API, doing exactly the two steps mentioned in #303 (comment)

It's very simple and straight forward. As in most cases having it as an extension method like ReadItemAsync404NotThrow() on Container also might make it more idiomatic.

@bartelink
Copy link
Collaborator Author

It's very simple and straight forward. As in most cases having it as an extension method like ReadItemAsync404NotThrow() on Container also might make it more idiomatic.

I think the problem here is that, as in my case, you frequently want to handle 304 etc in addition to 404, and having a custom extension for each case won't solve that, and you need to epose the ResponseMessage at some point in order to let people pull out the RequestCharge etc.

So for me, any real solution needs to do one of two things:

  1. expose an equivalent of container.DeserializeResponseBody<T>(rm)
  2. have the caller pass a function that can do the equivalent via a Func passed to it

For example, if there was a:

TResult TryReadItem<TResult,TBody>(this container, string id, PartitionKey pk, Func<ResponseMessage, Func<TBody>> interpret, CancellationToken ct);

Then Mathias's example can become:

Result ReadOrNotFound<T>(this container, string id, PartitionKey pk, ct) =>
    container.TryReadItem(id, pk, cancellationToken = ct, interpret = (rm, readBody) =>
        rm.IsSuccessStatusCode
            ? Result.Found(readBody())
            : Result.Failed(rm.StatusCode));

And mine becomes:

        member container.TryReadItem(partitionKey : PartitionKey, id : string, ?options : ItemRequestOptions): Async<float * ReadResult<'T>> = async {
            let! ct = Async.CancellationToken
            let extractResult (rm : ResponseMessage, readBody : unit -> 'T) =
                rm.Headers.RequestCharge, // NOTE I want the request charge in all cases
                match rm.StatusCode with
                | System.Net.HttpStatusCode.NotFound -> NotFound
                | System.Net.HttpStatusCode.NotModified -> NotModified
                | _ -> readBody()|> Found }
            return! container.TryReadItem(id, partitionKey, requestOptions = Option.toObj options, extractResult, cancellationToken = ct) |> Async.AwaitTaskCorrect

While that also solves the problem, the question is how you feel about a Func that passes a Func, and the fact that there are two generic types involved in the signature.

@ealsur
Copy link

ealsur commented Jan 7, 2022

Just a comment, 304 does never happen on ReadItem, the only scenario 304 is involved is for Change Feed operations.

@bartelink
Copy link
Collaborator Author

bartelink commented Jan 7, 2022

Just a comment, 304 does never happen on ReadItem, the only scenario 304 is involved is for Change Feed operations.

I'm over 99% sure that's not true - Equinox makes extensive use of conditional gets with etags to avoid paying for retrieval of bodies that have not change

You can see the condition being applied in this file (Amusingly, I've just spotted I was erroneously labelling and logging as it as 302...)

@ealsur
Copy link

ealsur commented Jan 7, 2022

I'm over 99% sure that's not true - Equinox makes extensive use of conditional gets with etags to avoid paying for retrieval of bodies that have not change

You can see the condition being applied in this file (Amusingly, I've just spotted I was erroneously labelling and logging as it as 302...)

This is super interesting, I was not aware a Read Item could return 304 using ETag (I was under the assumption that ETag was mainly used for optimistic concurrency), and could not find this use case in our documentation either, thanks for sharing.

Having a generic deserializing method in my opinion will bring a deal of issues due to mis-use, there is no amount of documentation we can write that will avoid it.

@bartelink
Copy link
Collaborator Author

Having a generic deserializing method in my opinion will bring a deal of issues due to mis-use, there is no amount of documentation we can write that will avoid it.

Yeah, I can't ague with that.

If an API with a callback to extract from the ResponseMessage was to be provided, I'd definitely use it, but I'll definitely understand if that's not done now ... as long as there is an intent to provide such a facility in the V4 timeframe.

It seems at this point that its reasonable to park this discussion for now with the following conclusions:

  • ReadItemStreamAsync + container.Database.Client.ClientOptions.Serializer.FromStream<'T>(rm.EnsureSuccessStatusCode().Content) is the unofficial way to handle expected 304/404 values without exceptions for the moment
  • It would be good to have an issue noting this shortcoming of the API logged on the repo, and that workaround can be mentioned there

@bartelink bartelink marked this pull request as ready for review January 18, 2022 21:30
@bartelink bartelink changed the title CosmosStore: Refactor Cosmos calls CosmosStore: Cleanup Cosmos calls Jan 18, 2022
@bartelink bartelink merged commit 778cd5a into master Jan 18, 2022
@bartelink bartelink deleted the tidy-cosmosinit branch January 18, 2022 21:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants