Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support proto2 optional fields #139

Closed
philikon opened this issue Sep 21, 2020 · 22 comments
Closed

Support proto2 optional fields #139

philikon opened this issue Sep 21, 2020 · 22 comments

Comments

@philikon
Copy link
Contributor

philikon commented Sep 21, 2020

Now that we have support for proto3 optional (#73, yay!), should we also add support for proto2's version, LABEL_OPTIONAL? Right now something like optional uint32 index = 42 translates into index: number but IMHO it should be index?: number.

I'd be happy to write the code for this.

@stephenh
Copy link
Owner

stephenh commented Sep 22, 2020

@philikon sure, that sounds like a good idea. Thanks!

@stephenh stephenh changed the title support LABEL_OPTIONAL Support proto2 optional fields Sep 22, 2020
@ssilve1989
Copy link
Contributor

I am using a patch of ts-proto in Bazel + NestJS to achieve this and opened PR #162 with some additional changes on top of what my patch had, to modify fromPartial and fromJson and I think it should be ok

@stephenh
Copy link
Owner

I looked a little bit at this, and the proto2 docs on optional:

https://developers.google.com/protocol-buffers/docs/proto#optional

When a message is parsed, if it does not contain an optional element, the corresponding field
in the parsed object is set to the default value for that field.

So, if we have an optional foo proto, should it become a foo: number b/c when decoded off the wire, it will (?) always be set to 0 if not present, or should it be foo?: number and left as undefined?

Fwiw my guess is that users mostly expect foo?: number but the docs insinuate otherwise.

Another option would be to have it be foo: number but get the default value from a prototype, so that you could still check for presence via message.hasOwnProperty('foo'). That is probably where I would lean.

@ssilve1989
Copy link
Contributor

Optionals are not deserialized into default values. The reader should check if there are any bytes at the field position and if not it gets turned into the language representation of undefined. For ex in JS that's undefined, in Scala it's an Optional of None, etc

@stephenh
Copy link
Owner

Optionals are not deserialized into default values.

I mean, I agree intuitively, that's just not what the proto docs I linked to above assert...

I.e. you say "turned into the language representation of undefined" but they say "turned into the field's default value".

Maybe the nuance is that in Scala/class-based messages, the internal field can be undefined but the generated getter function could still return the default value?

Dunno. I mean, I generally agree with you, and maybe we just do that anyway, even if its not technically what the docs assert (or I'm misunderstanding them).

@ssilve1989
Copy link
Contributor

ssilve1989 commented Feb 23, 2021

I mean, I agree intuitively, that's just not what the proto docs I linked to above assert...

I believe it is. The docs say

well-formed message may or may not contain an optional element. When a message is parsed, if it does not contain an optional element, the corresponding field in the parsed object is set to the default value for that field.

the key words to me being that the default is set when the message does not contain an optional element

for example, i am pretty sure when using just the protoc compiler with js_out to output dts+js optional fields are also not deserialised into default values when the reader is a proto2 reader. It has been a few months since i ran the protoc with js_out but i am pretty sure thats what happens. I've also been using this in production this way against java/scala/python gRPC systems and they do not set default values when reading optional fields off the wire that have not been set.

@stephenh
Copy link
Owner

...so you're saying that the message would actually contain the optional element, i.e. on the wire somewhere is the data that "fieldA is not set"?

I.e. fieldA is actually in the protobuf bytes, it's just in there as "unset".

Fwiw that's not my understanding of how the message serialization works, i.e. a primitive is either a) included and has some set value, or b) is not included and so has no value.

Can you point to docs on how to serialize/deserialize "this element/field (a primitive) is included but not set"?

@ssilve1989
Copy link
Contributor

...so you're saying that the message would actually contain the optional element, i.e. on the wire somewhere is the data that "fieldA is not set"?
I.e. fieldA is actually in the protobuf bytes, it's just in there as "unset".

No, my understanding of it is that the readers know what the message contract is, they know what field positions to look for for bytes and whether they are optional or not, therefore the readers know that when there is nothing in a particular field position, and its an optional field, to set it to the language representation of unset, be that undefined, None or whatever. I am basing this on inspecting various generated readers in various languages generated by protoc. This is different in proto3, but they have a new experimental optional flag for proto3 that I believe yields the same functionality.

@stephenh
Copy link
Owner

This is from the proto2 docs I linked to above:

"When a message is parsed, if it does not contain an optional element, the corresponding field in the parsed object is set to the default value for that field."

Which starts off sounding a lot like your "when there is nothing in a particular field position ... " except then you change to "...set it to the language representation of unset", which I can't find any description of that behavior on that proto2 page.

Fwiw the java tutorial has:

// protoc
optional string email = 3;

// java
public boolean hasEmail();
public String getEmail();

Which matches my recollection of proto2 from when I used it before: the getEmail() getter when unset still returns "" (which is what that proto2 doc snippet I referred to is talking about) and you have to specifically use the hasEmail() "hazzer" to check for "was this "" really sent or am I just getting a default".

I.e my point is generally that class-based languages have two ways (methods) of accessing email: one getter that always returns the default value for the primitive type and one "hazzer" (their term) that says "was this actually present or are you being fed a default".

It's hard for ts-proto to provide two of these capabilities without something like a prototype and hasOwnProperty. Which we can do (ts-proto originally worked that way even for non-optional values) but it's, dunno, just slightly different b/c then Object.keys doesn't have the email field in it (in this example) b/c it's set on the prototype. Object.keys and entries breaking is why I moved ts-proto off of this approach.

I just checked the Java and C++ tutorials on that page, and C++ also looks like it does the "getter is an empty string + separate hazzer method".

Can you link to generated code for proto2 optional in other dynamic languages? My guess is that they will all have getters and hazzers.

@ssilve1989
Copy link
Contributor

Oof 🤦 so i looked into this today and it turns out we had some patches to generators where I work that made it function as i described, but you are correct in your description of how the protobuf spec is intended to work.

@stephenh
Copy link
Owner

Lol, well, that explains why we were talking in circles. :-D

Fwiw I get why you're are doing it that way, I personally think it makes more sense...

But not sure if that means ts-proto should do it.

I suppose it could end up being another option flag.

@eyalpost
Copy link

Wanted to try and convince you to either make it optional indeed for proto2 optionals or at least provide a flag which does this. Indeed proto2 expects deserializers to return the default value when accessed directly (i.e. via getter) and add a hazzer function which checks whether the value was actually set. This is because proto2 has custom default values and these are implemented by the deserializer (which is why it can't return "undefined" if a value doesn't exist).
Because your generator creates plain ts interfaces and can not provide a hazzer I think that the correct thing to do it make those field ts options (i.e. ?) so the reader could actually differentiate between a set and unset value and it will be up to the reader to use the default value when the property isn't set.

@stephenh
Copy link
Owner

@eyalpost I agree the lack of hazzers is unfortunate; I've historically gone back/forth on "would clients prefer being able to detect not-present" or "would clients prefer not having to 'or' their own default value", and kind seems like at the end-of-the-day it's going to be on a case-by-case basis.

That said, I've also felt this pain in ts-proto itself, as the protobuf descriptors themselves have an optional int field oneof_index that is a) a proto2 file, b) an optional int field, and c) the values are 0-based, so if we read 0 with ts-proto's current approach, we just cannot tell "...is your index 0? or did you mean unset?".

If you'd like to submit a PR that changes the behavior out-right to index?: number, as per @philikon 's original request, we could merge it and see how many users request the old behavior, and then implement another flag. :-) 🤷

@deser
Copy link

deser commented Mar 11, 2022

@ssilve1989 can you share example how did you configure ts-proto with Bazel?
Thanks

@ssilve1989
Copy link
Contributor

@deser We wrote an aspect based on the one found here: https://github.com/bazelbuild/rules_nodejs/blob/4.x/packages/labs/grpc_web/ts_proto_library.bzl

although aspect-based proto compilation does not look to be the recommended way of doing this anymore so we will be switching over to use an implementation by stackb/rules_proto when it is implemented

@deser
Copy link

deser commented Mar 16, 2022

Thanks :)

@val-o
Copy link

val-o commented Jun 9, 2023

It would be great to have the same behavior as optionals in proto3, i.e. { scalarOptional: number | undefined }

@stephenh
Copy link
Owner

@val-o I'm traveling so don't have time to fully remember/study the issue, but iirc / kinda guessing that @Vilsol 's PR from last year fixes this:

https://github.com/stephenh/ts-proto/pull/484/files#diff-ad74198e9093dc9e981de421c0e333bb02e31091258228b27ec0dfe1932815a0R279

And we just forgot to close this issue out... @Vilsol does that sound right to you?

@slinkydeveloper
Copy link

As a data point, I stumbled on this recently while using the descriptor generated by this library with another proto library: andrewhickman/prost-reflect#49

@stephenh
Copy link
Owner

Fwiw, per an earlier update, I'm 95% sure this has been fixed b/c ts-protos own/internal ts-proto-descriptors library that we use for reading messages from the protoc compiler handles oneofIndex being an proto2 optional field via a hazzer check.

@slinkydeveloper
Copy link

@stephenh in my case I'm not reading the proto descriptor using this library, but I'm serializing the descriptor using this library and then consuming it from another library.

@stephenh
Copy link
Owner

Hm, @slinkydeveloper are you using the usePrototypeForDefaults option? The oneof_index in your bug report is exactly what we had to handle too, although now I'm wondering if we handled the reading correctly, but not the writing, i.e. our encode methods that turn messages to bytes may always be outputting oneof_index=0 even if the 0 is the default from the prototype...

Lmk if you're using usePrototypeForDefaults, and if so, yeah, we probably need a new issue to track writes not serializing the default.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants