Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support proper JSON for Value and Duration #1495

Closed
wants to merge 7 commits into from

Conversation

alexander-fenster
Copy link
Contributor

Fixes #839.

This PR adds support for proper JSON encoding for google.protobuf.Value, google.protobuf.Struct, and google.protobuf.Duration. The "proper JSON" is described in the corresponding protos: duration, struct, and here.

To make sure this change does not break anyone, the modifications to .toObject are hidden under the new undocumented option values: true. The problem here is that we already have an undocumented option json: true that changes the behavior for google.protobuf.Any, so I cannot safely reuse it without a possibility of breaking users' code.

So, while we're in 6.x, use {json: true, values: true} to make .toObject generate proper JSON representations for these types. In 7.x I plan to remove values and properly document json (and probably make it enabled by default - this needs to be discussed).

Note that .fromObject is safe to be updated since it will now work in cases where it previously failed (e.g. when the duration of "3s" is passed instead of an object with seconds and nanos), and I consider this to be a feature and not a breaking change.

See also: #1258 - after I spent some time working on this one, I think I will be able to properly review that one too :) (cc: @johncsnyder - thanks for your PR, we'll finally get it merged soon!)

src/wrappers.js Outdated
var Value = root.lookup("google.protobuf.Value");

for (; i < names.length; ++i) {
fields[names[i]] = googleProtobufValueFromObject(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just make this fields[names[i]] = Value.fromObject(object[names[i]])? That should be properly delegated to the fromObject wrapper here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thank you! (changed both here and in toObject)

Copy link
Contributor

@murgatroid99 murgatroid99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the heuristics in the Value and Struct fromObject wrappers for interpreting the input differently. With those functions the idea is that you can take an arbitrary JSON object and convert it to a protobuf message, but those heuristics create bizarre and potentially confusing exceptions. I think it would be better to either choose the code path based on the value of the flags, like with toObject, or have separate methods.

@alexander-fenster
Copy link
Contributor Author

@murgatroid99 The heuristics are there because it will be a breaking change without them - .fromObject for google.protobuf.Struct will behave differently if passed a representation of google.protobuf.Struct.

Unfortunately, .fromObject does not accept any conversion options that would've helped here. Making it accept an options object and pass it down to all fields won't be too hard, but I don't really like this idea: it will only make things more complicated and harder to support.

I feel that probably the better alternative is to remove heuristics and mark this as a breaking change for v7.0. At the same time it will allow us drop the values option for toObject, leaving just the json option. Given that #1234 was recently updated and now passes CI, v7.0 might be not too far away.

What do you think?

@murgatroid99
Copy link
Contributor

Just removing the heuristics has its own problems. If you do that, it's impossible to convert from the direct object representation of a Struct to a Struct message.

@alexander-fenster
Copy link
Contributor Author

it's impossible to convert from the direct object representation of a Struct to a Struct message.

True. And that will match the experience in other languages. The spec explains how different types must be encoded in JSON. Here, protobuf.js v6 (and earlier) violates the spec, while I plan to make protobuf.js v7 compatible with the spec.

Example in Python: given this test.proto:

syntax = "proto3";
import "google/protobuf/struct.proto";
package test;
message Test {
  google.protobuf.Struct struct = 1;
}

and this test code:

from google.protobuf.json_format import MessageToJson, Parse
from google.protobuf.struct_pb2 import Struct, Value
from test_pb2 import Test

value = Value(string_value="test")
fields = {"key": value}
struct = Struct(fields=fields)
message = Test(struct=struct)

# writes JSON
with open("test.json", "w") as f:
 f.write(MessageToJson(message) + "\n")

# reads JSON
with open("test.json", "r") as f:
  json_data = f.read()
  parsed = Parse(json_data, Test())
  print(parsed)

it will write and then successfully read the following JSON representation of google.protobuf.Struct:

{
  "struct": {
    "key": "test"
  }
}

which corresponds to the following internal structure:

struct {
  fields {
    key: "key"
    value {
      string_value: "test"
    }
  }
}

Now, if you try to pass the direct representation to this code:

{
  "struct": {
    "fields": {
      "key": {
        "stringValue": "test"
      }
    }
  }
}

the Python code won't read it properly, instead getting this:

struct {
  fields {
    key: "fields"
    value {
      struct_value {
        fields {
          key: "key"
          value {
            struct_value {
              fields {
                key: "stringValue"
                value {
                  string_value: "test"
                }
              }
            }
          }
        }
      }
    }
  }
}

So basically, I'm not aware of a way to read the direct JSON representation of google.protobuf.Struct in other languages, and I don't see why protobuf.js must be different here.

The only concern here is the compatibility. We definitely cannot drop the ability to read the direct representation without bumping the major, so - as I see it - we either go to v7 dropping the ability to read the direct representation with .fromObject() (other ways will still be possible - e.g. passing the direct representation directly to Struct.create()), or we keep using the heuristics in v6 in hope that no special case will arise and the majority of users will be happy.

The third alternative would be to provide an option for .fromObject(), but as I said, I dislike this idea for several reasons:

  • this is something very JS-specific, this option will move us away from compatibility with other languages;
  • this defeats the purpose of .fromObject() which is, as I see it, to be able to read and understand whatever is given as an input;
  • this is a change to all the generated code.

What do you think?

@murgatroid99
Copy link
Contributor

JavaScript objects are not JSON. fromObject does not claim to take the JSON representation as an input, and aside from the google.protobuf.Any handling, it does not currently attempt to do so. I do not believe we should dramatically change its functionality by making it only accept the JSON representation of Struct as input. That wouldn't even be consistent with the other well known types.

I think the right approach here may be to provide a separate fromJSON function that only accepts the JSON representation of a message as input. That resolves the ambiguity we have here, and it can be less lenient than fromObject is in some places.

@alexander-fenster
Copy link
Contributor Author

OK, that sounds like a good solution - I will look at the code to see if I can implement it this way. I suppose the JSON functionality for google.protobuf.Any will also need to be moved to .fromJSON within the next semver major bump.


// otherwise, it's a JSON representation as described in google/protobuf/struct.proto
var self = this;
var message = googleProtobufValueFromObject(object, function(obj) { return self.create(obj); });

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe create() is a static method--is that correct?

nullValue: 0
});
}
if (typeof object === "number") {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is missing boolValue.

// something that has just one property called stringValue, listValue, etc.,
// and possibly a property called kind, is likely an object we don't want to touch
var names = Object.keys(object);
if (names.length === 1 && names.match(/^(?:null|number|string|list|struct)Value$/) ||

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are cases where names.match() will be called on an Array object that doesn't have the match() method. For example, you might have an obj value that is { content: text } that results in a names array with a length of 1 (but no match() method).

Maybe add an undefined check here?

if (names.length === 1 && names.match && names.match(...) ...

@danielhochman
Copy link

danielhochman commented Mar 11, 2021

I think Timestamp belongs in this PR as well since it has a string JSON representation despite being a Message with nested fields in its proto definition. I would be happy to add it as a followup if this is merged at some point without it.

timestamp.proto JSON mapping specification

@alexander-fenster
Copy link
Contributor Author

alexander-fenster commented Aug 4, 2021

Folks, for those who follows this PR and, in general, follows the topic of proper proto3 JSON serialization/deserialization according to the protobuf spec, I wanted to post a quick update here. For reasons not directly related to anything in this PR, I won't be able to complete it here soon, so I released the proto3 JSON serialization/deserialization code as a separate npm package.

$ npm install proto3-json-serializer

At this moment, we only support instances of protobuf.Message which should be enough for most use cases. Feel free to try it and create issues in that repository in case if you need anything to be fixed; contributions are also welcome :)

This separate package will cover the workflows I need at this time; if I have some free time to backport its functionality directly to protobuf.js (e.g. by generating .fromProto3JSON / .toProto3JSON together with .fromObject / .toObject), I'll send a separate pull request here, but I cannot promise any ETA on that.

@nicain
Copy link

nicain commented Oct 7, 2021

This PR is blocking googleapis/nodejs-ai-platform#28 @alexander-fenster Is there an owner for getting this merged?

@alexander-fenster
Copy link
Contributor Author

@nicain This PR is not going to be merged because the original approach - to make proto3 JSON (de-)serialization using the existing fromObject / toObject method - is probably not what we want. I suggest that we (and everyone else) use the separate proto3-json-serializer NPM module that I wrote for this purpose, that way it will help produce compliant JSON without breaking or complicating the existing protobuf.js code.

Let's continue the discussion in the PR you mentioned.

@ideasculptor
Copy link

Forcing the use of a package for json serialization doesn't make much sense to me. Generally, I don't care about converting protobufs TO json. However, I do want to be able to take a javascript object that has the same structure as a protobuf schema and use Type.fromObject(msg) (or Type.create(msg) but that seems like asking a lot) to get a protobuf - even if a date field has a Date() in it instead of { seconds: x, nanos: y }

Is there some way to set that up? It's really not clear from reading this, though I have yet to delve into the source code to see what I can understand about the wrappers functionality and dynamically adding wrappers.

At the moment, I have schemaless code that is using plain javascript objects which is migrating to protobufs. We were able to define compatible protos for 100% of our existing messages except that we used Timestamp for date fields. The result is that we cannot serialize those date fields without first changing them out to { seconds: x, nanos: y }, which is quite the pain in the neck. Is there some way to automate that conversion? I don't care about the deserialization side as our consumers of serialized protos are all protobuf-aware and never require conversion back to plain javascript objects.

@dcodeIO dcodeIO deleted the value-duration branch April 7, 2022 04:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Well-known types support (Struct, Value)
7 participants