-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use a different JSON library #47
Comments
When genqlient generates output types, it generates whatever code is necessary to unmarshal them. Conversely, when it generates input types, it generates whatever code is necessary to marshal. This is all that's needed for genqlient itself: it never needs to marshal output types or unmarshal input types. But maybe you do! (For example, to put the responses in a cache, which is the use case that @csilvers hit at Khan, although there are others one can imagine.) While we can't support every serialization format you might want (at least not without adding plugins or some such), it's not unreasonable to expect that since genqlient can read JSON, it can write it too. Sadly, in the past this was not true for types requiring custom unmarshaling logic, for several reasons. In this commit I implement logic to always write both marshalers and unmarshalers whenever they're needed to be able to correctly round-trip the types, even though genqlient doesn't do so. I wasn't starting from scratch, since of course we already write both marshalers and unmarshalers in some cases. But this ended up requiring surprisingly large changes on the marshaling side, mostly to correctly support embedding (which we use for named fragments). Specifically, as the comments in `types.go` discuss, the most difficult issue is spreads with duplicate fields, which translate to Go embedded fields which end up hidden from the json-marshaler. Ultimately, I had to do things quite differently from unmarshaling, and essentially flatten the type when we write marshaler. But in the end it's not so ugly -- indeed arguably it's cleaner! Mainly it's just different. In general, I begin to wonder whether using `encoding/json` at all is really right for genqlient: we're doing a lot of work to appease it, despite knowing what our types look like. I think it would still be a significant increase in lines of code to roll our own, but that code would perhaps be simpler, and would surely be faster (although if we just want the speed gains we could use another JSON-generator library, see also #47). Anyway, something to think about in the future. Test plan: make tesc Reviewers: marksandstrom, steve, adam, miguel, mahtab, jvoll
As mentioned in #120, another option is to roll our own, making use of what we know about our types. I suspect it's not worth it yet, but if our (un)marshalers start having to get even more complicated, it may be, since we end up doing a lot of work to get |
go-codec also supports codegen for json (and other encoding formats as well) I believe, via |
ugorji/go-codec is worth adding to the list above. It is described here.
I hadn't mentioned it before, because I'm not sure that anything but json is relevant to us, and so it's possible part of ugorji/go popularity is only due to non-relevant formats. However, if it's well maintained, and has the right features, it's worth considering.
I like to use popularity as a proxy for package quality, but comparing other things like contribution rate is a little more work. |
I've updated the issue description to add the wider range of considerations we've added here, and some notes from a quick look at a few of the options. |
## Summary: When genqlient generates output types, it generates whatever code is necessary to unmarshal them. Conversely, when it generates input types, it generates whatever code is necessary to marshal. This is all that's needed for genqlient itself: it never needs to marshal output types or unmarshal input types. But maybe you do! (For example, to put the responses in a cache, which is the use case that @csilvers hit at Khan, although there are others one can imagine.) While we can't support every serialization format you might want (at least not without adding plugins or some such), it's not unreasonable to expect that since genqlient can read JSON, it can write it too. Sadly, in the past this was not true for types requiring custom unmarshaling logic, for several reasons. In this commit I implement logic to always write both marshalers and unmarshalers whenever they're needed to be able to correctly round-trip the types, even though genqlient doesn't do so. I wasn't starting from scratch, since of course we already write both marshalers and unmarshalers in some cases. But this ended up requiring surprisingly large changes on the marshaling side, mostly to correctly support embedding (which we use for named fragments). Specifically, as the comments in `types.go` discuss, the most difficult issue is spreads with duplicate fields, which translate to Go embedded fields which end up hidden from the json-marshaler. Ultimately, I had to do things quite differently from unmarshaling, and essentially flatten the type when we write marshaler. But in the end it's not so ugly -- indeed arguably it's cleaner! Mainly it's just different. One thing to note is that we do marshal `__typename` based on what we know about the types; users need not fill it in (and if they do we'll ignore it). This seemed to me to be a better UX, and didn't add much complexity. In general, I begin to wonder whether using `encoding/json` at all is really right for genqlient: we're doing a lot of work to appease it, despite knowing what our types look like. I think it would still be a significant increase in lines of code to roll our own, but that code would perhaps be simpler, and would surely be faster (although if we just want the speed gains we could use another JSON-generator library, see also #47). Anyway, something to think about in the future. ## Test plan: make tesc Author: benjaminjkraft Reviewers: csilvers, StevenACoffman, benjaminjkraft, dnerdy, aberkan, jvoll, mahtabsabet, MiguelCastillo Required Reviewers: Approved By: StevenACoffman, dnerdy Checks: ✅ Test (1.17), ✅ Test (1.16), ✅ Test (1.15), ✅ Test (1.14), ✅ Lint, ✅ Test (1.17), ✅ Test (1.16), ✅ Test (1.15), ✅ Test (1.14), ✅ Lint Pull Request URL: #120
To do: take a look at how |
It's unfortunate that If using CBOR becomes an option for you then also see Extended Diagnostic Notation (it represents binary CBOR data as JSON with extensions). DISCLAIMER: I contribute docs & ci to fxamacker/cbor. It's secure, fast, memory efficient, and has deterministic encoding. |
@x448 it sounds like the library you describe implements some non-JSON encoding, no? We have to use JSON because it's what GraphQL servers use. |
Quick notes on reading the
|
There are a few potential ideas here, which may coordinate or conflict.
First, any code that's using genqlient should be a great match for one of the libraries that does JSON via codegen rather than reflection, which are theoretically way better for perf. @StevenACoffman and @csilvers pointed me to easyjson, ffjson, go-codec,
jsonparser,and gojay, and Steve posted some more details about them below; I'd also want to run our own benchmarks on actual genqlient-generated code.On the other hand, right now we do a lot of work to try to do what we want within the
encoding/json
API, so it might instead be better to use a library that has a different set of extension points more suitable to our needs. Steve pointed me to mapstructure which is definitely aimed at what we are trying to do in the case of abstract types and custom unmarshalers, which are the most complex case on the decoding side. Sadly it looks like it has somewhat limited support for encoding (see mitchellh/mapstructure#166), and while it has more flexible support for embedding than the standard library, it will only simplify our unmarshalers (by making it easier to merge embedded structs), not eliminate them.Switching to mapstructure would also preclude switching to one of the other code-gen libraries, which tend to be compatible with
encoding/json
. Some do add additional options, so we could explore that. (Of those linked above, go-codec seems to have the most extension points, and would likely have similar benefits to mapstructure, although like mapstructure its support for embedding is likely not flexible enough to solve all our problems. It also has support for other encodings, which could be useful for e.g. putting your responses in a cache, or if you use some strange transport for your GraphQL (not likely). Note that it's possible golang/go#5901 will add some of the features we'd like to the standard library, although the timeline on that is very unclear to me.)One potential future need comes from #30; which for fields will require special handling in (un)marshaling to (un)marshal several layers of JSON structure into/from one layer of Go structure. I didn't see support from any of the libraries for that, although mapstructure might at least provide better tools for us to do it ourselves, or one could imagine any of the libraries might add it (essentially what we want is support for a tag
json:"a.b.c"
). Another potential future need is more options for how to handleomitempty
; right now we match JSON because I think that's the least-bad default (see #103) but it's got some problems for sure. (And now we probably do want to try to keep that behavior, at least as the default, for better or worse, which could be tricky for libraries that have different behavior.)Finally, as mentioned in #120, another option is to roll our own, making use of what we know about our types to simplify things. I suspect it's not worth it yet, but if our (un)marshalers start having to get even more complicated, it may be; we don't need to implement nearly all of the behavior of the general-purpose libraries since we know what we want our types to look like.
For any of the third-party codegen options, we do need to make sure this won't cause trouble with custom scalars: it might be a little surprising if you have to rerun genqlient because you changed a json tag in a custom scalar, although it's not a huge problem in practice I expect (especially since probably most custom scalars will have
UnmarshalJSON
methods). In any case we could also put this behind a flag. A first-party implementation could call out toencoding/json
for custom scalars, of course.The text was updated successfully, but these errors were encountered: