Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: codec: remove json-iterator and use official encoding/json #157

Closed
Andrew-M-C opened this issue Jan 13, 2024 · 16 comments
Closed
Labels
proposal New external API or other notable changes

Comments

@Andrew-M-C
Copy link
Contributor

The performance of official encoding/json. The saying that "official json is slow" is out-of-day. Besides, json-iterator use some low-level packages those may panic when the major version of go advances.

Json-iterator panic issues:

Json-iterator performance issues:

If you really have performance concern, please use sonic, NOT json-iterator. If not, choose the official one.

Besides, less dependency, less safety issues.

@WineChord
Copy link
Contributor

WineChord commented Jan 15, 2024

I modified and ran the benchmark they provided, here are the results (using go1.21.2):

[wineguo@VM-116-92-tencentos go-benchmark]$ go test -run ^$ -bench ^Bench.*Small$ . 
goos: linux
goarch: amd64
pkg: github.com/json-iterator/go-benchmark
cpu: Intel(R) Xeon(R) Platinum 8255C CPU @ 2.50GHz
BenchmarkJsonParserSmall-10              1560852               753.5 ns/op            80 B/op          4 allocs/op
BenchmarkJsnoiterPullSmall-10            2083582               584.3 ns/op            80 B/op         10 allocs/op
BenchmarkJsnoiterReflectSmall-10         1786041               669.3 ns/op           176 B/op          3 allocs/op
BenchmarkEncodingJsonStructSmall-10       402614              2826 ns/op             400 B/op          8 allocs/op
BenchmarkEasyJsonSmall-10                1602778               746.4 ns/op            64 B/op          2 allocs/op
PASS
ok      github.com/json-iterator/go-benchmark   8.763s
[wineguo@VM-116-92-tencentos go-benchmark]$ go test -run ^$ -bench ^Bench.*Medium$ . 
goos: linux
goarch: amd64
pkg: github.com/json-iterator/go-benchmark
cpu: Intel(R) Xeon(R) Platinum 8255C CPU @ 2.50GHz
BenchmarkDecodeStdStructMedium-10                  55962             21655 ns/op             512 B/op         12 allocs/op
BenchmarkEncodeStdStructMedium-10                1449410               834.3 ns/op           216 B/op          2 allocs/op
BenchmarkDecodeJsoniterStructMedium-10            207370              5563 ns/op             384 B/op         41 allocs/op
BenchmarkEncodeJsoniterStructMedium-10           1619971               734.6 ns/op           216 B/op          2 allocs/op
BenchmarkDecodeEasyJsonMedium-10                  107053             11232 ns/op             160 B/op          4 allocs/op
BenchmarkEncodeEasyJsonMedium-10                 2031669               597.1 ns/op           600 B/op          4 allocs/op
PASS
ok      github.com/json-iterator/go-benchmark   9.772s
[wineguo@VM-116-92-tencentos go-benchmark]$ go test -run ^$ -bench ^Bench.*Large$ . 
goos: linux
goarch: amd64
pkg: github.com/json-iterator/go-benchmark
cpu: Intel(R) Xeon(R) Platinum 8255C CPU @ 2.50GHz
BenchmarkJsonParserLarge-10                30417             39430 ns/op               0 B/op          0 allocs/op
BenchmarkJsoniterLarge-10                  13920             86405 ns/op           12496 B/op       1133 allocs/op
BenchmarkEncodingJsonLarge-10               4906            240958 ns/op             440 B/op         10 allocs/op
PASS
ok      github.com/json-iterator/go-benchmark   4.888s
[wineguo@VM-116-92-tencentos go-benchmark]$ 

The code is available at my fork: https://github.com/WineChord/go-benchmark/tree/to_mod

The argument regarding performance doesn't seem to be entirely convincing on its own.

The point about "less dependency, fewer safety issues" is valid, but it has been compromised for performance. The jsoniter was introduced by an avid contributor within our company for performance enhancement. This contributor proposed a trade-off between safety and performance. Now, @Andrew-M-C is suggesting we revert this trade-off. However, I question whether future developers might revert back to the current state for the same performance reasons.

It's important to note that users can always register and replace the json implementation. The framework's implementation merely provides a default version; it doesn't restrict users from using their preferred implementation.

Given this, the current implementation could be either jsoniter or std. Considering the framework's plug-in style, the std implementation might be more suitable. However, I anticipate that performance-focused contributors might reintroduce the same performance enhancements. In other words, maintaining the status quo might be the best solution. As the saying goes: "The more you code, the less you code". In such a contentious situation, I propose we leave things as they are.

PS. There is indeed a panic issue related to jsoniter for Go versions prior to Go1.18. However, this has been addressed in the current framework by updating the Go directive and depending on the latest jsoniter.

PPS. Sonic is not the solution as it only targets the amd64 architecture. It can be seen as a further sacrifice for performance, even more so than jsoniter. However, the framework still allows you to re-register any json implementation you prefer. From this perspective, I maintain that we don't need to change any code.

@WineChord
Copy link
Contributor

WineChord commented Jan 15, 2024

emmm... I've discovered another crucial point: compatibility.

codec.JSONAPI is exported and typed as jsoniter.API and is deeply tied to the JSONSerialization implementation.

PR158 would break compatibility.

@tocrafty
Copy link
Contributor

I concur that the standard JSON is much more maintainable than the JSON iterator. However, altering default behavior always carries risks. Once the JSON iterator is integrated into tRPC-Go, there is no opportunity to remove it. In this regard, I align with WineChord. Users should explicitly register their own official JSON if they expect a more stable JSON parser.

@Andrew-M-C
Copy link
Contributor Author

Andrew-M-C commented Jan 15, 2024

First of all, compatibility is also a concern.


Secondly, if you really worry about performance and decide not to use official one, please consider sonic, which is much more powerful than jsoniter.


Thirdly, the benchmark from jsoniter themself are designed to persuade programmers, which may be quite specified. that means, jsoniter is NOT ALWAYS efficient in ALL cases. We should allow programmer to replace default JSON serializer, but not choosing one third-party implementation by default. Besides, use std json package may avoid more risks, no matter safety issues, license issues, performance issues, etc.


Fourthly, the performance test by jsoniter can not cover all cases. Please look into the case provided by jsoniter, it is good at basic type unmarshalling (string, number, bool), and it intentionally sidestepped the benchmark of embeded object and array operation. But for practical programming, most of data will be given by objects with depth more than one.

Please refer to my benchmark, which covers object, array, and also embeded object and array. I had to admit that jsoniter is a bit faster than official json, but not significantly. If the rating of standard json is 10, jsoniter will be around 12~13. However, sonic may take 20!


All in all, we do not need jsoniter. For performance, we have greater alternation. For safety concern, jsoniter is unsafe.

This is why I raise this issue. I suggest using sonic or standard one, not jsoniter.

@liuzengh
Copy link
Contributor

Before the code was officially open-sourced, we did some refactoring, such as replacing the jsoniter dependency in the admin package with encoding/json. However, it seems that this work was not thorough enough. Due to compatibility issues, the proposed changes you are now suggesting may not be easily accepted. However, since trpc-go has only recently been open-sourced and is not widely dependent, the impact will be smaller.

@liuzengh
Copy link
Contributor

If there are no compatibility issues, I believe using encoding/json is a better choice.

@WineChord
Copy link
Contributor

WineChord commented Jan 15, 2024

Given that the implementation can be modified by re-registering, I admit that performance is a minor point.

Compatibility is a completely different story; we have witnessed countless complaints about the incompatibility introduced by some patches. A long time ago, we relaxed the standard and allowed them to merge. Then it is the maintainers who suffer all the blame. Raising a PR is always welcome, but merging one is a hard decision. It can be easy, but it will be a hard time for the maintainer. The PR authors are always happy with their PR being merged, but all the aftermath is left for maintainers. If users upgrade their code and find that they changed nothing but the code breaks, they will blame the maintainers, the approvers, never the one who raised the PR.

Take this exported codec.JSONAPI as an example; it first appeared three years ago (July 16th, 2020, 1:59 PM), with jsoniter being introduced four years ago (August 16th, 2019, 3:12 PM). Thereafter, it is left as it is. According to Hyrum's Law:

With a sufficient number of users of an API,
it does not matter what you promise in the contract:
all observable behaviors of your system
will be depended on by somebody.

This exported API demonstrates no difference. We cannot risk any compatibility issue to let this happen again, with so many cases already happening. The former compatibility-breaking cases have been gathered as 小作文, which can be more exquisite than the one written by Meizhu. I kindly beg you not to let this PR be the next one presented on the list.

You may say that trpc-group/trpc-go is an open-source project presented on GitHub, there is no history burden. Two points to address that:

  1. There are users who are directly switching from the inner repo to the GitHub one; they just want to get work done, instead of reading some "When you are switching trpc-go to GitHub, here is the long list you need to do. First, re-register JSON if you need the original one, or change the way you are using the codec.JSONAPI, as the real type has changed. Second, ..."
  2. Maintenance effort. We are limited by human resources, and the difference between the inner repo and the GitHub one should be minimized to compensate for that. Indeed, trpc-go has a lot of designs that can be improved, and a few we tried, break compatibility in the hard way, which became our nightmare. Since that time, we are always going to ask each and every PR: "Does this break compatibility?" If it does, it needs rethinking and be withdrawn. During the release of the GitHub version, we also made some improvements to make it look somewhat different from the inner repo. But still, some compatibility is broken, and soon we taste the bitter fruit of our own actions, and we kindly invited back those APIs like a down-and-out mouse. We all want to be cute cats, not 鼠鼠, right? >.< So, sincerely, I wish this will not be happening again, and I cordially invite you to contemplate the matter from the vantage point of a maintainer, so as to appreciate the challenges we face.

Thank you.

@Andrew-M-C
Copy link
Contributor Author

Hhhh I agree that it is so hard for maintainers @WineChord. So this is just a proposal. But I still, strongly suggest sonic.
However, as alternative, perhaps this PR is more acceptable?

@Andrew-M-C
Copy link
Contributor Author

Hhhh I agree that it is so hard for maintainers @WineChord. So this is just a proposal. But I still, strongly suggest sonic. However, as alternative, perhaps this PR is more acceptable?

The new PR allows user (like me) to use sonic instead, of which users should take responsable themself.

@WineChord
Copy link
Contributor

@Andrew-M-C I still have the question: Are you directly using the codec.JSONSerialization? Since I found no direct usage by the framework itself(not registered). So are you using it explicitly?

@Andrew-M-C
Copy link
Contributor Author

Andrew-M-C commented Jan 16, 2024

hhhh. I am afraid the discussion of this issue had gone out its original purpose.

Please allow me to explain how I find my problem and why I raised this issue. And make a briefing of my final opinion.


At the beginning, we noticed that our int64 field was always marshaled into string value. We studied and found out that it was because trpc use jsonpb as its default json serializer. So we re-register our own serializer with standard JSON.


And then I noticed the global variable JSONAPI, which is referenced by type JSONSerialization, which is registered in codec's init() function.

To be honest, I could not understand why definding JSONAPI. If users want to customize theirs own serializer, invoking codec.RegisterSerializer will do. At first I thought it was meant to provide some simple way to override the default JSON serializer. by doing this:

codec.RegisterSerializer(xxxxJSON, codec.JSONAPI)

So here came to my first proposal: jsoniterator is not safe, we should put encoding/json back.


After discussion with you guys, I finally realized your concern and I accepted that my two PRs are not the best choice.

I fully agree with @tocrafty: "Users should explicitly register their own official JSON if they expect a more stable JSON parser"


In the end, here is my final proposal (I hope this time is trully the final😂):

  1. Not making any logical changes
  2. Add "Deprecated" mark over JSONAPI, and leave tocrafty's note I mentioned above in the comment of JSONAPI

What do you think?

@WineChord
Copy link
Contributor

I think I've got your point. In your scenario, re-registering to shadow the default jsonpb has fulfilled your requirements.

The existing JSONAPI is something you have observed to be incorrect, and you want to change it, not because it's necessary for your needs, but in the pursuit of accuracy and elegance. However, this leads to the compatibility dilemma we've previously discussed.

I agree with the approach of adding comments. These comments should encapsulate the key ideas we've exhaustively discussed up to this point. For instance, they should explain that the exported method is not technically correct, but it cannot be removed due to compatibility issues. Furthermore, the comments should provide the correct approach, which involves re-registration.

@Andrew-M-C
Copy link
Contributor Author

I have closed both of my PRs. As for comments I mentioned, do you need another PR? Or simply add one in your later version? 😄

@WineChord
Copy link
Contributor

WineChord commented Jan 19, 2024

@Andrew-M-C Feel free to raise a PR for it 😄

@Andrew-M-C
Copy link
Contributor Author

I have created another PR

Andrew-M-C added a commit to Andrew-M-C/trpc-go that referenced this issue Jan 23, 2024
sandyskies pushed a commit that referenced this issue Jan 25, 2024
…be modified (#157) (#160)

docs: add comments informing that global variable JSONAPI should not be
modified (#157)

Discussions took place in #157
@Andrew-M-C
Copy link
Contributor Author

PR merged, thank you all!

sandyskies added a commit that referenced this issue May 16, 2024
### Features

- {client, naming}: allow selector to define its own net.Addr parser
(#176)
- log: provide registration for custom format encoder (#146)

### Bug Fixes

- attachment: fix possible uint32 overflows (#161)
- attachment: copy attachment size if user provides their own rsp head
(#161)
- stream: fix the memory leak issue that occurs when stream.NewStream
fails (#161)
- errs: Msg should unwrap the inner trpc error (#161)
- http: use GotConn for obtaining remote addr in connection reuse
case(#161)
- http: http trace should not replace req ctx with transport ctx (#161)
- http: do not ignore server no response error (#161)
- restful: fix timeout does not take effect which is introduced in !1461
(#161)
- log: skip buffer and write directly when data packet exceeds expected
size( #161)
- config: set empty ip to default 0.0.0.0 to avoid graceful restart
error (#161)
- config: fix watch callback leak when call TrpcConfigLoader.Load
multiple times (#161)
- server: fix unaligned 64-bit atomic words at linux-386 (#161)
- server: don't wait entire service timeout, but close quickly on no
active request (#161)
- server: do not close old listener immediately after hot restart (#161)
- config: promise that dst of codec.Unmarshal is always
map[string]interface{} (#161)
- restful: fix that deep wildcard matches in reverse order (#161)
- log: log.Info("a","b") print "a b" instead of "ab" (#161)
- stream: return an error when receiving an unexpected frame type (#161)
- stream: ensure server returns an error when connection is closed
(#161)
- stream: fix connection overwriting when a client uses the same port to
connect.(#161)
- stream: fix client's compression type setting not working (#161)
- client: remove the write operation on *registry.Node in LoadNodeConfig
to avoid data race during selecting Node (#161)
- config: re-enable Config.Global.LocalIP to perfect !1936 (#161)
- http: get serializer should also be able to unmarshal nested structure
(#161)
- http: check type of url.Values for form serialization (#161)
- http: expose possible io.Writer interface for http response body
(#161)
- client: fix client wildcard match for config (#161)
- codec: revert !2059 "optimize performance of extracting method name
out of rpc name" (#161)
- http: fix form serialization panicking for compatibility (#161)

### Documentations

- docs: add note on listen to all addresses for a service (#144) 
- docs: refine idle timeout settings documentation for clarity (#172)
- docs: add notes on client connpool idletimeout (#170)
- pool: fix typos (#167)
- docs: add notes on service idle timeout (#166)
- docs: correct the spelling error (#163)
docs: add comments informing that global variable JSONAPI should not be
modified (#157)

### Enhancements

- github-action: allow dependabot to contribute and bump cla to v2.3.2
(#174)
- {trpc, examples, test}: upgrade google.golang.org/protobuf v1.30.0 =>
v1.33.0 (#171)
- http: improve stability of http test (#168)
- go.mod: upgrade tnet version to handle negative idle timeout (#169)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
proposal New external API or other notable changes
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants