-
Notifications
You must be signed in to change notification settings - Fork 435
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add EachKey #45
Add EachKey #45
Conversation
Initially this change was part of struct unmarshaling PR #29, but it became too big, and there were too many optimizations/fixes in master. Will be merging this functionality separately. |
Hello Leonid, any news when this will be merged? In my case I have lots of different JSON requests with several fields on top level so this helper will allow to prevent parsing the whole content several times. |
Well, initially I did not want to merge it because thought I will find better API. I do not have too much time right now as I fully focused on http://github.com/buger/gor right now, but if you think this API make sense, I may consider merging it as it is. |
Ah, ok, then let me first experiment with this patch first to understand if this is really comfortable to work with - maybe some ideas arise during work. I'll provide a feedback as soon as I get my hands dirty into this. |
I played a bit with this. It was pretty comfortable to use. Here is struct I have: type publishAPICommand struct {
Channel Channel `json:"channel"`
Client ConnID `json:"client"`
Data json.RawMessage `json:"data"`
} Here is benchmark code: var publishKeys = [][]string{
[]string{"channel"},
[]string{"data"},
[]string{"client"},
}
var publishData = []byte(`{"channel": "1", "data": {"input": "test"}, "client": "xxx"}`)
func BenchmarkPublishUnmarshal(b *testing.B) {
for i := 0; i < b.N; i++ {
var cmd publishAPICommand
err := json.Unmarshal(publishData, &cmd)
if err != nil {
panic(err)
}
}
}
func BenchmarkPublishUnmarshalJsonparser(b *testing.B) {
for i := 0; i < b.N; i++ {
var cmd publishAPICommand
var parseError error
jsonparser.EachKey(publishData, func(idx int, value []byte, vType jsonparser.ValueType, err error) {
switch idx {
case 0: // channel
if err != nil {
parseError = err
return
}
if vType != jsonparser.String {
parseError = errors.New("channel must be string")
return
}
cmd.Channel = Channel(value)
case 1: // data
if err != nil {
parseError = err
return
}
if vType == jsonparser.Null {
cmd.Data = []byte("null")
} else if vType == jsonparser.String {
cmd.Data = []byte(fmt.Sprintf("\"%s\"", string(value)))
} else {
cmd.Data = value
}
case 2: // client
if err != nil {
parseError = err
return
}
if vType != jsonparser.String {
parseError = errors.New("client must be string")
return
}
cmd.Client = ConnID(value)
}
}, publishKeys...)
if parseError != nil {
panic(parseError)
}
}
} Benchmark results:
Pretty fast! As I said It was comfortable to use, I can handle unmarshal errors too, maybe the only thing is var publishKeys = map[string][]string{
"channel": []string{"channel"},
"data": []string{"data"},
"client": []string{"client"},
}
jsonparser.EachKey(publishData, func(idx string, value []byte, vType jsonparser.ValueType, err error) {
switch idx {
case "channel":
...
} Not sure about this though - i.e. less typing in one place but more typing in another. But allows to extend keys description without worrying about indexes - just give a proper name. Also a question - look how I handle |
# Conflicts: # parser_test.go
It uses indexes instead of keys because you can query nested keys: Regarding RawMessage, this is indeed the custom case, and also you don't know the key type, so I think your way of handling it is entirely ok. Merging it. Thank you for the testing! |
Thanks! Btw I also tried to unmarshal data containing crazy JSON payload as RawMessage (the craziest I could find in internet) - here it is https://github.com/FZambia/easyerror, |
Well, this is probably because jsonparser a bit simpler and does not really fully validate JSON, which is both good and bad, depending on your case :) |
Description:
Fetch multiple keys at once. Instead of reading whole payload for each key, it does read only 1 time.
Using new API you can get nearly 2x improvement when fetching multiple keys, and the more keys you fetch, the bigger difference.
Example:
Benchmark before change:
Added 2 new benchmarks: