
Broken Decoding of nil map value in map #208

Closed
ajwerner opened this issue Oct 5, 2017 · 2 comments
Comments


ajwerner commented Oct 5, 2017

Commit 54210f4 seems to have broken decoding of nil map values: they are now omitted.

The below test demonstrates this behavior. In the prior commit it passes.

// Note: this test sits in package codec (github.com/ugorji/go/codec),
// alongside the existing tests, and needs the "bytes", "reflect" and
// "testing" imports.
func TestMsgpackDecodeNilMapValue(t *testing.T) {
        type Struct struct {
                Field map[uint16]map[uint32]struct{}
        }
        toEncode := Struct{Field: map[uint16]map[uint32]struct{}{
                1: nil,
        }}
        handle := &MsgpackHandle{}
        handle.MapType = reflect.TypeOf(map[interface{}]interface{}(nil))
        buf := bytes.NewBuffer(nil)
        enc := NewEncoder(buf, handle)
        if err := enc.Encode(toEncode); err != nil {
                t.Fatalf("Failed to encode struct: %v", err)
        }
        dec := NewDecoder(bytes.NewReader(buf.Bytes()), handle)
        var decoded Struct
        if err := dec.Decode(&decoded); err != nil {
                t.Fatalf("Failed to decode struct: %v", err)
        }
        if !reflect.DeepEqual(decoded, toEncode) {
                t.Fatalf("Decoded value %#v != %#v", decoded, toEncode)
        }
}

ajwerner commented Oct 5, 2017

https://github.com/ugorji/go/blob/master/codec/codec_test.go#L1753

seems like the relevant testing TODO

ajwerner referenced this issue Oct 5, 2017
Along with these changes, we support go 1.4+ across the board, and test across
the last 3 go releases (go 1.7+) and tip before each github push.

The changes articulated below are contained in this push:

expand support for careful contained use of unsafe for performance

    This mostly involved creating unsafe and safe versions of a few functions, and
    using them across the codebase.

    - []byte-->string or string-->[]byte when we know there is no write done.
      e.g. map lookup, looking up a struct field given a field name, etc
    - rt2id(...): getting a unique id for a type
    - rv2i(...):  getting an interface from a reflect.Value in our narrow context for encoding/decoding
    - ptrToRvMap: caching the reflect.Value obtained for the same pointer
    - using atomic pointers to reduce sync.RWMutex contention (lock-free model)
    - fast conversion from reflect.Value to a pointer when we know exactly what the type is.

    Note that we only support unsafe for the 3 most recent go releases.
    This is because use of unsafe requires intimate knowledge of implementation details.
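As an illustration, here is a minimal sketch of the safe/unsafe split for the zero-copy []byte-->string conversion described above. The function names are illustrative, not the package's actual internals; the unsafe version relies on the string and slice headers sharing their first two words.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringView returns a string sharing the backing array of b, with no copy.
// Only valid while b is not mutated and the string does not outlive b,
// e.g. for a transient map lookup. (Illustrative name.)
func stringView(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return *(*string)(unsafe.Pointer(&b))
}

// stringViewSafe is the fallback selected by the "safe"/"appengine" build
// tags: a plain allocating conversion.
func stringViewSafe(b []byte) string {
	return string(b)
}

func main() {
	fields := map[string]int{"Field": 1}
	key := []byte("Field")
	// Look up a struct field by name without allocating a string for the key.
	fmt.Println(fields[stringView(key)], fields[stringViewSafe(key)])
}
```

The build-tag split means the unsafe path is the default, while `-tags safe` swaps in the allocating version.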

Simplify codebase during decode, making things consistent and improving performance

    - Use a single conditional to remove duplicate code for decoding into maps, arrays, slices and chans
    - Make changes across: reflection, fast-path and codecgen modes

    Litany of changes
    - prepare safe and unsafe variants for *decFnInfo methods
    - clean up struct for decoding into structs and collections (maps, slices, arrays, chans)
      so they are consistent in the reflection path.
      Previously, I had separate paths for length-prefix vs separator-based collections.
    - Update fast-path and codecgen template generation scripts to make all code mirror themselves
    - Apply some optimizations along the way (specifically for decode)
    - removed some unnecessary reflection calls in kMap
    - add some CanSet() checks and reuse reflect.Value (if immutablekind) during decode

use safe and appengine tags to judiciously use "unsafe" by default for performance

    The tags are set up this way so that a user must explicitly pass
    one of the following tags to use the standard "slower" APIs.

json: use lookup table for safe characters to improve performance

    The previous check for characters inside a JSON string that needed
    to be escaped performed seven different boolean comparisons before
    determining that an ASCII character did not need to be escaped. Most
    characters do not need to be escaped, so this check can *typically* be
    done with a single, simple memory table lookup.

    We use [128]bool to represent safe and unsafe characters, and look up
    there instead of doing a computation.
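A hedged sketch of such a table (names and the exact set membership are illustrative; the real table lives inside the json handle):

```go
package main

import "fmt"

// jsonSafeSet marks ASCII bytes that may appear unescaped inside a JSON
// string: 0x20 and above, except '"' and '\\'. Illustrative version of the
// [128]bool table described above.
var jsonSafeSet [128]bool

func init() {
	for c := 0x20; c < 0x80; c++ {
		jsonSafeSet[c] = c != '"' && c != '\\'
	}
}

// needsEscape answers with one table lookup instead of several comparisons.
// Bytes >= 0x80 belong to multi-byte UTF-8 sequences and pass through.
func needsEscape(c byte) bool {
	return c < 0x80 && !jsonSafeSet[c]
}

func main() {
	fmt.Println(needsEscape('a'), needsEscape('"'), needsEscape('\n'))
	// → false true true
}
```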

use sync/atomic to lookup typeinfo without synchronization

    Also, move from using a map to doing a binary search when looking
    up typeInfo. This should arguably be faster.
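For illustration, a minimal sketch of a binary-search lookup over a slice kept sorted by type id (type and function names here are hypothetical):

```go
package main

import (
	"fmt"
	"sort"
)

// typeInfo is a stand-in for the per-type metadata cached by the codec.
type typeInfo struct {
	rtid uintptr // unique id for the reflect type
	name string
}

// findTypeInfo looks up the entry for rtid in a slice sorted by rtid,
// using binary search instead of a map.
func findTypeInfo(infos []typeInfo, rtid uintptr) (*typeInfo, bool) {
	i := sort.Search(len(infos), func(i int) bool { return infos[i].rtid >= rtid })
	if i < len(infos) && infos[i].rtid == rtid {
		return &infos[i], true
	}
	return nil, false
}

func main() {
	infos := []typeInfo{{10, "map"}, {20, "slice"}, {30, "struct"}}
	if ti, ok := findTypeInfo(infos, 20); ok {
		fmt.Println(ti.name) // slice
	}
}
```

Combined with an atomic pointer to the sorted slice, reads need no lock at all; writers copy, insert, and swap the pointer.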

optimize for map with string keys to avoid allocation if possible

re-introduce decDriver.DecodeStringAsBytes

    This reduces allocation significantly and is clearer.

    DecodeString and DecodeStringAsBytes are 2 versions of the same code.

add native helper functions for performant json parsing
improved json interaction with codec package for better performance.

    One way to improve json performance is to bypass the multiple readn1
    calls in a tight loop, and just have a single interface call to do the
    combo functionality.

    json does some things a lot during decoding:
    - skip whitespace characters
    - construct numbers
    - read strings

    doing these one character at a time incurs interface indirection and indirect
    function call overhead, as it all goes through the decDriver interface to an
    implementation like *bytesDecDriver.

    We can make some major gains by exposing "combo-like" methods that expose this functionality
    as a single call (as opposed to calls that do one byte at a time).
        // skip will skip any byte that matches, and return the first non-matching byte
        skip(accept *[256]bool) (token byte)
        // readTo will read any byte that matches, stopping once no longer matching.
        readTo(in []byte, accept *[256]bool) (out []byte)
        // readUntil will read, only stopping once it matches the 'stop' byte.
        readUntil(in []byte, stop byte) (out []byte)

    Currently, only the json handle uses these methods, giving it much better performance.
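A standalone sketch of these combo calls over a []byte input (the real methods live on the decoder's reader; these free functions are only illustrative):

```go
package main

import "fmt"

// whitespace is an accept set in the [256]bool style used above.
var whitespace = func() (a [256]bool) {
	for _, c := range []byte(" \t\r\n") {
		a[c] = true
	}
	return
}()

// skip consumes bytes in accept and returns the first non-matching byte
// plus the remaining input (starting at that byte).
func skip(in []byte, accept *[256]bool) (token byte, rest []byte) {
	for i, c := range in {
		if !accept[c] {
			return c, in[i:]
		}
	}
	return 0, nil
}

// readTo consumes bytes while they match accept.
func readTo(in []byte, accept *[256]bool) (out, rest []byte) {
	i := 0
	for i < len(in) && accept[in[i]] {
		i++
	}
	return in[:i], in[i:]
}

// readUntil consumes bytes up to (but not including) the stop byte.
func readUntil(in []byte, stop byte) (out, rest []byte) {
	for i, c := range in {
		if c == stop {
			return in[:i], in[i:]
		}
	}
	return in, nil
}

func main() {
	tok, rest := skip([]byte("  \t: 42"), &whitespace)
	fmt.Printf("%q %q\n", tok, rest)
	out, _ := readUntil([]byte(`abc"def`), '"')
	fmt.Printf("%q\n", out)
}
```

Each call processes a whole run of bytes behind one interface dispatch, which is where the gain over per-byte readn1 calls comes from.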

    In addition, since we can just request a block of text and work on it
    directly, we might as well remove the logic that does strconv.ParseFloat
    at iterator time. Instead, we just grab the block of text representing a
    number, and pass that to the strconv.ParseXXX functions.

    This resulted in the removal of all the string->number logic, which is A-ok.
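A sketch of that shape, assuming a readTo-style scan over the bytes of a number followed by a single strconv call (names here are illustrative):

```go
package main

import (
	"fmt"
	"strconv"
)

// numberChars marks bytes that can appear in a JSON number literal.
var numberChars = func() (a [256]bool) {
	for _, c := range []byte("0123456789+-.eE") {
		a[c] = true
	}
	return
}()

// readFloat grabs the full run of number characters in one pass, then hands
// the text to strconv.ParseFloat, instead of building the number digit by
// digit at iterator time.
func readFloat(in []byte) (f float64, rest []byte, err error) {
	i := 0
	for i < len(in) && numberChars[in[i]] {
		i++
	}
	f, err = strconv.ParseFloat(string(in[:i]), 64)
	return f, in[i:], err
}

func main() {
	f, rest, err := readFloat([]byte("-12.5e2,true"))
	fmt.Println(f, string(rest), err)
}
```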


    We saw performance gains when using [256]bool arrays in place of multiple
    comparisons. We continue to leverage that pattern to get some performance
    improvements without the need to pass callback functions for matching
    bytes. This is good because the callbacks were costing us over 10% in
    performance.

    Performance gains are very good now over the last few code iterations.

re-packaged all language-specific code in appropriately named files

json: add PreferFloat option to default to decoding numbers as float

test suite and organization

    We now use test suites to define and run our tests, making
    most of what tests.sh and bench.sh did unnecessary.

    We now drive scripts from run.sh, and those just call
    go test ... for running tests and benchmarks.

    This required some re-factoring in the tests to make them all work.

    organize tests so they depend on helper functions, allowing for many suites.

    This affords XSuite, CodecSuite and AllSuite.

    It also allows us to separate the tests from benchmarks, and further
    separate benchmarks into codec+stdlib only, and external ones.

codecgen: remove unsafe - delegating to codec's safe/unsafe instead.

    codecgen will now use whatever tag (safe or unsafe) the codec package is built with.

    To do this, expose stringView to codecgen (as genHelperDecoder.StringView(...))
    and use that within the generated code.

conditional build: to support go1.4+

    support go1.4+ and use goversion_* to conditionally compile things based on go version

codecgen: gen.go is only used by codecgen execution, behind build tag "codecgen.exec"

    Do this by
    - adding the build tag "codecgen.exec" to gen.go
    - moving the definition of GenVersion to gen-helper.go.tmpl (which is used at runtime)
      which requires adding a "Version" field to genInternal struct
    - passing the "codecgen.exec" tag as one of the values to "go run ..."

    Also, codecgen depends on fastpath support.
    Document that, and update run.sh appropriately.

    Also, values_codecgen_generated_test.go MUST not have the 'x' build tag set.
    That 'x' build tag is only for codebases which are not codec based.

    Code coverage is currently at 71%, which is just above our goals.

----------

Testing
    - expand list of float values to test
    - add a true large string for testing
    - add testUseIOEnc support to json-iterator benchmark
    - add a test suite for running all codec tests and grabbing codecoverage metrics
    - add test for json indent
    - exercise all paths in fast-path using full-featured generated type

Misc:
    - add json-iterator and easyjson to the benchmarks

ugorji commented Oct 11, 2017

Fixed via 50530d8

@ugorji ugorji closed this as completed Oct 11, 2017