codec: Multiple fixes, enhancements and test expansion across the board.

Details below ... Expanded Tests across the board, including but not limited to: - non-string keys - chan - swallow (decode into equivalent of /dev/null) - zero - raw ext - map keys that are structs - canonical - decoding nil map value - ... Fixed issues: - expand set of options used in testing - use bitset(256|128) instead of a large array to test whether bits are set - allow decodeValue intelligently handle whether a function requires a pointer or not. This way, folks don't have to call Addr themselves. - use bounded arrays for the cache used during decodeNaked - decNaked: use an array of reflect.Value to pre-load all decNaked stuff - everyone calls into encodeValue or decodeValue (removing all cruft) - support recognizedRtid behind a flag - separate setting a value to its zero value, from decode. - remove many helpers (e.g. predecodevalue): now just a few methods do the right things - cleanup fast path: no tryNil argument needed, and the arg is either a pointer or a base value. - support arrays well in codecgen - full support for io.Reader passed to decode - msgpack: full support for strings read from a []byte codec: collection of correctness and simplification changes Move decNaked out of the Decoder struct. The Decoder struct was too big, and decNaked is only used if decoding into an empty interface{}. So make it an explicit request to grab a decNaked if doing DecodeNaked. Use a sync.Pool so that the decNaked's can be shared across multiple decodings. ---- Add SliceElementReset as a decode option, to align with MapValueReset and InterfaceReset. This eliminates the resetSliceElemToZeroValue constant. ---- Clean out comments and commented code blocks. ---- Move the code for getting a codec function into a single place. Previously, there was much duplication of the same logic in encode.go and decode.go. Now, we put a single type "codecFner" that is a member element of Decoder and Encoder. This means that getEncFn and getDecFn go away, and fast-path looks different, because the definition of encfn and decfn have changed. This will allow us move the code to be a member of the Handle later on, which should be where it belongs. ---- Use a single pool struct to consolidate how pools are used in the codebase, for decNaked, encStruct, etc. ---- Reduce bounds-checking in json.go, by looking up an index in the string just once, and setting into a char variable. Helped shave much time from json decoding. ---- codec: add error field to (De|En)coder to flag such as unusable until reset json: differentiate empty vs nil slice when encoding/decoding clean up encoding of chans codecgen: add support for NoExtensions Extensions support causes the receiver of CodecDecodeSelf to escape, which can have dramatic performance implications. Since extensions is a seldom used feature, it is not fair for everyone to pay the price. add String method to valueType, for debugging msgpack: support DecodeBytes where valueTypeBytes comes from a msgpack string container cbor: fix extension encoding - only encode the Value component of RawExt support new Decode Option: DeleteOnNilMapValue - DeleteOnNilMapValue controls how to decode a nil value in the stream. - add a test Ensure T or ... *****T are recognized as Selfer during encode or decode - to do this, always pass checkSelfer=true when getting the function for encode/decode separare values_test.go from values_flex_test.go Currently, we use the same file (values_test.go) for testing and benchmarks. However, our benchmarks compare codec against other (en|de)coders which are stricter and may support only maps with string keys. To support these 2 modes elegantly, we separate value_test.go, to be used by benchmarks, and values_flex_test.go to be used by codec tests. values_test.go will now be used for testing and benchmarks, but values_flex_test.go will only be used for testing. Fix: ensure we encode separator before encoding nil in struct2array mode of encode struct test: expand TestStruc to have fields of struct type with no omitempty. This makes the tests for omitempty more expansive, as the code takes 2 different paths if there are omitempty fields or not. optimize structFieldInfo, struct field lookup and (en|de)coding with no omitempty - Use a fixed array for the structFieldInfo instead of a slice. It uses same amount of memory (roughly) while reducing the array bounds checking. Due to this, we no longer support embedding over 16 levels deep. In practice, this is not a concern. - Also, introduct StructFieldNode which allows us use a cache when iterating a Struct during encoding and decoding, so we don't lookup Value.Field, Value.Elem and Value.IsNil repeatedly. - Determine whether a typeInfo has any omitEmpty fields early, so we can take a fast-path if we no omitEmpty is not a concern. add faster versions of readn3,4 and writen4,5 for use by json codec kInterface MUST take a settable interface value. ensure kMap honors that. decode: can decode into non-nil chan/slice/map/ptr or array, not just non-nil ptr
ugorji · Oct 11, 2017 · 50530d8 · 50530d8
1 parent 54210f4
commit 50530d8
Show file tree

Hide file tree

Showing 30 changed files with 11,965 additions and 12,275 deletions.
diff --git a/codec/0doc.go b/codec/0doc.go
@@ -31,8 +31,13 @@ the standard library (ie json, xml, gob, etc).
 Rich Feature Set includes:
 
   - Simple but extremely powerful and feature-rich API
+  - Support for go1.4 and above, while selectively using newer APIs for later releases
+  - Good code coverage ( > 70% )
   - Very High Performance.
     Our extensive benchmarks show us outperforming Gob, Json, Bson, etc by 2-4X.
+  - Careful selected use of 'unsafe' for targeted performance gains.
+    100% mode exists where 'unsafe' is not used at all.
+  - Lock-free (sans mutex) concurrency for scaling to 100's of cores
   - Multiple conversions:
     Package coerces types where appropriate
     e.g. decode an int in the stream into a float, etc.
@@ -156,40 +161,25 @@ Sample usage model:
     //OR rpcCodec := codec.MsgpackSpecRpc.ClientCodec(conn, h)
     client := rpc.NewClientWithCodec(rpcCodec)
 
+Running Tests
+
+To run tests, use the following:
+
+    go test
+
+To run the full suite of tests, use the following:
+
+    go test -tags alltests -run Suite
+
+You can run the tag 'safe' to run tests or build in safe mode. e.g.
+
+    go test -tags safe -run Json
+    go test -tags "alltests safe" -run Suite
+
+Running Benchmarks
+
+Please see http://github.com/ugorji/go-codec-bench .
+
 */
 package codec
 
-// Benefits of go-codec:
-//
-//    - encoding/json always reads whole file into memory first.
-//      This makes it unsuitable for parsing very large files.
-//    - encoding/xml cannot parse into a map[string]interface{}
-//      I found this out on reading https://github.com/clbanning/mxj
-
-// TODO:
-//
-//   - optimization for codecgen:
-//     if len of entity is <= 3 words, then support a value receiver for encode.
-//   - (En|De)coder should store an error when it occurs.
-//     Until reset, subsequent calls return that error that was stored.
-//     This means that free panics must go away.
-//     All errors must be raised through errorf method.
-//   - Decoding using a chan is good, but incurs concurrency costs.
-//     This is because there's no fast way to use a channel without it
-//     having to switch goroutines constantly.
-//     Callback pattern is still the best. Maybe consider supporting something like:
-//        type X struct {
-//             Name string
-//             Ys []Y
-//             Ys chan <- Y
-//             Ys func(Y) -> call this function for each entry
-//        }
-//    - Consider adding a isZeroer interface { isZero() bool }
-//      It is used within isEmpty, for omitEmpty support.
-//    - Consider making Handle used AS-IS within the encoding/decoding session.
-//      This means that we don't cache Handle information within the (En|De)coder,
-//      except we really need it at Reset(...)
-//    - Consider adding math/big support
-//    - Consider reducing the size of the generated functions:
-//      Maybe use one loop, and put the conditionals in the loop.
-//      for ... { if cLen > 0 { if j == cLen { break } } else if dd.CheckBreak() { break } }
diff --git a/codec/binc.go b/codec/binc.go
@@ -512,7 +512,7 @@ func (d *bincDecDriver) DecodeInt(bitsize uint8) (i int64) {
 		i = -i
 	}
 	if chkOvf.Int(i, bitsize) {
-		d.d.errorf("binc: overflow integer: %v", i)
+		d.d.errorf("binc: overflow integer: %v for num bits: %v", i, bitsize)
 		return
 	}
 	d.bdRead = false
@@ -804,7 +804,7 @@ func (d *bincDecDriver) DecodeNaked() {
 		d.readNextBd()
 	}
 
-	n := &d.d.n
+	n := d.d.n
 	var decodeFurther bool
 
 	switch d.vd {
@@ -923,6 +923,10 @@ func (h *BincHandle) newDecDriver(d *Decoder) decDriver {
 	return &bincDecDriver{d: d, h: h, r: d.r, br: d.bytes}
 }
 
+func (_ *BincHandle) IsBuiltinType(rt uintptr) bool {
+	return rt == timeTypId
+}
+
 func (e *bincEncDriver) reset() {
 	e.w = e.e.w
 	e.s = 0

diff --git a/codec/cbor.go b/codec/cbor.go
@@ -134,12 +134,12 @@ func (e *cborEncDriver) EncodeExt(rv interface{}, xtag uint64, ext Ext, en *Enco
 
 func (e *cborEncDriver) EncodeRawExt(re *RawExt, en *Encoder) {
 	e.encUint(uint64(re.Tag), cborBaseTag)
-	if re.Data != nil {
+	if false && re.Data != nil {
 		en.encode(re.Data)
-	} else if re.Value == nil {
-		e.EncodeNil()
-	} else {
+	} else if re.Value != nil {
 		en.encode(re.Value)
+	} else {
+		e.EncodeNil()
 	}
 }
 
@@ -469,7 +469,7 @@ func (d *cborDecDriver) DecodeNaked() {
 		d.readNextBd()
 	}
 
-	n := &d.d.n
+	n := d.d.n
 	var decodeFurther bool
 
 	switch d.bd {