Skip to content

Commit

Permalink
codec: Multiple fixes, enhancements and test expansion across the board.
Browse files Browse the repository at this point in the history
Details below ...

Expanded Tests across the board, including but not limited to:
    - non-string keys
    - chan
    - swallow (decode into equivalent of /dev/null)
    - zero
    - raw ext
    - map keys that are structs
    - canonical
    - decoding nil map value
    - ...

Fixed issues:
    - expand set of options used in testing
    - use bitset(256|128) instead of a large array to test whether bits are set
    - allow decodeValue intelligently handle whether a function requires a pointer or not.
      This way, folks don't have to call Addr themselves.
    - use bounded arrays for the cache used during decodeNaked
    - decNaked: use an array of reflect.Value to pre-load all decNaked stuff
    - everyone calls into encodeValue or decodeValue (removing all cruft)
    - support recognizedRtid behind a flag
    - separate setting a value to its zero value, from decode.
    - remove many helpers (e.g. predecodevalue): now just a few methods do the right things
    - cleanup fast path: no tryNil argument needed, and the arg is either a pointer or a base value.
    - support arrays well in codecgen
    - full support for io.Reader passed to decode
    - msgpack: full support for strings read from a []byte

 codec: collection of correctness and simplification changes

    Move decNaked out of the Decoder struct. The Decoder struct was too big, and decNaked is only used
    if decoding into an empty interface{}. So make it an explicit request to grab a decNaked if doing DecodeNaked.

    Use a sync.Pool so that the decNaked's can be shared across multiple decodings.
    ----
    Add SliceElementReset as a decode option, to align with MapValueReset and InterfaceReset.

    This eliminates the resetSliceElemToZeroValue constant.
    ----
    Clean out comments and commented code blocks.
    ----
    Move the code for getting a codec function into a single place. Previously, there was much
    duplication of the same logic in encode.go and decode.go. Now, we put a single type "codecFner"
    that is a member element of Decoder and Encoder.

    This means that getEncFn and getDecFn go away, and fast-path looks different, because the definition
    of encfn and decfn have changed.

    This will allow us move the code to be a member of the Handle later on, which should be where it
    belongs.
    ----
    Use a single pool struct to consolidate how pools are used in the codebase,
    for decNaked, encStruct, etc.
    ----
    Reduce bounds-checking in json.go, by looking up an index in the string just once,
    and setting into a char variable. Helped shave much time from json decoding.
    ----

codec: add error field to (De|En)coder to flag such as unusable until reset

json: differentiate empty vs nil slice when encoding/decoding

clean up encoding of chans

codecgen: add support for NoExtensions

    Extensions support causes the receiver of CodecDecodeSelf to escape,
    which can have dramatic performance implications.

    Since extensions is a seldom used feature, it is not fair for everyone
    to pay the price.

add String method to valueType, for debugging

msgpack: support DecodeBytes where valueTypeBytes comes from a msgpack string container

cbor: fix extension encoding - only encode the Value component of RawExt

support new Decode Option: DeleteOnNilMapValue
    - DeleteOnNilMapValue controls how to decode a nil value in the stream.
    - add a test

Ensure T or ... *****T are recognized as Selfer during encode or decode
    - to do this, always pass checkSelfer=true when getting the function for encode/decode

separare values_test.go from values_flex_test.go

    Currently, we use the same file (values_test.go) for testing and benchmarks.

    However, our benchmarks compare codec against other (en|de)coders which are
    stricter and may support only maps with string keys.

    To support these 2 modes elegantly, we separate value_test.go, to be used by
    benchmarks, and values_flex_test.go to be used by codec tests.

    values_test.go will now be used for testing and benchmarks, but values_flex_test.go
    will only be used for testing.

Fix: ensure we encode separator before encoding nil in struct2array mode of encode struct

test: expand TestStruc to have fields of struct type with no omitempty.

    This makes the tests for omitempty more expansive, as the code takes 2 different
    paths if there are omitempty fields or not.

optimize structFieldInfo, struct field lookup and (en|de)coding with no omitempty

    - Use a fixed array for the structFieldInfo instead of a slice.
      It uses same amount of memory (roughly) while reducing the array bounds checking.
      Due to this, we no longer support embedding over 16 levels deep.
      In practice, this is not a concern.
    - Also, introduct StructFieldNode which allows us use a cache
      when iterating a Struct during encoding and decoding,
      so we don't lookup Value.Field, Value.Elem and Value.IsNil repeatedly.
    - Determine whether a typeInfo has any omitEmpty fields early, so we
      can take a fast-path if we no omitEmpty is not a concern.

add faster versions of readn3,4 and writen4,5 for use by json codec

kInterface MUST take a settable interface value. ensure kMap honors that.

decode: can decode into non-nil chan/slice/map/ptr or array, not just non-nil ptr
  • Loading branch information
ugorji committed Oct 11, 2017
1 parent 54210f4 commit 50530d8
Show file tree
Hide file tree
Showing 30 changed files with 11,965 additions and 12,275 deletions.
58 changes: 24 additions & 34 deletions codec/0doc.go
Original file line number Diff line number Diff line change
Expand Up @@ -31,8 +31,13 @@ the standard library (ie json, xml, gob, etc).
Rich Feature Set includes:
- Simple but extremely powerful and feature-rich API
- Support for go1.4 and above, while selectively using newer APIs for later releases
- Good code coverage ( > 70% )
- Very High Performance.
Our extensive benchmarks show us outperforming Gob, Json, Bson, etc by 2-4X.
- Careful selected use of 'unsafe' for targeted performance gains.
100% mode exists where 'unsafe' is not used at all.
- Lock-free (sans mutex) concurrency for scaling to 100's of cores
- Multiple conversions:
Package coerces types where appropriate
e.g. decode an int in the stream into a float, etc.
Expand Down Expand Up @@ -156,40 +161,25 @@ Sample usage model:
//OR rpcCodec := codec.MsgpackSpecRpc.ClientCodec(conn, h)
client := rpc.NewClientWithCodec(rpcCodec)
Running Tests
To run tests, use the following:
go test
To run the full suite of tests, use the following:
go test -tags alltests -run Suite
You can run the tag 'safe' to run tests or build in safe mode. e.g.
go test -tags safe -run Json
go test -tags "alltests safe" -run Suite
Running Benchmarks
Please see http://github.com/ugorji/go-codec-bench .
*/
package codec

// Benefits of go-codec:
//
// - encoding/json always reads whole file into memory first.
// This makes it unsuitable for parsing very large files.
// - encoding/xml cannot parse into a map[string]interface{}
// I found this out on reading https://github.com/clbanning/mxj

// TODO:
//
// - optimization for codecgen:
// if len of entity is <= 3 words, then support a value receiver for encode.
// - (En|De)coder should store an error when it occurs.
// Until reset, subsequent calls return that error that was stored.
// This means that free panics must go away.
// All errors must be raised through errorf method.
// - Decoding using a chan is good, but incurs concurrency costs.
// This is because there's no fast way to use a channel without it
// having to switch goroutines constantly.
// Callback pattern is still the best. Maybe consider supporting something like:
// type X struct {
// Name string
// Ys []Y
// Ys chan <- Y
// Ys func(Y) -> call this function for each entry
// }
// - Consider adding a isZeroer interface { isZero() bool }
// It is used within isEmpty, for omitEmpty support.
// - Consider making Handle used AS-IS within the encoding/decoding session.
// This means that we don't cache Handle information within the (En|De)coder,
// except we really need it at Reset(...)
// - Consider adding math/big support
// - Consider reducing the size of the generated functions:
// Maybe use one loop, and put the conditionals in the loop.
// for ... { if cLen > 0 { if j == cLen { break } } else if dd.CheckBreak() { break } }
8 changes: 6 additions & 2 deletions codec/binc.go
Original file line number Diff line number Diff line change
Expand Up @@ -512,7 +512,7 @@ func (d *bincDecDriver) DecodeInt(bitsize uint8) (i int64) {
i = -i
}
if chkOvf.Int(i, bitsize) {
d.d.errorf("binc: overflow integer: %v", i)
d.d.errorf("binc: overflow integer: %v for num bits: %v", i, bitsize)
return
}
d.bdRead = false
Expand Down Expand Up @@ -804,7 +804,7 @@ func (d *bincDecDriver) DecodeNaked() {
d.readNextBd()
}

n := &d.d.n
n := d.d.n
var decodeFurther bool

switch d.vd {
Expand Down Expand Up @@ -923,6 +923,10 @@ func (h *BincHandle) newDecDriver(d *Decoder) decDriver {
return &bincDecDriver{d: d, h: h, r: d.r, br: d.bytes}
}

func (_ *BincHandle) IsBuiltinType(rt uintptr) bool {
return rt == timeTypId
}

func (e *bincEncDriver) reset() {
e.w = e.e.w
e.s = 0
Expand Down
10 changes: 5 additions & 5 deletions codec/cbor.go
Original file line number Diff line number Diff line change
Expand Up @@ -134,12 +134,12 @@ func (e *cborEncDriver) EncodeExt(rv interface{}, xtag uint64, ext Ext, en *Enco

func (e *cborEncDriver) EncodeRawExt(re *RawExt, en *Encoder) {
e.encUint(uint64(re.Tag), cborBaseTag)
if re.Data != nil {
if false && re.Data != nil {
en.encode(re.Data)
} else if re.Value == nil {
e.EncodeNil()
} else {
} else if re.Value != nil {
en.encode(re.Value)
} else {
e.EncodeNil()
}
}

Expand Down Expand Up @@ -469,7 +469,7 @@ func (d *cborDecDriver) DecodeNaked() {
d.readNextBd()
}

n := &d.d.n
n := d.d.n
var decodeFurther bool

switch d.bd {
Expand Down
Loading

0 comments on commit 50530d8

Please sign in to comment.