mus-go is a MUS format serializer. However, due to its minimalist design and a wide range of serialization primitives, it can also be used to implement other binary serialization formats (here is an example where mus-go is utilized to implement Protobuf encoding).
To get started quickly, go to the code generator page.
It is lightning fast, space efficient and well tested.
- Has a streaming version.
- Can run on both 32 and 64-bit systems.
- Variable-length data types (like
string
,slice
, ormap
) are encoded as:length + data
. You can choose binary representation for both of these parts. - Supports data versioning.
- Deserialization may fail with one of the following errors:
ErrOverflow
,ErrNegativeLength
,ErrTooSmallByteSlice
,ErrWrongFormat
. - Can validate and skip data while unmarshalling.
- Supports pointers.
- Can encode data structures such as graphs or linked lists.
- Supports oneof feature.
- Supports private fields.
- Supports out-of-order deserialization.
- Supports zero allocation deserialization.
- mus-go Serializer
- Contents
- cmd-stream-go
- musgen-go
- Benchmarks
- How To Use
- Structs Support
- Arrays Support
- MarshallerMUS Interface
- Generic MarshalMUS Function
- DTM (Data Type Metadata) Support
- Data Versioning
- Marshal/Unmarshal Interfaces (or oneof feature)
- Validation
- Out of Order Deserialization
- Zero Allocation Deserialization
cmd-stream-go allows you to execute commands on the server. cmd-stream-go/MUS is about 3 times faster than gRPC/Protobuf.
Writing mus-go code manually can be tedious and error-prone. Instead, it’s much
better to use a code generator that
automatically produces Marshal
, Unmarshal
, Size
, and Skip
functions for
you. It’s also incredibly easy to use - simply provide a type and call
Generate()
.
- github.com/ymz-ncnk/go-serialization-benchmarks - contains the results of running serializers in different modes.
- github.com/alecthomas/go_serialization_benchmarks
Why did I create another benchmarks?
The existing benchmarks
have some notable issues - try running them several times, and you’ll likely get
inconsistent results, making it difficult to determine which serializer is truly
faster. That was one of the reasons, and basically I made them for my own use.
Don't forget to visit mus-examples-go.
mus-go offers several encoding options, each of which is in a separate package.
Serializes all uint
(uint64
, uint32
, uint16
, uint8
, uint
), int
,
float
, byte
data types using Varint encoding. For example:
package main
import "github.com/mus-format/mus-go/varint"
func main() {
var (
num = 1000
size = varint.SizeInt(num) // The number of bytes required to serialize num.
bs = make([]byte, size)
)
n := varint.MarshalInt(num, bs) // Returns the number of used bytes.
num, n, err := varint.UnmarshalInt(bs) // In addition to the int value and the
// number of used bytes, it may also return an error.
// ...
}
Serializes the same uint
, int
, float
, byte
data types using Raw
encoding. For example:
package main
import "github.com/mus-format/mus-go/raw"
func main() {
var (
num = 1000
size = raw.SizeInt(num)
bs = make([]byte, size)
)
n := raw.MarshalInt(num, bs)
num, n, err := raw.UnmarshalInt(bs)
// ...
}
More details about Varint and Raw encodings can be found in the MUS format specification. If in doubt, use Varint.
Supports the following data types: bool
, string
, slice
, byte slice
,
map
, and pointers.
Variable-length data types (such as string
, slice
, or map
) are encoded as
length + data
. You can select the binary representation for both these parts.
By default, the length is encoded using Varint without ZigZag (...PositiveInt()
functions from the varint package). In this case the maximum length is limited
by the maximum value of the int
type on your system. This is ok for different
architectures - an attempt to unmarshal, for example, too long string on a
32-bit system, will result in ErrOverflow
.
Let's examine how the slice type is serialized:
package main
import (
"github.com/mus-format/mus-go"
"github.com/mus-format/mus-go/ord"
"github.com/mus-format/mus-go/varint"
)
func main() {
var (
sl = []int{1, 2, 3, 4, 5}
lenM mus.Marshaller // Length marshaller, if nil varint.MarshalPositiveInt() is used.
lenU mus.Unmarshaller // Length unmarshaller, if nil varint.UnmarshalPositiveInt() is used.
m mus.MarshallerFn[int] = varint.MarshalInt // Implementation of the
// mus.Marshaller interface for slice elements.
u mus.UnmarshallerFn[int] = varint.UnmarshalInt // Implementation of the
// mus.Unmarshaller interface for slice elements.
s mus.SizerFn[int] = varint.SizeInt // Implementation of the mus.Sizer
// interface for slice elements.
size = ord.SizeSlice[int](sl, s)
bs = make([]byte, size)
)
n := ord.MarshalSlice[int](sl, lenM, m, bs)
sl, n, err := ord.UnmarshalSlice[int](lenU, u, bs)
// ...
}
Maps are serialized in the same way.
unsafe package provides maximum performance, but be careful it uses an unsafe type conversion.
This warning largely applies to the string type - modifying the byte slice after unmarshalling will also change the string’s contents. Here is an example that demonstrates this behavior more clearly.
Supports the following data types: bool
, string
, byte slice
, byte
, and
all uint
, int
, float
.
Let's consider the following struct:
type TwoPtr struct {
ptr1 *string
ptr2 *string
}
With the ord
package after unmarshal twoPtr.ptr1 != twoPtr.ptr2
, and with
the pm
package, they will be equal. This feature allows to serialize data
structures such as graphs or linked lists. Corresponding examples can be found
at mus-examples-go.
In fact, mus-go does not support structural data types, which means that you will
have to implement the mus.Marshaller
, mus.Unmarshaller
, mus.Sizer
and
mus.Skipper
interfaces yourself. But it's not difficult at all, for example:
package main
import (
"github.com/mus-format/mus-go/ord"
"github.com/mus-format/mus-go/varint"
)
type Foo struct {
a int
b bool
c string
}
// MarshalFoo implements the mus.Marshaller interface.
func MarshalFoo(v Foo, bs []byte) (n int) {
n = varint.MarshalInt(v.a, bs)
n += ord.MarshalBool(v.b, bs[n:])
return n + ord.MarshalString(v.c, nil, bs[n:])
}
// UnmarshalFoo implements the mus.Unmarshaller interface.
func UnmarshalFoo(bs []byte) (v Foo, n int, err error) {
v.a, n, err = varint.UnmarshalInt(bs)
if err != nil {
return
}
var n1 int
v.b, n1, err = ord.UnmarshalBool(bs[n:])
n += n1
if err != nil {
return
}
v.c, n1, err = ord.UnmarshalString(nil, bs[n:])
n += n1
return
}
// SizeFoo implements the mus.Sizer interface.
func SizeFoo(v Foo) (size int) {
size += varint.SizeInt(v.a)
size += ord.SizeBool(v.b)
return size + ord.SizeString(v.c, nil)
}
// SkipFoo implements the mus.Skipper interface.
func SkipFoo(bs []byte) (n int, err error) {
n, err = varint.SkipInt(bs)
if err != nil {
return
}
var n1 int
n1, err = ord.SkipBool(bs[n:])
n += n1
if err != nil {
return
}
n1, err = ord.SkipString(nil, bs[n:])
n += n1
return
}
All you have to do is deconstruct the structure into simpler data types and choose the desired encoding for each. Of course, this requires some effort. But, firstly, this code can be generated, secondly, this approach provides more flexibility, and thirdly, mus-go remains quite simple, which makes it easy to implement for other programming languages.
Unfortunately, Golang does not support generic parameterization of array sizes.
Therefore, to serialize an array - make a slice of it and use the ord
package.
Or, for better performance, implement the necessary Marshal
, Unmarshal
, ...
functions, as done in the ord/slice.go file.
It is often convenient to define the MarshallerMUS
interface:
type MarshallerMUS interface {
MarshalMUS(bs []byte) (n int)
SizeMUS() (size int)
}
// Foo implements the MarshallerMUS interface.
type Foo struct {...}
func (f Foo) MarshalMUS(bs []byte) (n int) {
return MarshalFooMUS(f, bs) // or FooDTS.Marshal(f, bs)
}
func (f Foo) SizeMUS() (size int) {
return SizeFooMUS(f) // or FooDTS.Size(f)
}
...
To define generic MarshalMUS
function:
package main
// Define MarshallerMUS interface ...
type MarshallerMUS interface {
MarshalMUS(bs []byte) (n int)
SizeMUS() (size int)
}
// ... and the function itself.
func MarshalMUS(v MarshallerMUS) (bs []byte) {
bs = make([]byte, v.SizeMUS())
v.MarshalMUS(bs)
return
}
// Define a structure that implements the MarshallerMUS interface.
type Foo struct {...}
...
func main() {
// Now the generic MarshalMUS function can be used like this.
bs := MarshalMUS(Foo{...})
// ...
}
The full code can be found here.
mus-dts-go provides DTM support.
mus-dts-go can be used to implement data versioning. Here is an example.
You should read the mus-dts-go documentation first.
A simple example:
// Interface to Marshal/Unmarshal.
type Instruction interface {...}
// Copy implements the Instruction and MarshallerMUS interfaces.
type Copy struct {...}
// Insert implements the Instruction and MarshallerMUS interfaces.
type Insert struct {...}
var (
CopyDTS = ...
InsertDTS = ...
)
func MarshalInstruction(instr Instruction, bs []byte) (n int) {
if m, ok := instr.(MarshallerMUS); ok {
return m.MarshalMUS(bs)
}
panic("instr doesn't implement the MarshallerMUS interface")
}
func UnmarshalInstruction(bs []byte) (instr Instruction, n int, err error) {
dtm, n, err := dts.UnmarshalDTM(bs)
if err != nil {
return
}
switch dtm {
case CopyDTM:
return CopyDTS.UnmarshalData(bs[n:])
case InsertDTM:
return InsertDTS.UnmarshalData(bs[n:])
default:
err = ErrUnexpectedDTM
return
}
}
func SizeInstruction(instr Instruction) (size int) {
if s, ok := instr.(MarshallerMUS); ok {
return s.SizeMUS()
}
panic("instr doesn't implement the MarshallerMUS interface")
}
A full example can be found at mus-examples-go. Take a note, nothing will stop us to Marshal/Unmarshal, for example, a slice of interfaces.
Validation is performed during deserialization.
With the ord.UnmarshalValidString()
function, you can validate the length of
a string.
package main
import (
"errors"
com "github.com/mus-format/common-go"
"github.com/mus-format/mus-go/ord"
)
func main() {
var (
ErrTooLongString = errors.New("too long string")
lenU mus.Unmarshaller // Length unmarshaller, if nil varint.UnmarshalPositiveInt() is used.
lenVl com.ValidatorFn[int] = func(length int) (err error) { // Length validator.
if length > 10 {
err = ErrTooLongString
}
return
}
skip = true // If true and the encoded string does not meet the requirements
// of the validator, all bytes belonging to it will be skipped - n will be
// equal to SizeString(str).
)
// ...
str, n, err := ord.UnmarshalValidString(lenU, lenVl, skip, bs)
// ...
}
With the ord.UnmarshalValidSlice()
function, you can validate the length and
elements of a slice. Also it provides an option to skip the rest of the data
if one of the validators returns an error.
package main
import (
"errors"
com "github.com/mus-format/common-go"
"github.com/mus-format/mus-go"
"github.com/mus-format/mus-go/ord"
"github.com/mus-format/mus-go/varint"
)
func main() {
var (
ErrTooLongSlice = errors.New("too long slice")
ErrTooBigSliceElem = errors.New("too big slice elem")
lenU mus.Unmarshaller // Length unmarshaller, if nil varint.UnmarshalPositiveInt() is used.
lenVl com.ValidatorFn[int] = func(length int) (err error) { // Length validator.
if length > 5 {
err = ErrTooLongSlice
}
return
}
u = mus.UnmarshallerFn[int](varint.UnmarshalInt)
vl com.ValidatorFn[int] = func(e int) (err error) { // Elements validator.
if e > 10 {
err = ErrTooBigSliceElem
}
return
}
sk = mus.SkipperFn(varint.SkipInt) // If nil, a validation
// error will be returned immediately. If != nil and one of the validators
// returns an error, will be used to skip the rest of the slice.
)
// ...
sl, n, err := ord.UnmarshalValidSlice[int](lenU, lenVl, u, vl, sk, bs)
// ...
}
Validation works in the same way as for the slice type.
Unmarshalling an invalid structure can stop at the first invalid field with a validation error.
package main
import (
"errors"
com "github.com/mus-format/common-go"
"github.com/mus-format/mus-go/ord"
"github.com/mus-format/mus-go/varint"
)
func UnmarshalValidFoo(vl com.Validator[int], bs []byte) (v Foo, n int, err error) {
// Unmarshal the first field.
v.a, n, err = varint.UnmarshalInt(bs)
if err != nil {
return
}
// Validate the first field. There is no need to deserialize the entire
// structure to find out that it is invalid.
if err = vl.Validate(v.a); err != nil {
err = fmt.Errorf("incorrect field 'a': %w", err)
return // The rest of the structure remains unmarshaled.
}
// ...
}
// vl can be used to check Foo.a field.
var vl com.ValidatorFn[int] = func(n int) (err error) {
if n > 10 {
return errors.New("bigger than 10")
}
return
}
A simple example:
package main
import (
"fmt"
"github.com/mus-format/mus-go/varint"
)
func main() {
// Encode three numbers in turn - 5, 10, 15.
bs := make([]byte, varint.SizeInt(5)+varint.SizeInt(10)+varint.SizeInt(15))
n1 := varint.MarshalInt(5, bs)
n2 := varint.MarshalInt(10, bs[n1:])
varint.MarshalInt(15, bs[n1+n2:])
// Get them back in the opposite direction. Errors are omitted for simplicity.
n1, _ = varint.SkipInt(bs)
n2, _ = varint.SkipInt(bs)
num, _, _ := varint.UnmarshalInt(bs[n1+n2:])
fmt.Println(num)
num, _, _ = varint.UnmarshalInt(bs[n1:])
fmt.Println(num)
num, _, _ = varint.UnmarshalInt(bs)
fmt.Println(num)
// The output will be:
// 15
// 10
// 5
}
Can be achieved using the unsafe package.