feat(examples): add package cford32, add method seqid.ID.String (#1572)

This PR adds a new package to examples, `cford32`, meant primarily to be
used in package `seqid` to produce AVL- and human-friendly IDs. It
implements the base32 encoding scheme [specified
by Douglas Crockford](https://www.crockford.com/base32.html). It
additionally implements a `uint64` encoding scheme I created, which
encodes "tiny" (< 2^34, roughly 17 billion) values as 7-byte strings, and
can encode the full `uint64` range with 13 bytes.

The package is largely a fork of Go's `encoding/base32`. Forking keeps the
API very familiar, and it is needed to implement some distinctive
features of the encoding (like case insensitivity, and decoding all of
the symbols `l L i I 1` to the same value).

The necessity of this package comes from a solution that I implemented
in GnoChess:


https://github.com/gnolang/gnochess/blob/9aa813fbb86fec377a85fc4528411d652fc780ff/realm/chess.gno#L286-L295

Essentially, GnoChess used simple sequential IDs for its saved entities
(like games). To work well with AVL's sorted keys, it padded the
generated strings to the left with up to 9 zeroes. This, of course,
breaks for values `>= 1e10` (10 billion), as `("2" + "000000000") >
("10" + "000000000")`.
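
The ordering problem can be reproduced with a small sketch. `padID` below is a hypothetical stand-in for the GnoChess approach (it is not the code at the link above), and it assumes a fixed width of 10 characters, i.e. up to 9 zeroes of padding.

```go
package main

import (
	"fmt"
	"strconv"
)

// padID is a hypothetical helper that left-pads the decimal form of id with
// zeroes to a width of 10 characters (an assumption for illustration).
func padID(id uint64) string {
	const width = 10
	s := strconv.FormatUint(id, 10)
	for len(s) < width {
		s = "0" + s
	}
	return s
}

func main() {
	small, big := padID(2_000_000_000), padID(10_000_000_000)
	// The numerically smaller ID sorts after the larger one, because the
	// larger one has 11 digits and string comparison goes byte by byte.
	fmt.Println(small, big, small > big) // prints "2000000000 10000000000 true"
}
```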
thehowl authored Feb 22, 2024
1 parent 0b39151 commit 16c7c2e
Showing 9 changed files with 1,501 additions and 5 deletions.
27 changes: 27 additions & 0 deletions examples/gno.land/p/demo/cford32/LICENSE
@@ -0,0 +1,27 @@
Copyright (c) 2009 The Go Authors. All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:

* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following disclaimer
in the documentation and/or other materials provided with the
distribution.
* Neither the name of Google Inc. nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
76 changes: 76 additions & 0 deletions examples/gno.land/p/demo/cford32/README.md
@@ -0,0 +1,76 @@
# cford32

```
package cford32 // import "gno.land/p/demo/cford32"

Package cford32 implements a base32-like encoding/decoding package, with the
encoding scheme specified by Douglas Crockford.

From the website, the requirements of said encoding scheme are to:

  - Be human readable and machine readable.
  - Be compact. Humans have difficulty in manipulating long strings of arbitrary
    symbols.
  - Be error resistant. Entering the symbols must not require keyboarding
    gymnastics.
  - Be pronounceable. Humans should be able to accurately transmit the symbols
    to other humans using a telephone.

This differs from Go's stdlib `encoding/base32` in more than the encoding
table: when decoding, the characters i I l L are parsed as 1, and o O are
parsed as 0.

This package additionally provides ways to encode uint64's efficiently, as well
as efficient encoding to a lowercase variation of the encoding. The encodings
never use padding.

# Uint64 Encoding

Aside from lower/uppercase encoding, there is a compact encoding, which can
encode all values in [0,2^34), and the full encoding, which can encode all
values in [0,2^64). The compact encoding uses 7 characters, and the full
encoding uses 13 characters. Both are parsed unambiguously by the Uint64
decoder.

The compact encodings have the first character between ['0','f'], while the
full encoding's first character ranges between ['g','z']. Practically, in your
usage of the package, you should consider which one to use and stick with it,
while considering that the compact encoding, once it reaches 2^34, automatically
switches to the full encoding. The properties of the generated strings are still
maintained: for instance, for any two uint64s x,y consistently encoded with the
compact encoding, if x < y numerically, then the encoding of x also precedes
the encoding of y in lexical ordering. However, values in [0,2^34) have a
"double encoding" (compact and full); if the two forms are mixed together, the
lexical ordering property is lost.

The Uint64 encoding is most useful for generating string versions of Uint64 IDs.
Practically, it allows you to retain sleek and compact IDs for your application
for the first 2^34 (>17 billion) entities, while seamlessly rolling over to the
full encoding should you exceed that. You are encouraged to use it unless you
have a requirement or preference for IDs always being the same size.

To use the cford32 encoding for IDs, you may want to consider using package
gno.land/p/demo/seqid.

[specified by Douglas Crockford]: https://www.crockford.com/base32.html

func AppendCompact(id uint64, b []byte) []byte
func AppendDecode(dst, src []byte) ([]byte, error)
func AppendEncode(dst, src []byte) []byte
func AppendEncodeLower(dst, src []byte) []byte
func Decode(dst, src []byte) (n int, err error)
func DecodeString(s string) ([]byte, error)
func DecodedLen(n int) int
func Encode(dst, src []byte)
func EncodeLower(dst, src []byte)
func EncodeToString(src []byte) string
func EncodeToStringLower(src []byte) string
func EncodedLen(n int) int
func NewDecoder(r io.Reader) io.Reader
func NewEncoder(w io.Writer) io.WriteCloser
func NewEncoderLower(w io.Writer) io.WriteCloser
func PutCompact(id uint64) []byte
func PutUint64(id uint64) [13]byte
func PutUint64Lower(id uint64) [13]byte
func Uint64(b []byte) (uint64, error)
type CorruptInputError int64
```
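
The compact and full uint64 encodings described in the README can be sketched as follows. This is an independent illustration inferred from the README text (7 characters with a first character in `0`..`f`, 13 characters with a first character in `g`..`z`), not the package's implementation; `lowerAlphabet`, `encodeCompact` and `encodeFull` are hypothetical names, and the assumption that the full form sets the high bit of its first symbol is derived from the stated character ranges.

```go
package main

import "fmt"

// lowerAlphabet is the lowercase Crockford alphabet (assumed here because the
// README describes the first-character ranges with lowercase letters).
const lowerAlphabet = "0123456789abcdefghjkmnpqrstvwxyz"

// encodeFull is a hypothetical sketch of the 13-character form. The last 12
// symbols carry the low 60 bits; the first symbol carries the top 4 bits with
// its high bit set, so it always falls in 'g'..'z'.
func encodeFull(id uint64) string {
	var b [13]byte
	for i := 12; i >= 1; i-- {
		b[i] = lowerAlphabet[id&31]
		id >>= 5
	}
	b[0] = lowerAlphabet[16|id] // id now holds only the top 4 bits
	return string(b[:])
}

// encodeCompact is a hypothetical sketch of the 7-character form. Values
// below 2^34 fit in 4 + 6*5 bits, so the first symbol falls in '0'..'f';
// larger values fall back to the full form.
func encodeCompact(id uint64) string {
	if id >= 1<<34 {
		return encodeFull(id)
	}
	var b [7]byte
	for i := 6; i >= 0; i-- {
		b[i] = lowerAlphabet[id&31]
		id >>= 5
	}
	return string(b[:])
}

func main() {
	for _, id := range []uint64{0, 1<<34 - 1, 1 << 34, 1<<64 - 1} {
		fmt.Println(id, encodeCompact(id))
	}
	// Mixing the two forms for small values breaks lexical ordering:
	// 1 < 2 numerically, but the full form of 1 sorts after the compact form of 2.
	fmt.Println(encodeFull(1) > encodeCompact(2)) // prints "true"
}
```

Because digits sort before letters in ASCII and the alphabet is in ascending order, strings produced consistently by `encodeCompact` compare in the same order as the numbers they encode, including across the rollover at 2^34.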