Arbitrary alphabet and radix support #2

anitgandhi · 2017-06-29T22:09:46Z

Description of Enhancement

Expanding on #1 , in theory, this package should support arbitrary alphabets as well. It will require either major changes to math/big or a new sub-package.

This may or may not have to include non-ASCII characters.

Further, there is a chance it will require a breaking change in the interface, if this alphabet input is added to NewCipher. Alternatively, a new function on Cipher called SetAlphabet could be added.

func NewCipher(radix int, maxTLen int, key []byte, tweak []byte) (Cipher, error) {
	...
	// defaults
    f.alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    f.radix = 62
}

func (f Cipher) SetAlphabet(alphabet string, radix int) {
	// error checking on alphabet and radix
	// ex: alphabet can't be "12345" and radix be 9
	...

    f.alphabet = alphabet
    f.radix = radix
}

The text was updated successfully, but these errors were encountered:

fmpwizard · 2017-07-17T03:41:36Z

Two questions:

Did you already submit/propose the changes to the Go project? (have a link?)
I have been looking for something like this for a while now, but the string I have to encrypt, besides having 0-9, a-z, A-Z, also could have things like / % & and a few other characters, would they fit into the idea behind this ticket of alternative alphabets ?

Thanks a lot for making this project public!

anitgandhi · 2017-07-17T13:47:13Z

I haven't yet, thanks for the reminder 👍
Yup the arbitrary alphabet support would be perfect for things like / % &. I haven't gotten a chance to take a deeper dive into how it would work in detail; in one sense it's easiest to convince Golang maintainers to add this support, but on the flip side that would require the newest version of Golang, rather than being self-contained.

fmpwizard · 2017-07-17T14:08:38Z

It would be great if the change wouldn't require using the latest Go version (specially using tip), but have that as a planned migration. What I mean is, add it as a subpackage, so this can be used soon, and from the start making it clear that once Go updates math/big , then the subpackage won't be needed any more

anitgandhi · 2017-07-17T17:00:28Z

Agreed - I'd prefer it to not be dependent on the Go version.

I like the migration option long term, but as it is that's dependent on upstream Go.

I tested the base62 support ( #1 ) in a branch by copying the standard math/big into fpe/big and making the changes there, and it worked fine. However, I'm a little uncomfortable importing an entire existing library as a sub-package like that and then modifying; potential license conflicts and other issues.

Further, math/big has a lot of other functions that aren't relevant to this package.

I think for now what I will do is use only the parts of math/big and potentially re-write as well. This may provide the opportunity for further performance optimizations as well :)

anitgandhi · 2017-08-29T19:16:48Z

@fmpwizard discussion is here: golang/go#21558

fmpwizard · 2017-10-07T02:53:57Z

I see that the linked ticket was closed, but they didn't go with the change that would allow arbitrary alphabets, right?

anitgandhi · 2017-10-10T20:21:52Z

@fmpwizard that is correct, they added base62 support which basically fixes #1 , which is nice; but not full arbitrary alphabets. The Go team's comments on the linked issue explains the reasoning behind that.

I do still want to add arbitrary alphabet support, especially seeing your use case above. I may just have to add separate func (z *big.Int) TextWithCharset(charset string) and func (z *Int) SetStringWithCharset(s string, charset string) that only deal with fpe related use-cases. As the Go team mentioned, it would introduce complexities with encoding/normalization that is beyond the scope of the current need. It may end up being as simple as copying the old internal digits function from <Go1.5 and modifying it appropriately; it may not be.

If you have any thoughts on how to do this effectively, please let me know!

fmpwizard · 2017-10-10T20:34:13Z

@anitgandhi thanks for following up on this topic. I can more or less follow the ideas around implementing this, but not enough to propose an actual changeset, let alone doing it effectively.

fmpwizard · 2018-03-12T13:03:56Z

friendly ping

anitgandhi · 2018-04-24T01:27:04Z

Sadly I haven't had the bandwidth to work on this issue 😢

As a note to my future self (or anyone who wants to take this up), implementing this will have to limit to ASCII strings so the []byte <--> string conversions will hold up. UTF-8 and other aspects would complicate it.

fmpwizard · 2018-04-24T03:56:23Z

I was looking at implementing this but had a question:

[...] digits function from <Go1.5

I pulled the go repo, checked out go1.4.3 and I didn't find any digits func, I'm guessing that wasn't the name I should look for, right?

anitgandhi · 2018-04-24T14:17:00Z

Whoops, it was called string: https://github.com/golang/go/blob/release-branch.go1.4/src/math/big/nat.go#L737

That code already has a check for effectively ensuring ASCII/base256 which should get the job done.

fmpwizard · 2018-04-24T14:22:55Z

Thanks! I'll take a look and post back with results

UzairM · 2018-04-27T14:08:53Z

any update on this

fmpwizard · 2018-04-27T15:02:14Z

I think I finally figure out what I'm supposed to do here, should have some time to work on this over the weekend

fmpwizard · 2018-05-08T23:39:03Z

I went the path of copying the string unexported function from the math/big package, but that function depended on a chain of functions that pretty much meant copying/pasting a large portion of golang's math/big (including some assembly code) . @anitgandhi you went this path before:

I tested the base62 support ( #1 ) in a branch by copying the standard math/big into fpe/big and making the changes there, and it worked fine. However, I'm a little uncomfortable importing an entire existing library as a sub-package like that and then modifying; potential license conflicts and other issues.

Is the license issue something that capitalone's lawyers may bring up or are you worried about the golang side? if the golang side, I can ask on the mailing list, you'll know best from the capitalone side.

From my reading of the math/big code, it looks pretty well optimized already or at least I don't think I could come up with something better than what's there already (where I could say, this is my original code)

anitgandhi · 2018-05-09T23:44:22Z

In hindsight, the licensing aspect is probably more trivial as compared to the issue of maintaining the separate copy from the upstream. Personally I don't like it when things get out of sync like that.

What about a character substitution model? I suppose it would still be limited to base62 but a utility that takes an input and output alphabet, and just does a straight character substitution could be a first step.

fmpwizard · 2018-05-09T23:54:47Z

the out of sync is a real issue, I was worried about how to make sure we would port any bug fixes, etc, it's possible, but at times may not be the best way to use time.

The character substitution sounds interesting.

With the base62 model, we can now support

0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ

in my particular use case, I can limit it to either all upper or lowercase letter, plus 0-9 and then ideally these:

,./;'[]\=-<>?:"{}|+_)(*&^%$#@!`~)

if I urlencode the string I get from my source, mixed with only really using 0-9 and A-B, base62 would be enough I think

Is this what you had in mind?

anitgandhi · 2018-05-10T00:20:21Z

Also iirc in the 1.9 release, math/big started using some math/bits which I think uses internal/cpu . Might be wrong on that tho

And yes thats exactly what I was thinking. For every n characters you don't use out of base62, you get a different n "special" chars.

Which I suppose doesn't even have to be in this package since it's just a character map

martinwaite · 2018-12-15T11:27:37Z

Another take on this is not get involved in the alphabet at all.

Instead, define fpe encrypt as a function that takes an array of uint16 - so supporting radix 1..65536.

Instead of using Int.SetString() to build the binary string, use the NUMradix(X) function defined in https://nvlpubs.nist.gov/nistpubs/specialpublications/nist.sp.800-38g.pdf

Something like:

func num(s []uint16, radix uint64) (*big.Int, *big.Int, error) {
        var big_radix, max, bv, x big.Int
        if radix > 65536 {
                return nil, nil, fmt.Errorf("Radix (%d) too big: max supported radix is 65536", radix)
        }

        maxv := uint16(radix - 1)
        big_radix.SetUint64(uint64(radix))
        max.SetInt64(1)

        for i, v := range s {
                if v > maxv {
                        return nil, nil, fmt.Errorf("Value at %d out of range: got %d - expected 0..%d", i, v, maxv)
                }
                bv.SetUint64(uint64(v))
                x.Mul(&x, &big_radix)
                x.Add(&x, &bv)
                max.Mul(&max, &big_radix)
        }
        return &x, &max, nil
}

This example also calculates max which is radix to power of len(s), for use later in encrypt to compute the modulus.

A corresponding function is required to convert the encryption result back to an array of uint16. The NIST paper defines a STRradix function to do that.

With these encryption interfaces in place, it is trivial to convert strings in fairly large alphabets (65536 members) into arrays of uint16 and back again.

I'll prepare a PR when I get time for you look at.

anitgandhi · 2018-12-17T16:06:54Z

@martinwaite interesting, that looks like a good way to solve the problem. I'd be happy to review and merge a PR whenever. Since your solution still uses the math/big package we still get the advantages of any optimizations made in that standard library, so that's excellent.

Really my only conditions for the PR is there isn't a significant performance regression, but we can tackle that as issues come up 👍

anitgandhi added the enhancement label Jun 29, 2017

anitgandhi added this to the Future milestone Jun 29, 2017

anitgandhi self-assigned this Jun 29, 2017

anitgandhi changed the title ~~Arbitrary alphabet support~~ Arbitrary alphabet and radix support Jun 29, 2017

anitgandhi mentioned this issue Aug 22, 2017

proposal: math/big: base62 and/or arbitrary alphabet and base support golang/go#21558

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Arbitrary alphabet and radix support #2

Arbitrary alphabet and radix support #2

anitgandhi commented Jun 29, 2017 •

edited

Loading

fmpwizard commented Jul 17, 2017

anitgandhi commented Jul 17, 2017

fmpwizard commented Jul 17, 2017

anitgandhi commented Jul 17, 2017

anitgandhi commented Aug 29, 2017

fmpwizard commented Oct 7, 2017

anitgandhi commented Oct 10, 2017

fmpwizard commented Oct 10, 2017

fmpwizard commented Mar 12, 2018

anitgandhi commented Apr 24, 2018

fmpwizard commented Apr 24, 2018

anitgandhi commented Apr 24, 2018

fmpwizard commented Apr 24, 2018

UzairM commented Apr 27, 2018

fmpwizard commented Apr 27, 2018

fmpwizard commented May 8, 2018

anitgandhi commented May 9, 2018

fmpwizard commented May 9, 2018 •

edited

Loading

anitgandhi commented May 10, 2018

martinwaite commented Dec 15, 2018 •

edited

Loading

anitgandhi commented Dec 17, 2018

Arbitrary alphabet and radix support #2

Arbitrary alphabet and radix support #2

Comments

anitgandhi commented Jun 29, 2017 • edited Loading

Description of Enhancement

fmpwizard commented Jul 17, 2017

anitgandhi commented Jul 17, 2017

fmpwizard commented Jul 17, 2017

anitgandhi commented Jul 17, 2017

anitgandhi commented Aug 29, 2017

fmpwizard commented Oct 7, 2017

anitgandhi commented Oct 10, 2017

fmpwizard commented Oct 10, 2017

fmpwizard commented Mar 12, 2018

anitgandhi commented Apr 24, 2018

fmpwizard commented Apr 24, 2018

anitgandhi commented Apr 24, 2018

fmpwizard commented Apr 24, 2018

UzairM commented Apr 27, 2018

fmpwizard commented Apr 27, 2018

fmpwizard commented May 8, 2018

anitgandhi commented May 9, 2018

fmpwizard commented May 9, 2018 • edited Loading

anitgandhi commented May 10, 2018

martinwaite commented Dec 15, 2018 • edited Loading

anitgandhi commented Dec 17, 2018

anitgandhi commented Jun 29, 2017 •

edited

Loading

fmpwizard commented May 9, 2018 •

edited

Loading

martinwaite commented Dec 15, 2018 •

edited

Loading