Implement custom alphabet support (#12)
* support multiple alphabets including custom alphabets
* drop support for 128 bit binary uuid format
* bump version to 4.0.0
gpedic authored Dec 26, 2024
1 parent 57f0b42 commit d0edc83
Showing 17 changed files with 1,032 additions and 331 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/ci.yml
@@ -12,8 +12,8 @@ jobs:
name: OTP ${{matrix.otp}} / Elixir ${{matrix.elixir}}
strategy:
matrix:
elixir: ["1.14", "1.15"]
otp: ["24.3.4.9", "25.3.2.3"]
elixir: ["1.14", "1.15", "1.16"]
otp: ["25.3.2.3"]
env:
MIX_ENV: test
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
2 changes: 2 additions & 0 deletions .gitignore
@@ -24,3 +24,5 @@ shortuuid-*.tar

# Temporary files, for example, from tests.
/tmp/

.elixir_ls
4 changes: 2 additions & 2 deletions .tool-versions
@@ -1,2 +1,2 @@
elixir 1.15.2-otp-25
erlang 25.3.2.3
elixir v1.18.0-otp-27
erlang 27.2
36 changes: 35 additions & 1 deletion CHANGELOG.md
@@ -5,7 +5,41 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]
## v4.0.0 (25.12.2024)

Breaking changes:
* Dropped support for binary UUID input

### Added
* Support for custom alphabets
* Predefined alphabets (base32, base58, base62, base64, etc.)
* `ShortUUID.Builder` module for creating custom ShortUUID modules

### Changed
* Moved core functionality to `ShortUUID.Core`
* Simplified main `ShortUUID` module interface
* Improved error messages and validation

```elixir
# Old v3.x code still works: the ShortUUID module uses the same alphabet as in v3.
# If you just want to keep using ShortUUID, no changes are required.
ShortUUID.encode(uuid)

# New in v4.x: use one of the predefined alphabets or define your own.
defmodule MyUUID do
  use ShortUUID.Builder, alphabet: :base58
end

MyUUID.encode(uuid)

defmodule MyCustomUUID do
  use ShortUUID.Builder, alphabet: "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
end

MyCustomUUID.encode(uuid)
```

## [Released]

## v3.0.0 (15.07.2023)

2 changes: 1 addition & 1 deletion LICENSE.md
@@ -1,6 +1,6 @@
# The MIT License

Copyright (c) 2019 Goran Pedić
Copyright (c) 2024 Goran Pedić

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
104 changes: 92 additions & 12 deletions README.md
@@ -10,22 +10,39 @@

<!-- MDOC !-->

ShortUUID is a lightweight Elixir library that generates short and unique IDs for use in URLs. It provides a solution when you need IDs that are easy to use and understand for users.
ShortUUID is a lightweight Elixir library for generating short, unique IDs. It turns standard UUIDs into shorter strings that are well suited for use in URLs.
You can choose from a set of predefined alphabets or define your own.
The default alphabet includes lowercase letters, uppercase letters, and digits, omitting characters like 'l', '1', 'I', 'O', and '0' to keep them readable.

Instead of using long and complex UUIDs, ShortUUID converts them into shorter strings using a combination of lowercase and uppercase letters, as well as digits. It avoids using similar-looking characters such as 'l', '1', 'I', 'O', and '0'.
**Note:** Different ShortUUID implementations should be compatible as long as they use the same alphabet. However, there is no official standard, so if you plan to use ShortUUID with other libraries, it's a good idea to research and test for compatibility.

**Note:** It's worth noting that different ShortUUID implementations should work together if they use the same set of characters. However, there is no official standard, so if you plan to use ShortUUID with other libraries, it's a good idea to research and test for compatibility.

Unlike some other libraries, ShortUUID doesn't generate UUIDs itself. Instead, you can input any valid UUID into the `ShortUUID.encode/1`. To generate UUIDs, you can use libraries like
Unlike some other solutions, ShortUUID does not produce UUIDs on its own as there are already plenty of libraries to do so. To generate UUIDs, use libraries such as
[Elixir UUID](https://github.com/zyro/elixir-uuid), [Erlang UUID](https://github.com/okeuday/uuid) and also [Ecto](https://hexdocs.pm/ecto/Ecto.UUID.html) as it can generate version 4 UUIDs.

ShortUUID supports common UUID formats and is case-insensitive. It also supports binary UUIDs returned from DBs like PostgreSQL when the uuid type is used to store the UUID.
ShortUUID supports common UUID formats and is case-insensitive.

## Compatibility

Starting with version `v3.0.0`, this library will follow suit with changes in other language implementations and move the most significant bit of the encoded value to the start. This also means that padding will be applied to the end of the string, not the start.
This change will restore compatibility with other libraries like [shortuuid](https://github.com/skorokithakis/shortuuid) from v1.0.0 onwards and [short-uuid](https://github.com/oculus42/short-uuid).
### v4.0.0 breaking changes

Raw binary UUID input (as `<<...>>`) is no longer supported. UUIDs must be provided as strings in standard UUID format (`"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"`) or as 32-character hex strings without hyphens.

Examples of supported formats:
```elixir
# Supported
"550e8400-e29b-41d4-a716-446655440000" # With hyphens
"550e8400e29b41d4a716446655440000" # Without hyphens

# No longer supported in v4.0.0
<<85, 14, 132, 0, 226, 155, 65, 212, 167, 22, 68, 102, 85, 68, 0, 0>>
```
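
If your UUIDs still arrive as raw 16-byte binaries (for example from a database driver), one option is to convert them to one of the supported string forms before encoding. The following is a minimal sketch, not part of ShortUUID: `UUIDHex` and its functions are hypothetical helper names, and only `Base.encode16/2` from the Elixir standard library is used.

```elixir
defmodule UUIDHex do
  @moduledoc """
  Hypothetical helper (not part of ShortUUID): turns a raw 16-byte UUID
  into the string forms that v4.0.0 still accepts.
  """

  # 32-character lowercase hex string without hyphens.
  def to_hex(<<_::128>> = raw), do: Base.encode16(raw, case: :lower)

  # Standard hyphenated 8-4-4-4-12 form.
  def to_hyphenated(<<_::128>> = raw) do
    <<a::binary-size(8), b::binary-size(4), c::binary-size(4), d::binary-size(4),
      e::binary-size(12)>> = to_hex(raw)

    Enum.join([a, b, c, d, e], "-")
  end
end

raw = <<85, 14, 132, 0, 226, 155, 65, 212, 167, 22, 68, 102, 85, 68, 0, 0>>

IO.puts(UUIDHex.to_hex(raw))
# prints 550e8400e29b41d4a716446655440000

IO.puts(UUIDHex.to_hyphenated(raw))
# prints 550e8400-e29b-41d4-a716-446655440000
```

Either resulting string can then be passed to `ShortUUID.encode/1`, since both formats are listed as supported above.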

### v3.0.0 breaking changes

Changed bit order and padding behavior to align with other language implementations:
- Most significant bits are now encoded first
- Padding characters appear at the end of the string
- Compatible with Python's [shortuuid](https://github.com/skorokithakis/shortuuid) v1.0.0+ and Node.js [short-uuid](https://github.com/oculus42/short-uuid)

Before `v3.0.0`
```elixir
@@ -68,7 +85,7 @@ Add `:shortuuid` to your list of dependencies in `mix.exs`:
```elixir
def deps do
[
{:shortuuid, "~> 3.0"}
{:shortuuid, "~> 4.0"}
]
end
```
@@ -95,6 +112,69 @@ If you would like to use ShortUUID with Ecto schemas try [Ecto.ShortUUID](https:

It provides a custom Ecto type which allows for ShortUUID primary and foreign keys while staying compatible with `:binary_id` (`Ecto.UUID`).

## Custom Alphabets

Starting with version `v4.0.0`, ShortUUID allows you to define custom alphabets for encoding and decoding UUIDs. You can use predefined alphabets or define your own.

### Restrictions

- The alphabet must contain at least 16 unique characters.
- The alphabet must not contain duplicate characters.
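
The two restrictions above can be checked with a few lines of Elixir. This is only an illustrative sketch: `AlphabetCheck`, its error atoms, and the choice to count characters with `String.graphemes/1` are assumptions for the example, not the library's actual validation code.

```elixir
defmodule AlphabetCheck do
  # Hypothetical validator mirroring the documented restrictions.
  @min_size 16

  def validate(alphabet) when is_binary(alphabet) do
    # Assumption: one "character" is one grapheme, so multi-codepoint
    # emoji count as a single alphabet entry.
    chars = String.graphemes(alphabet)
    unique = Enum.uniq(chars)

    cond do
      length(unique) != length(chars) -> {:error, :duplicate_characters}
      length(chars) < @min_size -> {:error, :too_short}
      true -> :ok
    end
  end
end

IO.inspect(AlphabetCheck.validate("0123456789ABCDEF"))
# prints :ok

IO.inspect(AlphabetCheck.validate("0123456789ABCDEFF"))
# prints {:error, :duplicate_characters}

IO.inspect(AlphabetCheck.validate("abc"))
# prints {:error, :too_short}
```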

### Predefined Alphabets

Starting with version `v4.0.0`, the following predefined alphabets are available:

- `:base57_shortuuid` - "23456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
- `:base32` - "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
- `:base32_crockford` - "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
- `:base32_hex` - "0123456789ABCDEFGHIJKLMNOPQRSTUV"
- `:base32_z` - "ybndrfg8ejkmcpqxot1uwisza345h769"
- `:base58` - "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
- `:base62` - "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
- `:base64` - "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
- `:base64_url` - "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"

### Using a custom or predefined alphabet

```elixir
defmodule MyBase58UUID do
  use ShortUUID.Builder, alphabet: :base58
end

defmodule MyCustomUUID do
  use ShortUUID.Builder, alphabet: "0123456789ABCDEF"
end

iex> MyBase58UUID.encode("550e8400-e29b-41d4-a716-446655440000")
{:ok, "BWBeN28Vb7cMEx7Ym8AUzs"}

iex> MyBase58UUID.decode("BWBeN28Vb7cMEx7Ym8AUzs")
{:ok, "550e8400-e29b-41d4-a716-446655440000"}
```
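
The size of the alphabet determines how long the encoded strings are. Assuming the output always uses the smallest number of characters that can represent every 128-bit value (which matches the 22-character `:base58` result above), the length for a given alphabet size can be estimated as follows. `EncodedLength` is a hypothetical module written for this sketch, not part of ShortUUID.

```elixir
defmodule EncodedLength do
  # Number of distinct 128-bit values a UUID can take.
  @uuid_space Integer.pow(2, 128)

  # Smallest n such that base^n >= 2^128, computed with exact integers
  # to avoid floating-point rounding near the boundary.
  def for_base(base) when is_integer(base) and base >= 2, do: count(base, 1, base)

  defp count(_base, n, capacity) when capacity >= @uuid_space, do: n
  defp count(base, n, capacity), do: count(base, n + 1, capacity * base)
end

for base <- [16, 32, 58, 64] do
  IO.puts("base #{base}: #{EncodedLength.for_base(base)} characters")
end
# prints:
# base 16: 32 characters
# base 32: 26 characters
# base 58: 22 characters
# base 64: 22 characters
```

A larger alphabet only helps up to a point: going from 58 to 64 characters does not shorten the result, because both need 22 characters to cover 128 bits.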

### Just for fun

Since v4.0.0, alphabets are not limited to alphanumeric characters either.

```elixir
defmodule UnicodeUUID do
  use ShortUUID.Builder, alphabet: "🌟💫✨⭐️🌙🌎🌍🌏🌑🌒🌓🌔🌕🌖🌗🌘"
end

iex> UnicodeUUID.encode("550e8400-e29b-41d4-a716-446655440000")
{:ok, "🌎🌎🌟🌗🌑🌙🌟🌟🌗✨🌒🌔🌙💫🌖🌙🌓🌏💫🌍🌙🌙🌍🌍🌎🌎🌙🌙🌟🌟🌟🌟"}


defmodule SmileyUUID do
use ShortUUID.Builder, alphabet: "๐Ÿ˜€๐Ÿ˜Š๐Ÿ˜„๐Ÿ˜๐Ÿฅฐ๐Ÿ˜˜๐Ÿ˜œ๐Ÿคช๐Ÿ˜‹๐Ÿค”๐Ÿ˜Œ๐Ÿง๐Ÿ˜๐Ÿ˜‘๐Ÿ˜ถ๐Ÿ˜ฎ๐Ÿ˜ฒ๐Ÿ˜ฑ๐Ÿ˜ด๐Ÿฅฑ๐Ÿ˜ช๐Ÿ˜ข๐Ÿ˜ญ๐Ÿ˜ค๐Ÿ˜Ž๐Ÿค“๐Ÿ˜‡๐Ÿ˜ˆ๐Ÿ‘ป๐Ÿ‘ฝ๐Ÿค–๐Ÿคก๐Ÿ’€"
end

iex> SmileyUUID.encode("550e8400-e29b-41d4-a716-446655440000")
{:ok, "๐Ÿ˜Š๐Ÿคช๐Ÿ˜ข๐Ÿ˜˜๐Ÿ’€๐Ÿฅฐ๐Ÿ˜ฒ๐Ÿ˜Š๐Ÿคก๐Ÿค–๐Ÿค”๐Ÿ˜Š๐Ÿ˜˜๐Ÿ˜ค๐Ÿ‘ฝ๐Ÿค“๐Ÿ‘ป๐Ÿ˜Š๐Ÿ‘ฝ๐Ÿ˜ฒ๐Ÿ˜‹๐Ÿ˜€๐Ÿ˜ญ๐Ÿ˜‡๐Ÿ˜ฒ๐Ÿค–"}

```

## Documentation

Look up the full documentation at [https://hexdocs.pm/shortuuid](https://hexdocs.pm/shortuuid).
@@ -105,9 +185,9 @@ Inspired by [shortuuid](https://github.com/skorokithakis/shortuuid).

## Copyright and License

Copyright (c) 2019 Goran Pedić
Copyright (c) 2024 Goran Pedić

This work is free. You can redistribute it and/or modify it under the
terms of the MIT License.

See the [LICENSE.md](./LICENSE.md) file for more details.
See the [LICENSE.md](./LICENSE.md) file for more details.
4 changes: 0 additions & 4 deletions bench/encode.exs
@@ -1,8 +1,4 @@
Benchee.run(%{
"encode/1 binary uuid" => fn ->
ShortUUID.encode(<<1, 96, 40, 15, 29, 112, 21, 104, 176, 151, 123, 220, 162, 128, 29, 227>>)
end,

"encode/1 hyphenated uuid string" => fn ->
ShortUUID.encode("2a162ee5-02f4-4701-9e87-72762cbce5e2")
end,