Determines the monospace display width of a string in Ruby. The implementation is based on EastAsianWidth.txt and other data, and is written 100% in Ruby. Unlike wcwidth(), which fulfills a similar purpose, it does not rely on the OS vendor to provide an up-to-date method for measuring string width.
Unicode version: 9.0.0
Guessing the correct space a character will consume on terminals is not easy. There is no single standard. Most implementations combine data from East Asian Width, some General Categories, and hand-picked adjustments.
Rows nearer the top of the table take precedence over the rows below them. Please expect changes to this algorithm with every MINOR version update (the X in 1.X.0)!
Width | Characters | Comment |
---|---|---|
X | (user defined) | Overwrites any other values |
-1 | "\b" | Backspace (total width never below 0) |
0 | "\0", "\x05", "\a", "\n", "\v", "\f", "\r", "\x0E", "\x0F" | C0 control codes that do not change horizontal width |
1 | "\u{00AD}" | SOFT HYPHEN |
2 | "\u{2E3A}" | TWO-EM DASH |
3 | "\u{2E3B}" | THREE-EM DASH |
0 | General Categories: Mn, Me, Cf (non-arabic) | Excludes ARABIC format characters |
0 | "\u{1160}".."\u{11FF}" | HANGUL JUNGSEONG |
2 | East Asian Width: F, W | Full-width characters |
1 or 2 | East Asian Width: A | Ambiguous characters, user defined, default: 1 |
1 | All other codepoints | - |
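
The precedence rules in the table can be pictured as a chain of checks that stops at the first match. The following is only a minimal sketch under stated assumptions, not the gem's implementation: the names `sketch_char_width` and `sketch_display_width` are hypothetical, the code point ranges are a tiny hand-picked sample instead of the real Unicode data, and looking up the user-defined widths with `[]` (which works for both a hash and a proc) is an assumption about how such an override could be handled.

```ruby
# Toy model of the precedence table. NOT the gem's implementation or data:
# every constant below is a small hand-picked sample chosen for illustration.
C0_ZERO_WIDTH     = [0x00, 0x05, 0x07, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F].freeze
FIXED_WIDTHS      = { 0x00AD => 1, 0x2E3A => 2, 0x2E3B => 3 }.freeze
# A few Mn/Cf code points. SOFT HYPHEN (U+00AD) is Cf as well, so it only
# gets width 1 because its dedicated row is checked first.
ZERO_WIDTH_SAMPLE = [(0x0300..0x036F), (0x00AD..0x00AD), (0x200B..0x200B)].freeze
HANGUL_JUNGSEONG  = (0x1160..0x11FF)
WIDE_SAMPLE       = [(0x4E00..0x9FFF), (0xFF01..0xFF60)].freeze # some W and F ranges
AMBIGUOUS_SAMPLE  = [0x00B7].freeze                             # MIDDLE DOT

# Walks the table from top to bottom and returns the first matching width.
def sketch_char_width(codepoint, ambiguous, overwrite)
  user_width = overwrite[codepoint] # Hash#[] and Proc#[] both work here
  return user_width if user_width
  return -1 if codepoint == 0x08    # backspace
  return 0 if C0_ZERO_WIDTH.include?(codepoint)
  return FIXED_WIDTHS[codepoint] if FIXED_WIDTHS.key?(codepoint)
  return 0 if ZERO_WIDTH_SAMPLE.any? { |range| range.cover?(codepoint) }
  return 0 if HANGUL_JUNGSEONG.cover?(codepoint)
  return 2 if WIDE_SAMPLE.any? { |range| range.cover?(codepoint) }
  return ambiguous if AMBIGUOUS_SAMPLE.include?(codepoint)
  1
end

# Sums the per-character widths; the total is clamped so it never drops below 0.
def sketch_display_width(string, ambiguous = 1, overwrite = {})
  total = string.codepoints.sum { |cp| sketch_char_width(cp, ambiguous, overwrite) }
  [total, 0].max
end

puts sketch_display_width("一")                      # 2
puts sketch_display_width("·", 2)                    # 2
puts sketch_display_width("a\tb", 1, { 0x09 => 10 }) # 12
puts sketch_display_width("\b")                      # 0
```

Because the checks run in table order, a code point that matches several rows always takes the width of the topmost one.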
Install the gem with:
gem install unicode-display_width
Or add to your Gemfile:
gem 'unicode-display_width'
require 'unicode/display_width'
Unicode::DisplayWidth.of("⚀") # => 1
Unicode::DisplayWidth.of("一") # => 2
The second parameter sets the width used for characters whose East Asian Width is ambiguous:
Unicode::DisplayWidth.of("·", 1) # => 1
Unicode::DisplayWidth.of("·", 2) # => 2
You can overwrite how to handle specific code points by passing a hash (or even a proc) as third parameter:
Unicode::DisplayWidth.of("a\tb", 1, 0x09 => 10) # => 12
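
One typical reason to measure display width is padding text so that terminal columns line up. The sketch below is a hypothetical helper, not part of the gem. `pad_to_columns` and `STAND_IN_MEASURE` are made-up names, and the stand-in measure only knows about CJK ideographs so that the example runs without the gem installed. With the gem loaded, `Unicode::DisplayWidth.method(:of)` could be passed as the `measure` argument instead.

```ruby
# Hypothetical helper (not part of the gem): pads +text+ with spaces until it
# occupies +columns+ terminal cells. +measure+ is any callable that returns a
# display width for a string.
def pad_to_columns(text, columns, measure)
  padding = [columns - measure.call(text), 0].max
  text + (" " * padding)
end

# Stand-in measure so the sketch runs on its own: CJK unified ideographs
# count as 2 cells, everything else as 1. This is an assumption for the demo.
STAND_IN_MEASURE = lambda do |string|
  string.each_char.sum { |char| char.ord.between?(0x4E00, 0x9FFF) ? 2 : 1 }
end

[["一", "one"], ["abc", "letters"]].each do |label, note|
  puts pad_to_columns(label, 6, STAND_IN_MEASURE) + note
end
```

Padding by display width instead of `String#length` keeps the second column aligned even when a label contains full-width characters.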
The String extension is activated by default. It will be deactivated in version 2.0:
require 'unicode/display_width/string_ext'
"⚀".display_width #=> 1
'一'.display_width #=> 2
You can actively opt out of the string extension with: require 'unicode/display_width/no_string_ext'
Use this one-liner to print out display widths for strings from the command-line:
$ gem install unicode-display_width
$ ruby -r unicode/display_width -e 'puts Unicode::DisplayWidth.of $*[0]' -- "一"
Replace "一" with the actual string to measure.
- Python: https://github.com/jquast/wcwidth
- JavaScript: https://github.com/mycoboco/wcwidth.js
- C: http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
- C for Julia: JuliaStrings/utf8proc#2
See unicode-x for more Unicode-related micro libraries.
- Copyright (c) 2011, 2015-2016 Jan Lelis, http://janlelis.com, released under the MIT license
- Early versions based on runpaint's unicode-data interface: Copyright (c) 2009 Run Paint Run Run
- Unicode data: http://www.unicode.org/copyright.html#Exhibit1