incorrect spacing in REPL for combining characters #6939

stevengj · 2014-05-23T17:52:06Z

If you type e.g. \alpha<TAB>\hat<TAB>, it makes α̂. However, on my machine (MacOS) it displays an extra space after the character, which weirdly disappears when you hit <RETURN> or <TAB>.

cc: @loladiro


aelg · 2014-05-27T08:55:30Z

Also, typing \hat<TAB> at an empty julia> prompt puts the ^ on the > of the prompt.

stevengj · 2014-05-27T12:13:43Z

@aelg, \hat generates a Unicode combining character, which applies the hat to the previous character. So it isn't going to work quite like LaTeX (which applies the hat to the subsequent character).

Keno · 2014-05-27T12:15:07Z

We could probably put some kind of noncombining separator after the prompt though to prevent that from happening.

aelg · 2014-05-27T12:22:19Z

@stevengj No, I understand that. I just thought it was worth mentioning, as it's probably not what anyone would want. It seems related enough to the bug you reported to mention it here instead of creating a new issue.

Keno · 2014-06-09T17:36:04Z

What should the behavior for navigating across combining characters be?
Some options:

1. Do the same thing as now and just fix the display. This would mean that a|^ and a^| (where ^ indicates combining hat and | indicates cursor position) look the same.
2. Step over the combined character.
3. Do something more fancy such as splitting the combining characters when the cursor is in between them.

JeffBezanson · 2014-06-09T17:50:35Z

Option 1 sounds good.

stevengj · 2014-06-09T18:00:23Z

Option 2 sounds better to me. (But I'd still like to have to hit the key twice to delete the combined character, so that I can delete just the decoration.) But option 1 should be fine for now.

Note that utf8proc will identify graphemes for you, if you want to move the cursor in units of graphemes.

carlobaldassi · 2014-06-09T18:02:29Z

FWIW in vim the behaviour is option 2. I don't know about other editors but it should be easy to test now that all of them implement latex substitution.

Keno · 2014-06-09T18:02:38Z

I've been using option 1 for the past 5 minutes and I hate it, so I'll try option 2 now.

Keno · 2014-06-09T18:03:54Z

And by that I mean just navigation. Deletion will still delete the combining character.
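For illustration only (this is not the LineEdit.jl code): a minimal Python sketch of option 2 as described above, where the cursor steps over a base character together with its combining marks, while deletion still removes a single code point. The helper names (`is_combining`, `move_right`, `move_left`, `backspace`) are hypothetical, and treating Unicode categories Mn and Me as "combining" is an assumption of this sketch.

```python
import unicodedata


def is_combining(ch: str) -> bool:
    """Assumption: a mark is 'combining' if it has a nonzero combining
    class or is a nonspacing/enclosing mark (categories Mn, Me)."""
    return unicodedata.combining(ch) != 0 or unicodedata.category(ch) in ("Mn", "Me")


def move_right(buf: str, pos: int) -> int:
    """Advance past one base character and any combining marks after it."""
    if pos >= len(buf):
        return len(buf)
    pos += 1
    while pos < len(buf) and is_combining(buf[pos]):
        pos += 1
    return pos


def move_left(buf: str, pos: int) -> int:
    """Step back over combining marks until a base character is reached."""
    if pos <= 0:
        return 0
    pos -= 1
    while pos > 0 and is_combining(buf[pos]):
        pos -= 1
    return pos


def backspace(buf: str, pos: int):
    """Delete exactly one code point before the cursor, so the first
    press removes only the decoration and the second removes the base."""
    if pos <= 0:
        return buf, 0
    return buf[:pos - 1] + buf[pos:], pos - 1


buf = "x\u0302y"                     # x, COMBINING CIRCUMFLEX ACCENT, y
after_x_hat = move_right(buf, 0)     # cursor lands after the hat, not between x and the hat
stripped, cursor = backspace(buf, after_x_hat)  # removes only the hat
```

With this split, the cursor never rests between a letter and its hat, yet the hat can still be removed on its own.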

Fix #6939

stevengj · 2014-06-10T18:47:00Z

Still doesn't work for me in MacOS 10.8.5 Terminal. Typing x\hat<TAB> gives an extra space (and then typing subsequent characters, arrow keys, delete etc. is wonky).

Keno · 2014-06-10T18:53:30Z

Odd, let me see.

Keno · 2014-06-10T18:58:30Z

Works for me on OS X 10.9.3. What is strwidth(string(:x\hat<TAB>))?

stevengj · 2014-06-10T19:37:07Z

strwidth(string(:x̂)) gives 2.

stevengj · 2014-06-10T19:38:53Z

charwidth(char(0x0302)) returns 1 for me. Looks like the wcwidth function may not be trustworthy on MacOS 8?

StefanKarpinski · 2014-06-10T19:40:08Z

You must mean OS X 10.8, right, not actually MacOS 8? (MacOS 8 predates Unicode.)

Keno · 2014-06-10T19:40:17Z

Ah, that's wrong. Maybe we should include the appropriate table? Last time the policy that @StefanKarpinski proposed on that was "Get a better OS", but maybe now that it's important that's different.

StefanKarpinski · 2014-06-10T19:41:02Z

Haha. I can't be held to every asinine thing I've ever said ;-)

jiahao · 2014-06-10T19:42:41Z

At least what you said wasn't "arsenate".

stevengj · 2014-06-10T19:45:47Z

We are already using a replacement wcwidth function (src/support/wcwidth.c) for Windows, as I understand it. Maybe just use it elsewhere as well? At least on MacOS 10.8 and earlier (including MacOS 8, of course).

(Though it might be a bit out of date; it looks like it needs to be updated for Unicode 6.)

stevengj · 2014-06-10T19:50:38Z

Or maybe we should just use utf8proc to get the unicode category, and assign a width of 0 to combining characters and 1 to everything else?

@jiahao, does the latest REPL handle CJK characters sensibly if they are assigned a charwidth of 2 (which is what src/support/wcwidth.c seems to do)?
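As a rough illustration of the category-based idea (width 0 for combining characters, 1 for everything else), here is a minimal Python sketch. It uses Python's standard `unicodedata` module as a stand-in for utf8proc, so it says nothing about utf8proc's actual API. The function names `char_width` and `str_width` are hypothetical, and the extra rule giving East Asian wide/fullwidth characters a width of 2 is an assumption added to match the CJK question.

```python
import unicodedata


def char_width(ch: str) -> int:
    """Estimate terminal columns for one code point (sketch only)."""
    # Nonspacing and enclosing marks draw on top of the previous character.
    if unicodedata.category(ch) in ("Mn", "Me"):
        return 0
    # Assumption: East Asian Wide (W) and Fullwidth (F) take two columns.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1


def str_width(s: str) -> int:
    """Sum of per-character widths, analogous in spirit to strwidth."""
    return sum(char_width(ch) for ch in s)


# x followed by U+0302 COMBINING CIRCUMFLEX ACCENT occupies one column here.
width_of_x_hat = str_width("x\u0302")
```

Control characters and other special cases are ignored; a real table would need to handle them.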

jiahao · 2014-06-11T18:19:13Z

I haven't noticed much craziness with displaying CJK characters. Korean input, however, relies heavily on combining vowels and consonants (which can be input separately) into syllables (which are rendered as individual characters); those should be double-width.

stevengj · 2014-06-11T21:00:10Z

@jiahao, we might only use our custom wcwidth on Windows. What is the charwidth of a CJK character on your machine?

stevengj · 2014-06-12T16:51:27Z

It would be nice if we could get this from utf8proc, but I don't see a charwidth there at first glance. Of course, first utf8proc has to be updated for Unicode 6, and maybe at the same time its database could be updated to include character widths. (Unfortunately, there is no public version-control repository for utf8proc, although the author told me in February that he was willing to set one up, pending some cleanup.)

Keno · 2014-06-12T16:52:17Z

Yes, that would be ideal.

Checks output of charwidth against latest Unicode character tables (see UAX #11). Ref: JuliaLang/julia#6939

mbauman · 2014-07-02T21:17:12Z

Hrm. Now some of the super- and sub-script latex characters are behaving funny in the REPL, too. Mac OS 10.9.2 seems to think that all super- and sub-script letters have width 0. Symbols and numbers seem to be ok, though.

julia> charwidth('ᴿ')
0

julia> charwidth('ᵦ')
0

julia> charwidth('₁')
1

julia> charwidth('⁺')
1

I haven't had a chance to figure out when this broke, but I'm pretty sure this worked at one point.

stevengj · 2014-07-03T16:35:17Z

@mbauman, charwidth('ᴿ') == 0 on MacOS 10.7.5 as well.

mbauman · 2014-07-03T17:07:05Z

Thinking about this more, I bet a bisect would blame the fix for this issue (953a1d4). These super- and sub-scripts are just collateral damage in making combining characters work properly.

stevengj · 2014-07-03T17:19:22Z

@mbauman, I don't follow you. The charwidth function is simply a thin wrapper around the wcwidth C library function, so I don't see how it would have been affected by REPL patches.

What is happening seems to be that the OS X wcwidth function is simply buggy. (On GNU/Linux, charwidth('ᴿ') returns 1.) And on Windows the wcwidth function is utterly broken because it takes a 16-bit argument, so it has no chance of working except for a subset of Unicode (the BMP), which is why on Windows we already use our own wcwidth replacement function.
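To make the 16-bit limitation concrete, here is a small Python sketch (an illustration, not the Windows C runtime). It counts how many 16-bit UTF-16 units a character needs and shows what is left if a code point is forced into 16 bits. The helper names `utf16_units` and `truncate_to_16_bits` are hypothetical.

```python
def utf16_units(ch: str) -> int:
    """Number of 16-bit units needed to encode one character in UTF-16."""
    return len(ch.encode("utf-16-le")) // 2


def truncate_to_16_bits(code_point: int) -> int:
    """What remains of a code point if only its low 16 bits are kept."""
    return code_point & 0xFFFF


bmp_char = "\u0302"          # COMBINING CIRCUMFLEX ACCENT, inside the BMP
astral_char = "\U0001D4D0"   # a mathematical letter outside the BMP

# A BMP character fits in a single 16-bit value and survives unchanged.
assert utf16_units(bmp_char) == 1
assert truncate_to_16_bits(ord(bmp_char)) == ord(bmp_char)

# A non-BMP character needs two units, so a single 16-bit argument
# cannot identify it; truncation yields a different code point.
assert utf16_units(astral_char) == 2
assert truncate_to_16_bits(ord(astral_char)) != ord(astral_char)
```

Any width lookup keyed on a single 16-bit value therefore cannot distinguish characters outside the BMP.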

mbauman · 2014-07-03T17:21:09Z

Yup, exactly. It's just that (I think) the REPL didn't honor charwidth until that patch (actually, maybe it was a different patch; I haven't looked closely at the changes). It's the correct behavior… it just stinks that we need to work around buggy implementations.

stevengj · 2014-12-17T20:56:40Z

Couldn't the charwidth(c) != 0 tests in LineEdit.jl be replaced by isprint(c)?


stevengj added REPL labels May 23, 2014

Keno added a commit that referenced this issue Jun 9, 2014

Fix #6939

982f7c7

Keno closed this as completed in 953a1d4 Jun 10, 2014

Keno added a commit that referenced this issue Jun 10, 2014

Merge pull request #7210 from JuliaLang/kf/replcombine

10f5754

Fix #6939

stevengj reopened this Jun 10, 2014

jiahao added a commit to jiahao/jin that referenced this issue Jun 13, 2014

Added quick check of charwidth

c5b5474

Checks output of charwidth against latest Unicode character tables (see UAX #11). Ref: JuliaLang/julia#6939

stevengj mentioned this issue Jun 24, 2014

test failure in lineedit.jl #7403

Closed

JeffBezanson mentioned this issue Jul 13, 2014

utf8proc doesn't support unicode 6 #7582

Closed

stevengj mentioned this issue Jul 16, 2014

add charwidth property JuliaStrings/utf8proc#2

Closed

danluu mentioned this issue Sep 8, 2014

make testall fails in replcompletions #8268

Closed

stevengj mentioned this issue Nov 21, 2014

Test failure on OSX #9071

Closed

stevengj mentioned this issue Mar 28, 2015

update utf8proc to 1.2 #10654

Closed

stevengj added a commit to stevengj/julia that referenced this issue Mar 28, 2015

replace wcwidth by utf8proc_charwidth (fixes JuliaLang#3721, closes J…

40d5719

…uliaLang#6939)

stevengj added a commit to stevengj/julia that referenced this issue Mar 28, 2015

replace wcwidth by utf8proc_charwidth (fixes JuliaLang#3721, closes J…

7d908f9

…uliaLang#6939)

stevengj mentioned this issue Mar 28, 2015

update utf8proc, replace wcwidth #10659

Merged

stevengj added a commit to stevengj/julia that referenced this issue Mar 29, 2015

replace wcwidth by utf8proc_charwidth (fixes JuliaLang#3721, closes J…

6bf5e61

…uliaLang#6939)

stevengj added a commit to stevengj/julia that referenced this issue Mar 30, 2015

replace wcwidth by utf8proc_charwidth (fixes JuliaLang#3721, closes J…

e48d8d0

…uliaLang#6939)

stevengj added a commit to stevengj/julia that referenced this issue Mar 30, 2015

update libmojibake -> utf8proc 1.2 (closes JuliaLang#10654); replace …

baefdfb

…wcwidth by utf8proc_charwidth (fixes JuliaLang#3721, closes JuliaLang#6939)

stevengj added a commit to stevengj/julia that referenced this issue Mar 30, 2015

update libmojibake -> utf8proc 1.2 (closes JuliaLang#10654); replace …

9f58ecd

…wcwidth by utf8proc_charwidth (fixes JuliaLang#3721, closes JuliaLang#6939)

stevengj added a commit to stevengj/julia that referenced this issue Mar 30, 2015

update libmojibake -> utf8proc 1.2 (closes JuliaLang#10654); replace …

0fb98b3

…wcwidth by utf8proc_charwidth (fixes JuliaLang#3721, closes JuliaLang#6939)

stevengj added a commit to stevengj/julia that referenced this issue Mar 30, 2015

update libmojibake -> utf8proc 1.2 (closes JuliaLang#10654); replace …

90c0fb4

…wcwidth by utf8proc_charwidth (fixes JuliaLang#3721, closes JuliaLang#6939)

stevengj added a commit to stevengj/julia that referenced this issue Mar 30, 2015

update libmojibake -> utf8proc 1.2 (closes JuliaLang#10654); replace …

95e29d2

…wcwidth by utf8proc_charwidth (fixes JuliaLang#3721, closes JuliaLang#6939)

stevengj added a commit to stevengj/julia that referenced this issue Mar 30, 2015

update libmojibake -> utf8proc 1.2 (closes JuliaLang#10654); replace …

4813b29

…wcwidth by utf8proc_charwidth (fixes JuliaLang#3721, closes JuliaLang#6939)

stevengj added a commit to stevengj/julia that referenced this issue Mar 30, 2015

update libmojibake -> utf8proc 1.2 (closes JuliaLang#10654); replace …

60aee18

…wcwidth by utf8proc_charwidth (fixes JuliaLang#3721, closes JuliaLang#6939)

stevengj added a commit to stevengj/julia that referenced this issue Mar 30, 2015

update libmojibake -> utf8proc 1.2 (closes JuliaLang#10654); replace …

c3c0411

…wcwidth by utf8proc_charwidth (fixes JuliaLang#3721, closes JuliaLang#6939)

stevengj closed this as completed in 58578b0 Mar 30, 2015
