Selection with search and unicode #1686
Comments
Seems to work fine for me on mac/master, let me know if you still see it.
@Tyriar Nope, it's not gone, still the same here. Maybe it's a platform issue? It looks like the selector counts the accented char as 2 halfwidth chars, while the ¥ symbol gets treated as one halfwidth. Found this in the code: xterm.js/src/addons/search/SearchHelper.ts Line 210 in 9e446a9
Imho the last argument should be the sum of wcwidth instead of the string length (not tested yet).
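To illustrate the difference between string length and the sum of wcwidth, here is a minimal sketch. `cellWidth` is a hypothetical, heavily simplified stand-in for a real wcwidth table (it only knows combining diacritical marks and the Fullwidth Forms block), and the sample strings use explicit escapes: `e` plus U+0301 for the combining case, U+FFE5 as an assumed fullwidth example, and U+13080 written as a surrogate pair.

```typescript
// Simplified stand-in for a wcwidth table (assumption: not the table xterm.js ships).
function cellWidth(codePoint: number): number {
  if (codePoint >= 0x0300 && codePoint <= 0x036f) return 0; // combining diacritical marks
  if (codePoint >= 0xff01 && codePoint <= 0xff60) return 2; // fullwidth ASCII variants
  if (codePoint >= 0xffe0 && codePoint <= 0xffe6) return 2; // fullwidth signs, e.g. U+FFE5
  return 1;
}

// Sum of cell widths over code points (not UTF-16 code units).
function stringCellWidth(s: string): number {
  let width = 0;
  for (let i = 0; i < s.length; i++) {
    let code = s.charCodeAt(i);
    // Join a surrogate pair into a single code point.
    if (code >= 0xd800 && code <= 0xdbff && i + 1 < s.length) {
      const low = s.charCodeAt(i + 1);
      if (low >= 0xdc00 && low <= 0xdfff) {
        code = (code - 0xd800) * 0x400 + (low - 0xdc00) + 0x10000;
        i++;
      }
    }
    width += cellWidth(code);
  }
  return width;
}

// combining, fullwidth, surrogate pair
const samples = ["e\u0301", "\uffe5", "\ud80c\udc80"];
for (const s of samples) {
  // logs "2 1", then "1 2", then "2 1": length and cell width disagree in all three cases
  console.log(s.length, stringCellWidth(s));
}
```

In all three cases `length` and the cell width differ, which is why passing the string length as a cell count shifts the selection.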
@jerch are you on Linux?
Yes, Ubuntu 16 here.
I guess we need to have a setting for this stuff like you were suggesting before. Still not sure of the best way of querying the platform for these character widths though; I doubt we can rely on all Linux distros being the same and macOS being a different case.
Same underlying issue as #1059?
Nope, this time it's not wcwidth's fault. Changing the argument I mentioned above fixes the problems (tested a few minutes ago).
Some background on this: This can be fixed the same way I had to fix the linkifier underlining in #1769, by mapping a string index back to the buffer index: Line 223 in c7fa89d
If done twice (match start and end) the selection will correctly point to the underlying cells.
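A minimal sketch of that back-mapping, assuming a toy line model rather than the real xterm.js buffer API: each cell carries its characters and a width, and a wide character is followed by a width-0 placeholder cell. The `Cell` interface and `stringIndexToCol` are hypothetical names used only for illustration.

```typescript
// Toy cell model (assumption): a wide char is followed by a width-0 placeholder cell.
interface Cell {
  chars: string; // base char plus any combining chars, possibly a surrogate pair
  width: number; // 0, 1 or 2
}

// Map an index into the line's string back to the column of the cell containing it.
function stringIndexToCol(cells: Cell[], stringIndex: number): number {
  let remaining = stringIndex;
  for (let col = 0; col < cells.length; col++) {
    if (cells[col].width === 0) continue; // placeholder contributes no string content
    const len = cells[col].chars.length;
    if (remaining < len) return col;
    remaining -= len;
  }
  return cells.length; // index at or past the end of the line
}

// Line content: "é" (e + U+0301), fullwidth yen, U+13080, "x"
const cells: Cell[] = [
  { chars: "e\u0301", width: 1 },
  { chars: "\uffe5", width: 2 },
  { chars: "", width: 0 },
  { chars: "\ud80c\udc80", width: 1 },
  { chars: "x", width: 1 },
];

// "x" sits at string index 5 but at column 4; map both ends of the match.
const startCol = stringIndexToCol(cells, 5);
const endCol = stringIndexToCol(cells, 6); // exclusive end
console.log(startCol, endCol);
```

Mapping both the match start and the (exclusive) match end this way yields a column range that lines up with the cells actually rendered.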
This still happens. This is the problem line:
Hello, I would like to join my peer @miggs125 in contributing to xterm by tackling this issue. I will first attempt to improve selection of strings that include diacritical marks.
@Silvyre Sure thing. Note that the terminal buffer already stores diacritical (combining) characters in the same cell as the main character, so the issue comes from mapping string positions back to cells.
@jerch Thanks!
Are you referring to the JoinedCellData type? As far as I can tell, this base type is not currently used within search selection (search selection appears to handle buffer cells as objects of IBufferCell type, which is not part of the ICellData hierarchy).
Starting to get on the same page. OK, modifying _findInLine to return I can't imagine this to be a satisfactory solution, considering that, as you mentioned, this does not work for all surrogate/fullwidth character combinations [across various platforms] (e.g. selection of
To clarify, it sometimes works, as shown in this GIF, which I created after replacing every instance of
@Silvyre Yes, working with wcwidth correction is the right way to go here. Imho it is needed once for the search term itself (in case it contains weird chars) to get the number of cells taken ("cell length"); then you'd need to correct every start offset found likewise to find the real cell offset. That cell-offset + term-cell-length % cols should give the real start and end position in the buffer.
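A rough sketch of that calculation, under stated assumptions: `matchToCellRange` and `CellRange` are hypothetical names, the line is the unwrapped string with its origin at col 0 of the first wrapped row, and the width function is passed in so that any wcwidth implementation can be plugged in. It ignores the padding cell a terminal may leave when a wide char does not fit at the end of a row.

```typescript
type WidthFn = (s: string) => number;

interface CellRange {
  startRow: number; // row offset relative to the first row of the wrapped line
  startCol: number;
  endRow: number;
  endCol: number; // exclusive
}

// Convert a string match (index + term) into cell coordinates for a given terminal width.
function matchToCellRange(
  line: string,
  term: string,
  matchIndex: number,
  cols: number,
  widthOf: WidthFn
): CellRange {
  // Cells occupied by everything before the match: the corrected start offset.
  const startCell = widthOf(line.substring(0, matchIndex));
  // "Cell length" of the search term itself.
  const endCell = startCell + widthOf(term);
  return {
    startRow: Math.floor(startCell / cols),
    startCol: startCell % cols,
    endRow: Math.floor(endCell / cols),
    endCol: endCell % cols,
  };
}

// Toy width function for the example (assumption): U+FFE5 is 2 cells, everything else 1.
const toyWidth: WidthFn = (s) => {
  let w = 0;
  for (let i = 0; i < s.length; i++) {
    w += s.charCodeAt(i) === 0xffe5 ? 2 : 1;
  }
  return w;
};

const line = "\uffe5\uffe5\uffe5abc";
const range = matchToCellRange(line, "abc", line.indexOf("abc"), 4, toyWidth);
console.log(range);
```

With three fullwidth chars in front, the match at string index 3 starts at cell 6, which on a 4-column terminal is row 1, col 2.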
@jerch Excellent, I'll work on that. Thanks again!
I have a general question regarding addons and dependencies: how are helper functions in
@Silvyre They aren't yet; the public API gets extended on request. Thus you'd have to go with internal refs for now. Maybe open an issue regarding this so we can decide how and where to put it.
Sure thing, I'll open an issue.
@jerch I'm having a bit of a difficult time determining how and where cell offsetting should be (or is) implemented. Within BufferLine.ts?
I've also noticed that
Maybe, it's meant to be undefined for various types of selection if I remember right though (word, line, select all).
Ah yep, that's a bit hidden in the codebase. The code regarding this is in Buffer.ts and BufferLine.ts; both contain several methods that demonstrate how to walk cells. The easiest starting point might be this: xterm.js/src/common/buffer/Buffer.ts Line 480 in e8153d9
Not sure if you can directly use this method; you have to take care where your string index origin is (whether col 0 of wrapped or unwrapped lines).
@Tyriar Hi, does this issue have a solution yet? I tried loading xterm-addon-unicode11, but it only fixes the display of emoji chars; searching for Chinese chars still has the issue.
Been a while since I looked at this code but I think we could expose the active
@Tyriar How could we fix or work around this in production? It seems what you wrote above is internal code, and I have no idea what I should do...
Combining, surrogate or fullwidth chars in the line and/or the search string lead to weird selection offset problems. Steps to repro:
echo -en 'combining: ééé\nfullwidth: ¥¥¥\nsurrogate: 𓂀𓂀𓂀\n'
The selection is kinda off for all 3 types, and it gets even worse if the line contains any of these before their occurrence. It seems the renderer and the selection manager do not agree on the chars' widths and lengths.
Since I had a similar problem with the linkifier, it might be fixable the same way (#1678).