-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
typeanswer: [type:nc] – ignores combining characters #3422
Conversation
0cd34f0
to
1922d25
Compare
Thanks twwn, I'll try to review this soon. (For my own reference, previous PR: #680) |
331f027
to
817e719
Compare
The minor commits are suggestions, each with a link to relevant Stack Overflow threads in their messages.
I'll close this PR & reopen a new one after review, the incremental rebases are rather tedious. |
If you could bring the branch up to date with main (either here or in a new PR if that's easier), that'll make it easier for me to review. |
32132e7
to
14df674
Compare
Done. The small commits are just suggestions. I've split the NFKD commit further: The second one is only there if you want the extra function gone. But since that requires adding a bool parameter, I myself would rather not actually merge it. |
Thanks, I'll get back to this again soon. In case you didn't see it, there's been a report about issues with the recent changes: https://forums.ankiweb.net/t/anki-24-10-beta/49989/22 |
That report, it's since I can…
… each option for both cases, or mix them. The original behavior is restored with It's a moot argument if changes are not to break anything, but IMHO it's really bad card design to clutter the field that you type with additional tags or even media. That's why I feel like the most natural way (if that word makes sense here) is to add |
So HTML tags like bold would appear when nothing is typed, and not appear otherwise? I fear that inconsistency might lead to some additional "bug" reports.
If you're saying that you think users should put those things in a different field that's not being typed in, I fully agree, though sadly it seems to be quite common to put things like sound tags in the same field. |
Yes, it would break with the old behavior. But it's just a suggestion, I entirely understand if it's not deemed good/worthwhile. I guess I'll just restore |
Sounds good, and no worries; it'll likely be a few days before I can build another beta anyway. |
Fix: Add prepare_expected back in for the 'nothing typed' & 'correctly typed' cases. This also makes expected_original redundant again. Style: %s/provided/typed/g Style: rename one ch → c Testcase: whitespace_is_trimmed: added a check for the "correctly typed" path and renamed it to tags_removed (there's no whitespace?) Testcase: empty_input_shows_as_code: changed to also check that tags get trimmed
14df674
to
0734dff
Compare
Adds a comparison variant to [type] which ignores when combining characters of the expected field are missing from the provided input. It still shows these characters in the 'expected' line for reference. It's useful for languages with e.g. diacritics that are required for reference (such as in dictionaries), but rarely actually learned or used in everyday writing. Among these languages: Arabic, Hebrew, Persian, Urdu. The bool 'combining' controls it as new final parameter of both relevant compare_answer functions. On the Python side, it's set to true by default. Use on the note templates: [type:nc:field] (only the front needs to include :nc) This also removes the need to have both variants of words/sentences present as separate fields, to show them redundantly, etc.
Requires adjusting two testcases, but both render exactly the same in Anki itself. On NFC vs. NKFD: https://stackoverflow.com/a/77432079
…parameter) I'd prefer to keep this extra method.
Should get rid of most relocations, at the expense of over-allocating. On Vec's (String's) behavior: https://stackoverflow.com/a/72787776
0734dff
to
9fbacbf
Compare
Thank you for the atomic commits and multiple PRs, that has made my life a lot easier. :-) Regarding the NFKD change, I'm a bit worried this might have unintended side effects on other inputs, but hopefully if that's the case it'll become apparent during beta testing. The micro optimisation was probably unnecessary here, but good practice for hotter code paths. I've only reverted it because it caused a conflict for the commit above it. |
Am I correct in assuming that https://forums.ankiweb.net/t/anki-24-10-beta/49989/143 is caused by the NFKD simplification? If that change has caused us to output a mix of NFC (note text) and NFD, that seems suboptimal. Thoughts? |
I would have said: Although this deck's issue could be from unsupported Javascript, we can easily revert to NFC for the default comparison. But that deck has another issue: It doesn't at all use Seems to me to be a serious issue of spaghetti code that banked on previous broken use not blowing up. Even if this unsupported special case is accounted for again, it's possible to just break again in other unrelated ways in the future. Also, I don't think we're using a mix of NFC & NFKD to begin with? edit: Rewritten. |
Alright, I dunno why returning to NFC didn't work out the first time when I wrote the prior comment's versions. I've now recompiled a fresh checkout and it actually does seem to fix it. So I'd for now propose to use NFC instead throughout:
So I'll revert (into the open PR) and just keep testing. Best case: It just works for both comparisons. Worst case: NFKD returns only for the NC comparison. Unrelated: I asked the protobuf maintainers about an Anki startup crash. They say the bug is within Anki. |
(PR #3482) |
I agree we don't want to be spending much time looking into issue with third-party JS, but I was concerned in this case that it might be indicative of a larger problem. Anki does normalize fields into NFC by default, so outputting the typing comparison in NFC is a good idea. I'm not sure what the crash is about. Did you try Anki's binary build? You could try figuring out which exact Python protobuf call is triggering the crash, and then see if feeding the same data into the call manually (outside of Anki) recreates it. |
I haven't observed the crash with the binary builds (rc2/rc3), they may yet be unaffected. But even on my distro build, the crash occurs frequently but not consistently, as if caused by some race condition. I wish the error message gave an actual pointer as to where to look, but it's left me clueless. Anyway, Arch and Anki seem to be using the same protobuf version? If I understand correctly, the only difference would be that the former builds from the general tarball, and the latter only pulls the python bindings from PyPi. |
IIRC, Arch's |
Yes. I just tried with The former invokes edit: My only guess would be that it's related to the startup sync. Since when I kill my connection, Anki starts fine, only complaining…
… and when I then reestablish the connection, Anki (no matter if |
Building on my previous PR's cleanup, adds a comparison variant to
[type]
which ignores when combining characters of the expected field are missing from the typed input. It still shows these characters in the "expected" line for reference.It's useful for languages with e.g. diacritics that are required for reference (such as in dictionaries), but rarely actually learned or used in everyday writing. Among these languages: Arabic, Hebrew, Persian, Urdu.
Previously:
Now:
[This also removes the need to have both variants of words/sentences present as separate fields, to show them redundantly, etc.]
The bool
combining
controls it as new final parameter of both relevantcompare_answer
functions. In Python, it's set to true by default.Use on the note templates:
[type:nc:field]