Random results from foliavalidator and folia2txt #100
Comments
I can reproduce with a much smaller file too: |
I'm not sure whether the example file is a real-world example or just an example to illustrate this issue, but I see two problems:
|
I'm still a bit puzzled about what caused the actual randomness, though. |
Actually, I think the randomness may have been caused by the unordered map from which the text classes were read: its iteration order is not guaranteed, so the behaviour differed depending on which class was checked first. |
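To illustrate the suspected cause, here is a minimal Python sketch. It is not taken from foliapy or libfolia; the function name `first_mismatch` and the sample data are hypothetical. It assumes the check stops at the first text class whose shallow and deep text differ, so the reported class depends on the order in which classes are visited. Sorting the class names pins that order.

```python
def first_mismatch(text_classes, shallow, deep):
    """Return the first text class whose shallow and deep text differ, or None.

    Hypothetical helper for illustration only; not part of any FoLiA library.
    """
    # sorted() fixes the checking order. Iterating the raw set instead could
    # visit the classes in a different order from one run to the next.
    for cls in sorted(text_classes):
        if shallow.get(cls) != deep.get(cls):
            return cls
    return None


# Made-up data: two of the three classes are inconsistent.
classes = {"current", "Ticcl", "OCR"}
shallow = {"current": "a b", "Ticcl": "a b c", "OCR": "a b"}
deep = {"current": "a", "Ticcl": "a b", "OCR": "a b"}

result = first_mismatch(classes, shallow, deep)
```

With the sorted order, the same class is reported on every run. Without it, either of the two inconsistent classes could be reported first.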
OK, buggy2.txt was just a made-up excerpt, and the use of class names is questionable. Still, I think it is valid FoLiA, and foliavalidator should be deterministic. |
Yes, the non-deterministic quirk is definitely not good. I did commit a fix for that. |
But the fix isn't released, so I wonder if that helps @martinreynaert using LaMachine? |
released now |
I am quite sure this is NOT solved completely yet.
|
Addition:
Which puzzles me too. |
Agreed, the document looks okay; the validator seems to miss a word entirely when getting the deep text.
That's a legitimate error: Original can only occur together with New, not with Current (Current and New can both occur with Suggestion). |
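The combination rule stated above can be sketched as a small check. This is an illustration under assumptions, not the validator's actual code: `check_correction` is a hypothetical name, and the function only encodes the one restriction mentioned in this thread (Original may not be combined with Current).

```python
def check_correction(children):
    """Return a list of rule violations for the child element names of one correction.

    Hypothetical helper; it encodes only the restriction discussed in this thread.
    """
    present = set(children)
    errors = []
    # Original belongs with New; combining it with Current is rejected.
    if "original" in present and "current" in present:
        errors.append("original cannot be combined with current")
    # Suggestion is allowed alongside either Current or New, so no check is needed.
    return errors


ok_new = check_correction(["new", "original"])
ok_suggestion = check_correction(["current", "suggestion"])
bad = check_correction(["current", "original"])
```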
Ok, I will add this limitation to libfolia too. |
There was actually already a failing test case covering this issue that I missed; the release was a bit premature. |
Done. It needed quite a bit of rework, unfortunately. |
This should now be solved, the tests are all green again and the example you gave also validates. Has been released as foliapy v2.5.3. |
Still, I assume something is wrong. See: here the (illogical) correction from class="current" to class="Ticcl" is made. |
Yes, you're right, there's still some assumption that the "current" class is always the latest, most up-to-date class. I'll see if we can get rid of that without breaking anything else. |
I'm not sure about the Python implementation, but libfolia checks the text for every textclass in the document. |
yeah, foliapy does the same |
Given this file:
buggy.txt
foliavalidator and folia2txt give random outcomes: