textclass on w seems a decent solution for making explicit which text was the source for tokenisation. I'm not sure it's even needed for other structural elements then.
Multiple mutually exclusive structural nodes would be quite a change and problematic, especially with regard to backward compatibility. It relates to the current limitation that FoLiA can't really deal with multiple tokenisations. Revising that would be a major operation (FoLiA v2.0?) and I'm not even sure that this would be the way to go about it.
I agree that this has a major impact, and needs more thought.
So let's introduce the textclass on words only for now.
I would strongly suggest adding a constraint too:
IF a word has an explicit textclass attribute
THEN it may have 1 textcontent child only, of the same class.
This ensures that when software starts to use this new feature, it never constructs FoLiA that would need repair in the future.
For backward compatibility, we need to accept words without an explicit textclass and with several textcontents in different classes (the latter is enforced already).
A FoLiA 2.0 would need to repair these.
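The constraint proposed above could be checked along the following lines. This is only a sketch under assumptions: the `textclass` attribute on `w` is the proposal from this thread, not something taken from a released FoLiA specification; "1 textcontent child only" is read here as "at most one"; and the helper names `check_word` and `find_violations` are made up for illustration. It uses only Python's standard `xml.etree.ElementTree`, not any FoLiA library.

```python
import xml.etree.ElementTree as ET

FOLIA_NS = "http://ilk.uvt.nl/folia"            # FoLiA XML namespace
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"  # expanded name of xml:id

# Hypothetical sample: the textclass attribute on <w> is the proposed feature.
SAMPLE = f"""
<s xmlns="{FOLIA_NS}">
  <w xml:id="w.1" textclass="other"><t class="other">ok</t></w>
  <w xml:id="w.2" textclass="other"><t>a</t><t class="other">b</t></w>
  <w xml:id="w.3"><t>a</t><t class="other">b</t></w>
</s>
"""


def check_word(word):
    """Return a list of constraint violations for a single <w> element."""
    textclass = word.get("textclass")
    if textclass is None:
        # Legacy word without explicit textclass: accepted for
        # backward compatibility, so nothing to check.
        return []
    # Only direct <t> children are considered in this sketch.
    texts = word.findall(f"{{{FOLIA_NS}}}t")
    problems = []
    if len(texts) > 1:
        problems.append(f"{len(texts)} textcontent children, expected 1")
    for t in texts:
        # A <t> without a class attribute is in the default class "current".
        cls = t.get("class", "current")
        if cls != textclass:
            problems.append(f"textcontent class '{cls}' != textclass '{textclass}'")
    return problems


def find_violations(root):
    """Map the xml:id of each offending word to its list of problems."""
    result = {}
    for word in root.iter(f"{{{FOLIA_NS}}}w"):
        problems = check_word(word)
        if problems:
            result[word.get(XML_ID)] = problems
    return result


violations = find_violations(ET.fromstring(SAMPLE))
for word_id, problems in violations.items():
    print(word_id, problems)
```

In this sample only `w.2` is reported: `w.1` satisfies the constraint, and `w.3` has no explicit textclass, so it falls under the backward-compatibility exemption.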
Consider the following FoLiA fragment:
Clearly the sentence is tokenized by ucto on the 'other' textclass, but there is no way to express this.
A simple solution would be to allow for textclass on the word level:
But this raises some questions about the 'orphaned' current text. Wouldn't it be better to have it connected to another word? Like this:
This could also be raised to the sentence level then:
This might be a solution for the problem of multiple/different tokenizations in one FoLiA document.
But again it raises questions: