"китаб" instead of "китап" #28

Open
mansayk opened this issue Jan 22, 2019 · 9 comments
Labels: help wanted, lexc

Comments

@mansayk
Member

mansayk commented Jan 22, 2019

@IlnarSelimcan, I commented out the lemma "китаб" (leaving only "китап"), because it is not valid in modern Tatar. If you want to use it for old texts, maybe there are special markers that would exclude such entries from compilation in regular mode. I'm sure there are other archaic/historical words that we should take care of.
@jonorthwash, @ftyers, what is the best way here?

@jonorthwash
Member

I think the <err_orth> approach might make sense for archaic forms? Or should we do something different?

@ftyers
Member

ftyers commented Jan 23, 2019

I think err_orth would be OK, or maybe use_arch?

@mansayk
Member Author

mansayk commented Jan 23, 2019

So, I should write it this way in the lexc file:
китаб:китаб N1 ; ! "use_arch"
or
китаб:китаб N1 ; ! "err_orth"
right?

@ftyers
Member

ftyers commented Jan 23, 2019

! Use/Arch

! Err/Orth
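
To illustrate the marker format above, here is a minimal Python sketch of how a preprocessing step could detect `! Use/Arch` or `! Err/Orth` in the trailing comment of a lexc line and leave such entries out of a regular build. This is only an assumption about how the markers might be consumed; the helper names `entry_markers` and `strip_marked` are hypothetical and are not part of apertium-tat or any Apertium tool.

```python
import re

# Marker names as written in this thread; how they are consumed is an assumption.
MARKERS = ("Use/Arch", "Err/Orth")


def entry_markers(line):
    """Return the set of markers found in the '!' comment part of one lexc line."""
    # In lexc, '!' starts a comment that runs to the end of the line;
    # a literal '!' is escaped as '%!', so escaped ones are skipped.
    match = re.search(r"(?<!%)!", line)
    if match is None:
        return set()
    comment = line[match.start():]
    return {
        marker
        for marker in MARKERS
        if re.search(r"!\s*" + re.escape(marker) + r"\b", comment)
    }


def strip_marked(lines, exclude=MARKERS):
    """Keep only the lines that carry none of the excluded markers."""
    excluded = set(exclude)
    return [line for line in lines if not (entry_markers(line) & excluded)]
```

Applied to the two entries discussed here, `strip_marked` would keep the plain "китап" entry and drop the one whose comment ends in `! Use/Arch`.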

@mansayk
Member Author

mansayk commented Jan 23, 2019

I tried to use
китаб:китаб N1 ; ! "" ! Use/Arch
but it doesn't have any effect.
echo 'китаб' | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' | apertium-retxt
^китаб/китаб<n><sg><nom>$
There is no additional tag or any other difference.

@ftyers
Member

ftyers commented Jan 23, 2019

@mansayk, what should the result be?

@mansayk
Member Author

mansayk commented Jan 23, 2019

Let's suppose we have the word "китабым". As the lemma I get "китаб" instead of the correct "китап". An additional <err_orth> tag won't help here, because I still don't get the correct lemma. I need to remove these words (marked as Err/Orth or Use/Arch) from the analysis altogether.

@jonorthwash
Member

@ftyers, @mansayk, you seem not to be communicating well. Let me try to help...

@mansayk, I believe that the Use/Arch functionality does not yet exist, and @ftyers is offering to implement it. If it were to work as expected, I believe the correct entry would be the following:

китап:китаб N1 ; ! "book"  ! Use/Arch

@ftyers, I believe you'd want the output of analysis to be something like the following. @mansayk, could you confirm that this makes sense to you as well?

^китаб/китап<n><sg><nom><use_arch>$
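
As a rough illustration of what a consumer could do with such output, the Python sketch below splits one lexical unit into its surface form and readings, and optionally drops readings that carry `<use_arch>` or `<err_orth>`. It is a simplification written for this thread: the function names `parse_unit` and `drop_tagged` are hypothetical, and the parser ignores escaped characters and multiword or unknown-word readings.

```python
import re


def parse_unit(unit):
    """Split '^surface/reading1/reading2$' into (surface, [(lemma, tags), ...])."""
    if not (unit.startswith("^") and unit.endswith("$")):
        raise ValueError("not a lexical unit: %r" % unit)
    # Simplification: a plain split on '/' does not handle escaped slashes.
    surface, *readings = unit[1:-1].split("/")
    parsed = []
    for reading in readings:
        lemma = reading.split("<", 1)[0]
        tags = re.findall(r"<([^<>]+)>", reading)
        parsed.append((lemma, tags))
    return surface, parsed


def drop_tagged(unit, unwanted=("use_arch", "err_orth")):
    """Return the unit's readings without those carrying an unwanted tag."""
    surface, readings = parse_unit(unit)
    blocked = set(unwanted)
    kept = [(lemma, tags) for lemma, tags in readings if not (set(tags) & blocked)]
    return surface, kept
```

With the proposed output, `parse_unit` yields the lemma "китап" together with the `use_arch` tag, so a regular-mode consumer could call `drop_tagged` to discard the archaic reading while a tool for old texts keeps it.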

jonorthwash added the "help wanted" and "lexc" labels Jan 27, 2019
@mansayk
Member Author

mansayk commented Jan 28, 2019

Yes,
^китаб/китап<n><sg><nom><use_arch>$
seems good to me. It gives the correct lemma and has the special tag 'use_arch'.

4 participants