Token ids generated instead of translation #3

Open
ahmedoumar opened this issue Jul 7, 2022 · 6 comments · May be fixed by #5

Comments

@ahmedoumar

ahmedoumar commented Jul 7, 2022

Hey there, I hope you're doing fine.
When I run turj.translate, it returns the token ids instead of the actual translation.
(See the output below.)
2022-07-07 10:41:43 | INFO | turjuman.translate | Using beam search
tensor([[ 0, 6538, 2, 76, 6380, 1]])

@elmadany
Member

elmadany commented Jul 7, 2022

Hi Ahmed,
Could you please provide more details, such as your input sentence and a screenshot?
Thanks

@ahmedoumar
Author

[Screenshot from 2022-07-07 12-02-16]
As you can see, turj.translate returns output ids instead of the translation. I solved this by using the tokenizer to decode the ids back into text:
tokenizer.decode(target, skip_special_tokens=True, clean_up_tokenization_spaces=True)

@elmadany
Member

elmadany commented Jul 7, 2022

To integrate Turjuman with your Python code, take a look at this notebook:
https://colab.research.google.com/github/UBC-NLP/turjuman/blob/main/examples/Integrate_turjuman_with_your_code.ipynb
Thanks

@ahmedoumar
Author

When you run that notebook, you get only the target ids, as shown in the screenshot.

@elmadany
Member

elmadany commented Jul 7, 2022

Thanks, Ahmed. We will check this soon.

@kabapy

kabapy commented Sep 11, 2022

Quick fix:
result = torj.tokenizer.batch_decode(target, skip_special_tokens=True)
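For anyone who wants to see what that decoding step does without downloading a model, here is a minimal offline sketch. Everything in it is an assumption made for illustration: `StubTokenizer`, its tiny vocabulary, the ids, and the `ids_to_text` helper are all hypothetical and are not part of Turjuman. The only thing borrowed from the thread is the shape of the call, `batch_decode(..., skip_special_tokens=True)`.

```python
from typing import List, Sequence


class StubTokenizer:
    """Hypothetical stand-in exposing only batch_decode, so the sketch runs offline.

    The vocabulary below is invented for illustration; it is NOT Turjuman's.
    """

    def __init__(self) -> None:
        self.vocab = {0: "<pad>", 1: "</s>", 2: "<unk>", 10: "hello", 11: "world"}
        self.special_ids = {0, 1, 2}

    def batch_decode(self, sequences: Sequence[Sequence[int]],
                     skip_special_tokens: bool = False) -> List[str]:
        decoded = []
        for ids in sequences:
            # Optionally drop padding / end-of-sequence / unknown markers.
            kept = [i for i in ids
                    if not (skip_special_tokens and i in self.special_ids)]
            decoded.append(" ".join(self.vocab.get(i, "<unk>") for i in kept))
        return decoded


def ids_to_text(tokenizer, generated) -> List[str]:
    """Decode a batch of generated ids into strings, one per input sentence."""
    # A torch tensor exposes .tolist(); plain nested lists pass through unchanged.
    batch = generated.tolist() if hasattr(generated, "tolist") else generated
    return tokenizer.batch_decode(batch, skip_special_tokens=True)


generated = [[0, 10, 11, 1]]  # made-up ids, shaped like the tensor reported above
print(ids_to_text(StubTokenizer(), generated))  # prints ['hello world']
```

With the real library, the equivalent call would presumably be `ids_to_text(torj.tokenizer, target)`, assuming `torj.tokenizer` is a Hugging Face tokenizer as the quick fix above implies.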

mohammad-albarham linked a pull request on Apr 9, 2023 that will close this issue.