
Error in Colab Example #6

Open
msfasha opened this issue Apr 14, 2023 · 2 comments

Comments

msfasha commented Apr 14, 2023

Running the following command raises the error shown below:

# Beam search is the default generation method on Turjuman
!turjuman_translate --text "As US reaches one million COVID deaths, how are Americans coping?"

IndexError: too many indices for tensor of dimension 2

/usr/local/bin/turjuman_translate:8 in <module>
      5 from turjuman_cli.translate import translate_cli
      6 if __name__ == '__main__':
      7     sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
  ❱   8     sys.exit(translate_cli())
      9

/usr/local/lib/python3.9/dist-packages/turjuman_cli/translate.py:76 in translate_cli
     73
     74     torj = turjuman(logger, args.cache_dir)
     75     if input_source=="text":
  ❱  76         torj.translate_from_text (args.text, args.search_method, args.s…
     77     elif input_source=="file":
     78         torj.translate_from_file (args.input_file, args.search_method, …
     79

/usr/local/lib/python3.9/dist-packages/turjuman/turjuman.py:93 in translate_from_text
     90         outputs = self.translate(sources, search_method, seq_length, m…
     91
     92         if max_outputs==1:
  ❱  93             targets = outputs['target'][0]
     94         else:
     95             targets = outputs[str(max_outputs)+'_targets'][0]
     96         if type(targets) == list:

msfasha changed the title from "Cannot install locally" to "Error in Colab Example" on Apr 14, 2023
@obada-jaras

Same issue here.

@obada-jaras

Use this Colab example. Note that in that example the output is the token IDs; to get readable text, you have to follow this comment (the decode step is sketched below).
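
A minimal sketch of that decode step, assuming the torj model object and the target output that appear in the full snippet further down in this comment:

# Convert the returned token ids back to text with the model's tokenizer.
# `torj` and `target` are the names used in the full snippet below.
result = torj.tokenizer.batch_decode(target, skip_special_tokens=True)
print(result)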

Also, I faced another error and solved it using these two lines of code:

import torch
torch.set_default_tensor_type('torch.cuda.FloatTensor')

I think you only need this if you have set the runtime to GPU (a guarded variant is sketched below).
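
A slightly more defensive version of that workaround; the torch.cuda.is_available() guard is my addition, not part of the original suggestion, so the same cell also runs on a CPU runtime:

import torch

# Force CUDA float tensors only when a GPU runtime is actually available;
# on a CPU runtime the default tensor type is left unchanged.
if torch.cuda.is_available():
    torch.set_default_tensor_type('torch.cuda.FloatTensor')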


Here's my final code that runs properly:

import torch
torch.set_default_tensor_type('torch.cuda.FloatTensor')

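The snippets below use a torj object that isn't created in this excerpt. A minimal sketch of the missing setup, assuming the constructor signature shown in the traceback above (turjuman(logger, args.cache_dir)), an import path inferred from the installed package layout, and a hypothetical cache directory:

import logging
from turjuman.turjuman import turjuman  # module path assumed from .../dist-packages/turjuman/turjuman.py above

logger = logging.getLogger("turjuman")
torj = turjuman(logger, "/content/turjuman_cache")  # cache_dir argument assumed from the CLI traceback
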
(2) Translate using beam search (default)

beam_options = {"search_method": "beam", "seq_length": 1024, "num_beams": 5, "no_repeat_ngram_size": 2, "max_outputs": 3}
target = torj.translate("As US reaches one million COVID deaths, how are Americans coping?", **beam_options)
result = torj.tokenizer.batch_decode(target, skip_special_tokens=True)
print(result)

(3) Translate using greedy search

greedy_options = {"search_method": "greedy", "seq_length": 1024}
target = torj.translate("As US reaches one million COVID deaths, how are Americans coping?", **greedy_options)
result = torj.tokenizer.batch_decode(target, skip_special_tokens=True)
print(result)

(4) Translate using sampling search

sampling_options = {"search_method": "sampling", "seq_length": 1024, "max_outputs": 3, "top_p": 0.95, "top_k": 50}
target = torj.translate("As US reaches one million COVID deaths, how are Americans coping?", **sampling_options)
result = torj.tokenizer.batch_decode(target, skip_special_tokens=True)
print(result)
