Thanks for reading this issue. When I run `CUDA_VISIBLE_DEVICES=2 python3 seq2seq/eval_run_seq2seq.py configs/cosql/eval_cosql_rasat_576.json` inside the Docker container, I get the following error.
Truncation was not explicitly activated but max_length is provided a specific value, please use truncation=True to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to truncation.
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 6.38ba/s]
01/07/2024 08:31:51 - WARNING - stanza - Can not find mwt: default from official model list. Ignoring it.
Traceback (most recent call last):
  File "seq2seq/eval_run_seq2seq.py", line 310, in <module>
    main()
  File "seq2seq/eval_run_seq2seq.py", line 177, in main
    tokenizer=tokenizer,
  File "/app/seq2seq/utils/dataset_loader.py", line 123, in load_dataset
    **_prepare_splits_kwargs,
  File "/app/seq2seq/utils/dataset.py", line 360, in prepare_splits
    pre_process_function=pre_process_function,
  File "/app/seq2seq/utils/dataset.py", line 324, in _prepare_eval_split
    use_dependency=data_training_args.use_dependency
  File "/app/seq2seq/preprocess/choose_dataset.py", line 12, in preprocess_by_dataset
    preprocessing_generate_lgerels(data_base_dir, dataset_name, mode, use_coref, use_dependency)
  File "/app/seq2seq/preprocess/process_dataset.py", line 81, in preprocessing_generate_lgerels
    processor = Preprocessor(dataset_name, db_dir=db_dir, db_content=True)
  File "/app/seq2seq/preprocess/common_utils.py", line 146, in __init__
    self.nlp_tokenize = stanza.Pipeline('en', processors='tokenize,mwt,pos,lemma,depparse', tokenize_pretokenized = False, use_gpu=True)#, use_gpu=False)
  File "/home/toolkit/.local/lib/python3.7/site-packages/stanza/pipeline/core.py", line 107, in __init__
    self.load_list = add_dependencies(resources, lang, self.load_list) if lang in resources else []
  File "/home/toolkit/.local/lib/python3.7/site-packages/stanza/resources/common.py", line 245, in add_dependencies
    default_dependencies = resources[lang]['default_dependencies']
KeyError: 'default_dependencies'
Thanks for any solution; it would really help me.
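For context, the `KeyError` is raised inside stanza's resource lookup: `add_dependencies` reads `resources['en']['default_dependencies']` from the downloaded `resources.json`, and that key is missing when the English models were never downloaded or were fetched by an incompatible stanza version. A minimal sketch of the failing lookup and the usual workaround follows; the `has_default_dependencies` helper is illustrative only, not part of stanza:

```python
# Illustrative helper (not part of stanza): mirrors the dictionary lookup in
# stanza/resources/common.py that raises the KeyError in the traceback above.
def has_default_dependencies(resources: dict, lang: str = "en") -> bool:
    return lang in resources and "default_dependencies" in resources[lang]

# A stale or incomplete resources entry reproduces the failure mode:
assert not has_default_dependencies({"en": {}})

# The usual workaround is to refresh ~/stanza_resources/resources.json by
# re-downloading the models before constructing the pipeline:
#   import stanza
#   stanza.download("en", processors="tokenize,mwt,pos,lemma,depparse")
```

If the download step is run inside the same container (with network access) before invoking `eval_run_seq2seq.py`, the `stanza.Pipeline('en', ...)` call in `common_utils.py` should then find a valid `default_dependencies` entry.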