Eval process issue #25

Open
fengxin-zhxx opened this issue Jan 7, 2024 · 0 comments
Thanks for reading this issue. When I run CUDA_VISIBLE_DEVICES="2" python3 seq2seq/eval_run_seq2seq.py configs/cosql/eval_cosql_rasat_576.json inside the Docker container, I get the following error:

Truncation was not explicitly activated but max_length is provided a specific value, please use truncation=True to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to truncation.
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 6.38ba/s]
01/07/2024 08:31:51 - WARNING - stanza - Can not find mwt: default from official model list. Ignoring it.
Traceback (most recent call last):
  File "seq2seq/eval_run_seq2seq.py", line 310, in <module>
    main()
  File "seq2seq/eval_run_seq2seq.py", line 177, in main
    tokenizer=tokenizer,
  File "/app/seq2seq/utils/dataset_loader.py", line 123, in load_dataset
    **_prepare_splits_kwargs,
  File "/app/seq2seq/utils/dataset.py", line 360, in prepare_splits
    pre_process_function=pre_process_function,
  File "/app/seq2seq/utils/dataset.py", line 324, in _prepare_eval_split
    use_dependency=data_training_args.use_dependency
  File "/app/seq2seq/preprocess/choose_dataset.py", line 12, in preprocess_by_dataset
    preprocessing_generate_lgerels(data_base_dir, dataset_name, mode, use_coref, use_dependency)
  File "/app/seq2seq/preprocess/process_dataset.py", line 81, in preprocessing_generate_lgerels
    processor = Preprocessor(dataset_name, db_dir=db_dir, db_content=True)
  File "/app/seq2seq/preprocess/common_utils.py", line 146, in __init__
    self.nlp_tokenize = stanza.Pipeline('en', processors='tokenize,mwt,pos,lemma,depparse', tokenize_pretokenized = False, use_gpu=True)#, use_gpu=False)
  File "/home/toolkit/.local/lib/python3.7/site-packages/stanza/pipeline/core.py", line 107, in __init__
    self.load_list = add_dependencies(resources, lang, self.load_list) if lang in resources else []
  File "/home/toolkit/.local/lib/python3.7/site-packages/stanza/resources/common.py", line 245, in add_dependencies
    default_dependencies = resources[lang]['default_dependencies']
KeyError: 'default_dependencies'
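
To help narrow this down, here is a small sketch that mirrors the failing lookup from the last traceback frame (resources[lang]['default_dependencies']) against a resources file on disk. It is only a diagnostic, not a confirmed explanation of the error. The helper names (has_default_dependencies, check_resources_file) are hypothetical, and the path ~/stanza_resources/resources.json is an assumption about where stanza keeps its resource index in this container; it may differ if the resources directory was configured elsewhere.

```python
import json
import os


def has_default_dependencies(resources: dict, lang: str = "en") -> bool:
    """Return True if resources[lang] has the key the traceback fails on.

    Hypothetical helper: it only mirrors the dictionary lookup shown in the
    traceback and does not call into stanza itself.
    """
    entry = resources.get(lang, {})
    return isinstance(entry, dict) and "default_dependencies" in entry


def check_resources_file(path: str, lang: str = "en") -> bool:
    """Load a resources JSON file and run the same check on it."""
    with open(path, encoding="utf-8") as f:
        return has_default_dependencies(json.load(f), lang)


if __name__ == "__main__":
    # Assumed location of the resource index; adjust if it lives elsewhere.
    path = os.path.join(os.path.expanduser("~"), "stanza_resources", "resources.json")
    if os.path.exists(path):
        found = check_resources_file(path)
        print(f"{path}: 'default_dependencies' present for 'en': {found}")
    else:
        print(f"{path} not found; the resources directory may be somewhere else")
```

If the key turns out to be missing, that would at least suggest that the resources file and the installed stanza version do not agree on the file layout, though I have not confirmed this is the cause.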

Any solution would be greatly appreciated. This is really important for me.

fengxin-zhxx changed the title from "Eval peocess issue" to "Eval process issue" on Jan 7, 2024