Hi, thanks for your project!

I've been trying to use your work to punctuate some audio files in Portuguese, but I got stuck on some problems with the tokenizer.

First, in punctuate.py I got:

```
line 84, in __init__
    self.tokenizer = self.whisper_tokenizer.tokenizer
AttributeError: 'Tokenizer' object has no attribute 'tokenizer'
```

After removing the `.tokenizer`, I got another error in punctuate.py, at line 221: the tokenizer has no `convert_ids_to_tokens` method (`tokenizer.convert_ids_to_tokens`).

Do you have any idea why this is happening?
The issue you are experiencing is due to a recent change in Whisper (openai/whisper#1044), which replaced Hugging Face's tokenizer with tiktoken. I will modify this repository to ensure compatibility with the latest version of Whisper.
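Until the repository is updated, a small compatibility shim along these lines could bridge the two APIs. This is only a sketch: `ids_to_tokens` is a hypothetical helper name, and it assumes the old Hugging Face tokenizer is reachable via a `.tokenizer` attribute with `convert_ids_to_tokens`, while the newer tiktoken-based `Tokenizer` exposes a `decode` method that takes a list of ids.

```python
def ids_to_tokens(whisper_tokenizer, ids):
    """Convert token ids to token strings across Whisper versions.

    Hypothetical helper: handles both the old Hugging Face-backed
    tokenizer (whisper <= 20230308) and the newer tiktoken-based one.
    """
    inner = getattr(whisper_tokenizer, "tokenizer", None)
    if inner is not None and hasattr(inner, "convert_ids_to_tokens"):
        # Old API: the wrapped Hugging Face tokenizer does this directly.
        return inner.convert_ids_to_tokens(ids)
    # New API (tiktoken): decode each id individually to get one
    # string per token.
    return [whisper_tokenizer.decode([i]) for i in ids]
```

Decoding ids one at a time is slower than a batched decode, but it preserves the per-token output shape that `convert_ids_to_tokens` used to give.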
In the meantime, as a workaround, you can pin the older version of Whisper by running the following command:

```
pip install openai-whisper==20230308
```
Thank you for bringing this to my attention and please let me know if you have any further questions or concerns.