Customized tokenizer reports 'AssertionError' with specific infixes #1292
Comments
I have found a fix for this bug and would like to open a pull request.
Thanks, this looks good!
honnibal added a commit that referenced this issue Sep 4, 2017
Fix issue #1292 and add test case for the Assertion Error
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
When I try to add '—-' to the infix_re of a custom tokenizer, I get an AssertionError. The whole regex is
'[\[\]!&:,()\*—\-]'
I am using the head of spaCy, but the same error occurs on an earlier version. Only '—-' causes this problem.
After some hacking, I found a suspect for the error: the trailing '-' in the string.
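To make the suspicion concrete, here is a minimal sketch using only Python's standard `re` module, not spaCy's tokenizer code. It applies the infix pattern quoted above and splits a string around each match. The helper `split_on_infixes` is hypothetical and exists only for illustration. It shows that two adjacent infixes, or an infix at the very end of the string, leave an empty substring behind. Whether that empty piece is what actually triggers the assertion in spaCy is an assumption, not something this thread confirms.

```python
import re

# The infix pattern quoted in the report.
infix_re = re.compile(r'[\[\]!&:,()\*—\-]')


def split_on_infixes(text):
    """Hypothetical helper: split text around every infix match,
    keeping the matched infixes as separate pieces."""
    pieces = []
    start = 0
    for match in infix_re.finditer(text):
        pieces.append(text[start:match.start()])  # text before the infix
        pieces.append(match.group())              # the infix itself
        start = match.end()
    pieces.append(text[start:])                   # text after the last infix
    return pieces


# Two adjacent infixes leave an empty piece between them.
print(split_on_infixes("a—-b"))  # ['a', '—', '', '-', 'b']

# A trailing '-' leaves an empty piece at the end.
print(split_on_infixes("ab-"))   # ['ab', '-', '']
```

Any code that consumes these pieces and assumes each one is non-empty would need to skip or otherwise handle the empty strings.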
Your Environment