Bug fix to pseudocode for spaCy tokenizer #2604

Merged: 1 commit, Jul 27, 2018


giannisdaras
Contributor

Description

Types of change

This pull request fixes a bug in the pseudocode on the spaCy docs website for implementing and customizing the spaCy tokenizer.
The pseudocode provided there slices the substring incorrectly when a suffix is found. To make it right, the following lines of code

suffixes.append(substring[split:])
substring = substring[:split]

should be changed to:

suffixes.append(substring[-split:])
substring = substring[:-split]
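
To see why the sign matters, here is a minimal sketch comparing the two slicings. It assumes that `split` holds the length of the matched suffix, which is what the fix implies but is not stated in this thread. The helper names `slice_suffix_before` and `slice_suffix_after` and the sample string are hypothetical and are not part of spaCy.

```python
def slice_suffix_before(substring, split):
    """Original pseudocode: slices from the front of the string."""
    suffix = substring[split:]
    remainder = substring[:split]
    return remainder, suffix


def slice_suffix_after(substring, split):
    """Fixed pseudocode: slices from the end of the string."""
    suffix = substring[-split:]
    remainder = substring[:-split]
    return remainder, suffix


# Assumption: the suffix matcher reported a suffix of length 1 (the "!").
substring = "hello!"
split = 1

before = slice_suffix_before(substring, split)
after = slice_suffix_after(substring, split)

print(before)  # ('h', 'ello!')
print(after)   # ('hello', '!')
```

With positive indices the split is counted from the start of the string, so everything after the first character is treated as the suffix. Negative indices count from the end, which is where a suffix sits.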

Checklist

  • I have submitted the spaCy Contributor Agreement.
  • I ran the tests, and all new and existing tests passed.
  • My changes don't require a change to the documentation, or if they do, I've added all required information.

ines added the "docs" (Documentation and website) label on Jul 27, 2018
@ines
Member

ines commented Jul 27, 2018

Ah, you're right – thanks a lot! 👍

ines merged commit 055cc0d into explosion:master on Jul 27, 2018