_in_stoplist should return True for entities trimmed out of existence #12

Closed
dan2097 opened this issue Jan 14, 2017 · 1 comment

dan2097 commented Jan 14, 2017

For an entity like "-aromatic", which is in IGNORE_SUFFIX, the entity left after running _in_stoplist has length 0. The entity should therefore be ignored (i.e. the function should return True) rather than reported as a 0-length entity.

For an entity that matches both IGNORE_PREFIX and IGNORE_SUFFIX, you can get into a situation where the end index is actually before the start index!

from chemdataextractor import Document
d = Document("non-aromatic")
d.cems
[Span(u'', 4, 3)]

I assume that adding a check that the resultant entity's length is > 0 will fix that case as well.
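
To make the suggested check concrete, here is a minimal standalone sketch of the idea. It does not use the library's actual code: `trim_entity` and the two one-element lists are hypothetical stand-ins, and I am assuming "non-" is the prefix being stripped in the example above. It only shows how trimming a prefix and a suffix can push the end index before the start index, and how a length check catches both that case and the plain 0-length case.

```python
# Hypothetical stand-ins for the real stoplists, which contain more entries.
IGNORE_PREFIX = ["non-"]
IGNORE_SUFFIX = ["-aromatic"]


def trim_entity(text, start, end):
    """Trim an ignored prefix and suffix from text[start:end].

    Returns the trimmed (start, end) offsets, or None when nothing is left,
    which is the case where the entity should be ignored.
    """
    entity = text[start:end]
    for prefix in IGNORE_PREFIX:
        if entity.startswith(prefix):
            start += len(prefix)
            break
    for suffix in IGNORE_SUFFIX:
        if entity.endswith(suffix):
            end -= len(suffix)
            break
    # Without this check, "non-aromatic" would come back as (4, 3).
    if end - start <= 0:
        return None
    return start, end


print(trim_entity("non-aromatic", 0, 12))  # None
print(trim_entity("-aromatic", 0, 9))      # None
print(trim_entity("non-benzene", 0, 11))   # (4, 11)
```

In the real function the equivalent of returning None would presumably be returning True from _in_stoplist so the entity is dropped.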

mcs07 closed this as completed in 3a7bc53 Jan 22, 2017

mcs07 commented Jan 22, 2017

Oops! Nice catch, thanks.

mcs07 added the bug label Feb 2, 2017