Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix exception in textMatchScore() when text is empty #3538

Merged
merged 4 commits into from
Jun 24, 2021

Conversation

mickael-menu
Copy link
Contributor

I observed an exception thrown from textMatchScore() if text is empty, because matches[0].errors is undefined in this case:

return 1 - (matches[0].errors / str.length);

An empty text occurs when the length of the context.prefix is larger than the prefix of the first match in the page. It gives a negative first parameter for this slice():

text.slice(match.start - context.prefix.length, match.start),

@codecov
Copy link

codecov bot commented Jun 23, 2021

Codecov Report

Merging #3538 (b0d0165) into master (da3a346) will not change coverage.
The diff coverage is n/a.

❗ Current head b0d0165 differs from pull request most recent head 895dd5e. Consider uploading reports for the commit 895dd5e to get more accurate results
Impacted file tree graph

@@           Coverage Diff           @@
##           master    #3538   +/-   ##
=======================================
  Coverage   98.60%   98.60%           
=======================================
  Files         211      211           
  Lines        7697     7697           
  Branches     1750     1750           
=======================================
  Hits         7590     7590           
  Misses        107      107           
Impacted Files Coverage Δ
src/annotator/anchoring/match-quote.js 100.00% <ø> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update da3a346...895dd5e. Read the comment docs.

@robertknight
Copy link
Member

Hello,

Thanks for the PR. Would you mind adding a unit test for matchQuote that reproduces the original issue (when context.prefix.length exceeds text.length).

@mickael-menu
Copy link
Contributor Author

@robertknight Sure, here's the test run from master:

SUMMARY:
✔ 2833 tests completed
✖ 1 test failed

FAILED TESTS:
  matchQuote
    ✖ matches with a context prefix longer than the text
      Chrome Headless 92.0.4512.0 (Mac OS 10.15.7)
    TypeError: Cannot read property 'errors' of undefined
        at textMatchScore (/var/folders/ks/ny2nfkh5183ffwm10fw_9rq80000gn/T/annotator/anchoring/match-quote.js:63:26 <- /var/folders/ks/ny2nfkh5183ffwm10fw_9rq80000gn/T/b2022ad174868b7c5ac8d9da8c2e4ae8.browserify.js:165247:25)
        at scoreMatch (/var/folders/ks/ny2nfkh5183ffwm10fw_9rq80000gn/T/annotator/anchoring/match-quote.js:118:9 <- /var/folders/ks/ny2nfkh5183ffwm10fw_9rq80000gn/T/b2022ad174868b7c5ac8d9da8c2e4ae8.browserify.js:165319:96)
        at /var/folders/ks/ny2nfkh5183ffwm10fw_9rq80000gn/T/annotator/anchoring/match-quote.js:152:12 <- /var/folders/ks/ny2nfkh5183ffwm10fw_9rq80000gn/T/b2022ad174868b7c5ac8d9da8c2e4ae8.browserify.js:165348:14
        at Array.map (<anonymous>)
        at matchQuote (/var/folders/ks/ny2nfkh5183ffwm10fw_9rq80000gn/T/annotator/anchoring/match-quote.js:149:33 <- /var/folders/ks/ny2nfkh5183ffwm10fw_9rq80000gn/T/b2022ad174868b7c5ac8d9da8c2e4ae8.browserify.js:165342:59)
        at Context.<anonymous> (/var/folders/ks/ny2nfkh5183ffwm10fw_9rq80000gn/T/annotator/anchoring/test/match-quote-test.js:200:19 <- /var/folders/ks/ny2nfkh5183ffwm10fw_9rq80000gn/T/b2022ad174868b7c5ac8d9da8c2e4ae8.browserify.js:171460:46)

TOTAL: 1 FAILED, 2833 SUCCESS

mickael-menu added a commit to readium/r2-navigator-swift that referenced this pull request Jun 23, 2021
…mpty

This could happen in the real application if text near the end of the
document was annotated and an update to the document removes the text
between the end of the annotated text and the end of the document.

I also added a note about the preconditions for when the `search`
function is guaranteed to return at least one match.
Copy link
Member

@robertknight robertknight left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I made a couple of additional changes on this branch:

  • I revised the description of the prefix test to better describe the condition when text.slice(...) was producing an empty output before
  • I added an additional test case that covers a related issue that could happen when generating a score for the suffix match. The existing changes in textMatchScore already fixed the problem.

@robertknight robertknight merged commit 6cae566 into hypothesis:master Jun 24, 2021
@robertknight
Copy link
Member

These changes have been shipped in client version v1.809.0.

@mickael-menu mickael-menu deleted the fix/textMatchScore branch June 24, 2021 12:32
@mickael-menu
Copy link
Contributor Author

Sounds good, thanks for reviewing my PR 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants