Wrapper::wrap: return Vec<Cow<str>> instead of Vec<String>
This completely reworks the wrapping algorithm. Previously, it copied the input string word by word into the output vector and therefore always allocated new strings. It now reuses the input string whenever possible and only allocates new strings when the word splitter forces it to, i.e., when the word splitter adds extra hyphens that did not appear in the input string. The NoHyphenation and HyphenSplitter (the default) word splitters do not add extra hyphens and thus never allocate. With the Corpus word splitter, only lines with hyphenated words allocate extra memory. The new algorithm is 15-25% faster than the previous one.
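The borrow-or-allocate idea described above can be illustrated with a minimal sketch. This is not the code from this commit: `wrap_greedy` and `finish_line` are hypothetical names, the wrapping is a simplified greedy pass over single ASCII spaces, and width is measured in bytes. It only shows how `Cow<str>` lets a line borrow from the input unless an extra hyphen has to be appended.

```rust
use std::borrow::Cow;

/// Hypothetical helper: turn a slice of the input into an output line.
/// Only allocates when an extra hyphen (not present in the input) is needed.
fn finish_line(slice: &str, add_hyphen: bool) -> Cow<'_, str> {
    if add_hyphen {
        let mut owned = String::with_capacity(slice.len() + 1);
        owned.push_str(slice);
        owned.push('-');
        Cow::Owned(owned)
    } else {
        Cow::Borrowed(slice)
    }
}

/// Hypothetical greedy wrapper: splits on ASCII spaces and returns lines
/// that are sub-slices of `text`, so no line is copied.
/// Assumes ASCII input, so byte offsets equal display columns.
fn wrap_greedy(text: &str, width: usize) -> Vec<Cow<'_, str>> {
    let mut lines = Vec::new();
    let mut line_start = 0; // byte offset where the current line begins
    let mut line_end = 0; // byte offset just past the last word on the line
    let mut have_word = false;
    let mut pos = 0; // byte offset of the next word in `text`

    for word in text.split(' ') {
        let word_start = pos;
        let word_end = pos + word.len();
        pos = word_end + 1; // skip the separating space
        if word.is_empty() {
            continue; // consecutive spaces produce empty pieces
        }
        if !have_word {
            line_start = word_start;
            line_end = word_end;
            have_word = true;
        } else if word_end - line_start <= width {
            line_end = word_end; // the word still fits on the current line
        } else {
            lines.push(finish_line(&text[line_start..line_end], false));
            line_start = word_start;
            line_end = word_end;
        }
    }
    if have_word {
        lines.push(finish_line(&text[line_start..line_end], false));
    }
    lines
}
```

Under these assumptions, every line returned by `wrap_greedy` is a `Cow::Borrowed` slice of the input, while `finish_line(part, true)` models the one case where a word splitter inserts a hyphen and an owned `String` becomes unavoidable.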
Showing 2 changed files with 113 additions and 152 deletions.