Refactor DLA #432

Merged
merged 1 commit into from
Apr 14, 2022

BobLd
Collaborator

@BobLd BobLd commented Mar 12, 2022

Make DlaOptions an interface, add IWordExtractorOptions, remove the GetBlocks(words, options) and GetWords(letters, options) overloads, and pass the options through constructors instead. Fixes #424. Also tidies up the code.

Open for discussion, feedback and ideas. This introduces breaking changes.

@EliotJones
Member

Looks good to me. Sorry for the long delay, but would you be able to add a quick text summary of any compatibility changes for the next version's release notes? As I understand it, it's something like:

someClass.Instance.GetWords(input, options);

Becomes:

new someClass(options).GetWords(input);

@EliotJones EliotJones merged commit 1068029 into UglyToad:master Apr 14, 2022
@BobLd
Collaborator Author

BobLd commented Apr 23, 2022

@EliotJones Please find below what I would add in the notes. Also, I will update the wiki once the new version is there.


There are breaking changes in the latest version's document layout analysis tools: options are now passed in the constructor instead of being passed to GetBlocks or GetWords.

IPageSegmenter

var blocks = DocstrumBoundingBoxes.Instance.GetBlocks(words, options);

or

var pageSegmenter = new DocstrumBoundingBoxes();
var blocks = pageSegmenter.GetBlocks(words, options);

becomes

var pageSegmenter = new DocstrumBoundingBoxes(options);
var blocks = pageSegmenter.GetBlocks(words);

The same applies to RecursiveXYCut.
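
As a rough sketch of why this matters, code written against the IPageSegmenter interface can now receive a fully configured segmenter without knowing its option type. `SegmenterComparison`, `CountBlocks` and `Compare` are hypothetical names used only for illustration, and the nested option class names (`DocstrumBoundingBoxes.DocstrumBoundingBoxesOptions`, `RecursiveXYCut.RecursiveXYCutOptions`) are assumptions that should be checked against the released version.

```csharp
using System;
using System.Collections.Generic;
using UglyToad.PdfPig.Content;
using UglyToad.PdfPig.DocumentLayoutAnalysis.PageSegmenter;

public static class SegmenterComparison
{
    // Hypothetical helper: it depends only on the interface, so the caller
    // decides which segmenter (and which options) to use.
    public static int CountBlocks(IPageSegmenter segmenter, IReadOnlyList<Word> words)
    {
        return segmenter.GetBlocks(words).Count;
    }

    public static void Compare(IReadOnlyList<Word> words)
    {
        // Assumption: the option classes are nested in their segmenter classes
        // and have parameterless constructors that apply default values.
        IPageSegmenter docstrum = new DocstrumBoundingBoxes(
            new DocstrumBoundingBoxes.DocstrumBoundingBoxesOptions());
        IPageSegmenter xyCut = new RecursiveXYCut(
            new RecursiveXYCut.RecursiveXYCutOptions());

        Console.WriteLine($"Docstrum blocks: {CountBlocks(docstrum, words)}");
        Console.WriteLine($"Recursive XY cut blocks: {CountBlocks(xyCut, words)}");
    }
}
```

Because the options travel with the instance, a consumer that only calls GetBlocks(words) still benefits from whatever configuration the caller chose.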

IWordExtractor

var words = new NearestNeighbourWordExtractor().GetWords(letters, options);

becomes

var words = new NearestNeighbourWordExtractor(options).GetWords(letters);
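
Putting both changes together, a minimal end-to-end sketch might look like the following. It assumes the option classes are nested types named `NearestNeighbourWordExtractor.NearestNeighbourWordExtractorOptions` and `DocstrumBoundingBoxes.DocstrumBoundingBoxesOptions` with usable defaults, and that the PDF path is supplied as the first command-line argument; treat it as an illustration of the new call pattern rather than the definitive usage.

```csharp
using System;
using UglyToad.PdfPig;
using UglyToad.PdfPig.DocumentLayoutAnalysis.PageSegmenter;
using UglyToad.PdfPig.DocumentLayoutAnalysis.WordExtractor;

public static class Program
{
    public static void Main(string[] args)
    {
        // Options are supplied once, at construction time.
        // Assumption: default-constructed options reproduce the default behaviour.
        var wordExtractor = new NearestNeighbourWordExtractor(
            new NearestNeighbourWordExtractor.NearestNeighbourWordExtractorOptions());
        var pageSegmenter = new DocstrumBoundingBoxes(
            new DocstrumBoundingBoxes.DocstrumBoundingBoxesOptions());

        using (var document = PdfDocument.Open(args[0]))
        {
            foreach (var page in document.GetPages())
            {
                // GetWords and GetBlocks no longer take an options argument.
                var words = wordExtractor.GetWords(page.Letters);
                var blocks = pageSegmenter.GetBlocks(words);

                Console.WriteLine($"Page {page.Number}: {blocks.Count} blocks");
            }
        }
    }
}
```

Creating the extractor and segmenter once, outside the page loop, keeps the configuration in one place and avoids rebuilding the same options for every page.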

Development

Successfully merging this pull request may close these issues.

Support for PageSegmenterOptions with PageXmlTextExporter