v0.0.6

Halvani released this 24 Mar 00:29

What's new?

  • The structure of the constituent tree can now be modified. By default, both inner POS-tag nodes and token leaves are present (Structure.Complete). Alternatively, either the POS-tag nodes or the token leaves can be removed. If the token leaves are removed, the extracted phrases consist of POS-tag sequences.
  • Multiple spaces at the end of a sentence are now removed before parsing, since they caused a benepar exception.
  • Create_pipeline() now downloads the benepar model to the path "share\nltk_data\models", so that no data is left behind in the CTL directory when CTL is uninstalled.
  • Create_pipeline() now accepts a 'quite' parameter that suppresses the pip installation output.
  • Integrated optional expansion of contractions (e.g., I'm --> I am) within sentences. Note that this is only supported for English.
  • Added comprehensive error handling (e.g., detecting a language mismatch between the given sentence and the benepar and spaCy models), together with custom exceptions that simplify debugging.
  • Extensive code refactoring (e.g., less code repetition, conversion of all string literals from ' to ", etc.).
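
To illustrate the three tree structures described in the first item, here is a minimal, self-contained sketch that works on a bracketed parse string. It is not CTL's implementation: the `TreeStructure` enum, its member names and the `restructure()` helper are hypothetical and only mirror the behaviour stated above.

```python
import re
from enum import Enum


class TreeStructure(Enum):
    # Hypothetical names; the release notes only mention Structure.Complete.
    COMPLETE = "complete"
    WITHOUT_POSTAG_NODES = "without_postag_nodes"
    WITHOUT_TOKEN_LEAVES = "without_token_leaves"


# A preterminal is a "(POSTAG token)" pair with no nested brackets inside.
PRETERMINAL = re.compile(r"\(([^\s()]+) ([^\s()]+)\)")


def restructure(bracketed_tree: str, structure: TreeStructure) -> str:
    """Return the bracketed tree with POS-tag nodes or token leaves removed."""
    if structure is TreeStructure.WITHOUT_POSTAG_NODES:
        # Keep only the token: "(PRP I)" becomes "I".
        return PRETERMINAL.sub(r"\2", bracketed_tree)
    if structure is TreeStructure.WITHOUT_TOKEN_LEAVES:
        # Keep only the POS tag: "(PRP I)" becomes "PRP".
        return PRETERMINAL.sub(r"\1", bracketed_tree)
    return bracketed_tree


tree = "(S (NP (PRP I)) (VP (VBP am) (ADJP (JJ happy))))"
print(restructure(tree, TreeStructure.WITHOUT_POSTAG_NODES))
print(restructure(tree, TreeStructure.WITHOUT_TOKEN_LEAVES))
```

With token leaves removed, the phrase under `VP` reads `VBP (ADJP JJ)`, i.e. a POS-tag sequence rather than the words themselves.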
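
The whitespace fix and the optional contraction expansion listed above can be pictured as a small preprocessing step that runs before a sentence reaches the parser. The following is only a rough sketch under stated assumptions: `preprocess()`, its `expand` flag and the tiny `CONTRACTIONS` table are hypothetical, and CTL may implement this differently.

```python
import re

# Hypothetical, deliberately tiny table (English only); a real one would be much larger.
CONTRACTIONS = {
    "i'm": "I am",
    "don't": "do not",
    "can't": "cannot",
}

CONTRACTION_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(c) for c in CONTRACTIONS) + r")\b",
    flags=re.IGNORECASE,
)


def normalize_whitespace(sentence: str) -> str:
    """Collapse runs of whitespace and strip leading and trailing spaces."""
    return re.sub(r"\s+", " ", sentence).strip()


def expand_contractions(sentence: str) -> str:
    """Replace known contractions; the original capitalisation is not preserved."""
    return CONTRACTION_PATTERN.sub(
        lambda match: CONTRACTIONS[match.group(0).lower()], sentence
    )


def preprocess(sentence: str, expand: bool = False) -> str:
    """Clean a sentence before parsing; contraction expansion is opt-in."""
    sentence = normalize_whitespace(sentence)
    return expand_contractions(sentence) if expand else sentence


print(preprocess("I'm sure  it works.   ", expand=True))
```

Keeping the expansion behind a flag matches the notes above, which describe it as optional and supported for English only.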