NLU 5.4.1 Release

@C-K-Loan C-K-Loan released this 24 Oct 16:12

Few-Shot Assertion Classifier

FewShotAssertionClassifier is an advanced annotator, inspired by the SetFit framework, designed to achieve higher accuracy with fewer data samples. Few-Shot Assertion models consist of a sentence embedding component paired with a classifier (or head). Current support focuses on MPNet-based Few-Shot Assertion models; future updates will extend compatibility to other popular models such as BERT, DistilBERT, and RoBERTa.
The annotator supports various classifier types, including sklearn’s LogisticRegression and custom PyTorch models, providing flexibility for different model setups.
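
The "embedding component plus classifier head" structure described above can be illustrated with a deliberately tiny, standard-library-only sketch. It is not the FewShotAssertionClassifier implementation and does not use Spark NLP, MPNet, or SetFit: the class `ToyFewShotClassifier`, the bag-of-words embedding, the nearest-centroid head, and the example sentences and labels are all hypothetical stand-ins chosen to keep the example runnable anywhere.

```python
import math
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase a sentence and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class ToyFewShotClassifier:
    """Hypothetical stand-in: a crude embedding component plus a centroid head."""

    def fit(self, sentences, labels):
        # Embedding component: a vocabulary built from the few labelled examples.
        # (A real model would use a pretrained sentence encoder instead.)
        self.vocab = {}
        for sentence in sentences:
            for tok in tokenize(sentence):
                self.vocab.setdefault(tok, len(self.vocab))
        # Classifier head: one centroid per label in embedding space.
        grouped = defaultdict(list)
        for sentence, label in zip(sentences, labels):
            grouped[label].append(self.embed(sentence))
        self.centroids = {
            label: [sum(col) / len(vecs) for col in zip(*vecs)]
            for label, vecs in grouped.items()
        }
        return self

    def embed(self, sentence):
        # L2-normalised bag-of-words vector; unknown tokens are ignored.
        vec = [0.0] * len(self.vocab)
        for tok in tokenize(sentence):
            if tok in self.vocab:
                vec[self.vocab[tok]] += 1.0
        norm = math.sqrt(sum(v * v for v in vec))
        return [v / norm for v in vec] if norm else vec

    def predict(self, sentence):
        # Pick the label whose centroid is most similar to the query embedding.
        query = self.embed(sentence)
        return max(self.centroids, key=lambda lb: cosine(query, self.centroids[lb]))


# Four made-up training sentences: two per assertion label.
clf = ToyFewShotClassifier().fit(
    ["patient has a fever", "she reports chest pain",
     "patient denies fever", "no chest pain reported"],
    ["present", "present", "absent", "absent"],
)
print(clf.predict("he denies chest pain"))
```

The point of the split is that the head stays small and cheap to train on a handful of examples, while the embedding component carries most of the representational work; swapping the head (for example for a logistic regression) would not change the embedding side.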

Powered by FewShotAssertionClassifier

Language    nlp.load() reference             Spark NLP Model reference
--------    -----------------------------    ---------------------------
en          en.few_assert_shot_classifier    assertion_fewshotclassifier

Partitioning Spark-DFs

The partitioning of Spark DataFrames (Spark-DFs) can now be configured via pipe.predict(data, partitioning=1000).
In Spark ML pipelines, which are the backbone of NLU, effective partitioning optimizes parallelism, reduces shuffling, and ensures even data distribution, all of which are crucial for high-performance machine learning tasks.
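
To make "even data distribution" concrete, here is a minimal pure-Python sketch of round-robin partitioning. It is a toy model only and says nothing about how NLU or Spark implement the `partitioning` argument; the helper `round_robin_partitions` is hypothetical, and the only call taken from these release notes is `pipe.predict(data, partitioning=1000)`.

```python
def round_robin_partitions(rows, num_partitions):
    """Deal rows into num_partitions buckets, one at a time, like dealing cards."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    parts = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        parts[i % num_partitions].append(row)
    return parts


# Ten toy rows spread over four partitions.
rows = list(range(10))
parts = round_robin_partitions(rows, 4)
sizes = [len(p) for p in parts]
print(sizes)

# With NLU itself, the partition count is passed at prediction time, e.g.:
#   pipe.predict(data, partitioning=1000)
```

Partition sizes differ by at most one row here, so no single worker ends up with a disproportionate share. A sensible partition count in practice depends on data size and cluster resources, so the value 1000 should be read as an example rather than a recommendation.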

Bugfixes

  • Fixed a bug that caused prediction on data to fail in DB endpoint environments