NLU 5.4.1 Release · JohnSnowLabs/nlu

Few-Shot Assertion Classifier

The FewShotAssertionClassifier model is an annotator inspired by the SetFit framework and designed to reach higher accuracy with fewer training samples. A Few-Shot Assertion model consists of a sentence embedding component paired with a classifier (or head). Current support focuses on MPNet-based Few-Shot Assertion models; future updates will extend compatibility to other popular models such as BERT, DistilBERT, and RoBERTa.
The classifier supports several head types, including sklearn’s LogisticRegression and custom PyTorch models, which gives flexibility across different model setups.
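
To make the "embedding plus head" structure concrete, here is a toy sketch in plain Python. It is not the library's implementation: a bag-of-words vector stands in for the MPNet sentence embedding, and a small hand-written logistic regression stands in for sklearn's LogisticRegression head. All function names, example sentences, and labels are made up for illustration.

```python
import math

def build_vocab(texts):
    """Assign an index to every distinct lowercase token in the training texts."""
    vocab = {}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def embed(text, vocab):
    """Toy 'sentence embedding': bag-of-words counts (stand-in for MPNet)."""
    vec = [0.0] * len(vocab)
    for token in text.lower().split():
        idx = vocab.get(token)
        if idx is not None:  # tokens unseen during training are ignored
            vec[idx] += 1.0
    return vec

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_head(vectors, labels, epochs=200, lr=0.5):
    """Toy classifier head: logistic regression fitted with per-sample gradient steps."""
    weights = [0.0] * len(vectors[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = y - p
            weights = [w + lr * err * xi for w, xi in zip(weights, x)]
            bias += lr * err
    return weights, bias

def predict(text, vocab, weights, bias):
    """Embed the text, apply the head, and map the probability to a label."""
    x = embed(text, vocab)
    p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    return "present" if p >= 0.5 else "absent"

# A handful of labeled examples (1 = present, 0 = absent), invented for this sketch.
train_texts = [
    "patient has fever",
    "reports chest pain",
    "no evidence of fever",
    "patient denies chest pain",
]
train_labels = [1, 1, 0, 0]

vocab = build_vocab(train_texts)
weights, bias = train_head([embed(t, vocab) for t in train_texts], train_labels)
```

Calling predict(text, vocab, weights, bias) on a new sentence then returns "present" or "absent". In a real few-shot setup the embedding would come from a pretrained sentence encoder, which is what lets a few labeled samples go a long way.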

Language	nlp.load() reference	Spark NLP Model reference
en	en.few_assert_shot_classifier	assertion_fewshotclassifier
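
A minimal sketch of loading the model through the reference in the table above. It assumes the johnsnowlabs package is installed and that the session is licensed for this model; the helper name predict_assertions and its loader parameter are hypothetical, added only so the load call can be swapped out in tests.

```python
MODEL_REF = "en.few_assert_shot_classifier"  # nlp.load() reference from the table

def predict_assertions(texts, loader=None):
    """Load the few-shot assertion pipeline and run it on the given texts.

    loader is a hypothetical injection point; by default it is nlp.load from
    the johnsnowlabs package, imported lazily so the module itself imports
    without Spark being available.
    """
    if loader is None:
        # Assumption: `pip install johnsnowlabs` and a valid license/session.
        from johnsnowlabs import nlp
        loader = nlp.load
    pipe = loader(MODEL_REF)
    return pipe.predict(texts)
```

For example, predict_assertions(["The patient denies chest pain."]) would return the pipeline's prediction for that sentence; the exact output columns depend on the loaded pipeline.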

Partitioning Spark-DFs

Added support for configuring the partitioning of Spark DataFrames via pipe.predict(data, partitioning=1000).
In Spark ML pipelines, which are the backbone of NLU, effective partitioning improves parallelism, reduces shuffling, and ensures even data distribution, all of which are crucial for high-performance machine learning tasks.
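
A short sketch of how the partitioning argument might be chosen and passed. It assumes the argument is interpreted as the target number of Spark partitions; the helper names and the rows-per-partition heuristic are hypothetical and not part of NLU. Only the pipe.predict(data, partitioning=...) call comes from the release notes.

```python
import math

def suggest_partitions(n_rows, rows_per_partition=1000, min_partitions=1):
    """Hypothetical heuristic: aim for about rows_per_partition rows per partition."""
    if n_rows < 0 or rows_per_partition <= 0:
        raise ValueError("n_rows must be >= 0 and rows_per_partition must be > 0")
    return max(min_partitions, math.ceil(n_rows / rows_per_partition))

def predict_with_partitioning(pipe, data, n_rows):
    """Run pipe.predict with a partition count derived from the data size."""
    partitions = suggest_partitions(n_rows)
    # The partitioning keyword is the one introduced in this release.
    return pipe.predict(data, partitioning=partitions)

# Usage with a real pipeline (requires nlu and pyspark to be installed):
#   import nlu
#   pipe = nlu.load("en.few_assert_shot_classifier")
#   result = predict_with_partitioning(pipe, texts, n_rows=len(texts))
```

A good partition count depends on cluster size and row length, so the default of 1000 rows per partition is only a starting point to tune.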

Bugfixes

Fixed a bug that caused DB endpoint environments to fail when predicting on data
