Few-Shot Assertion Classifier
FewShotAssertionClassifier is an annotator inspired by the SetFit framework and designed to reach higher accuracy with fewer training samples. Few-Shot Assertion models consist of a sentence embedding component paired with a classifier (or head). Current support is limited to MPNet-based Few-Shot Assertion models; future updates will extend compatibility to other popular models such as BERT, DistilBERT, and RoBERTa.
This classifier model supports various classifier types, including sklearn’s LogisticRegression and custom PyTorch models, providing flexibility for different model setups.
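The two-part design described above (a sentence embedding component feeding a classifier head) can be illustrated with a toy sketch. This is not the annotator's implementation: the hashed bag-of-words embed function stands in for the MPNet encoder, and the nearest-centroid CentroidHead stands in for a head such as LogisticRegression. All names, labels, and example sentences below are hypothetical.

```python
import math
import zlib
from collections import defaultdict

DIM = 64  # toy embedding size; real sentence encoders use far larger vectors


def embed(sentence):
    """Toy stand-in for a sentence encoder: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for token in sentence.lower().split():
        # crc32 is deterministic across runs, unlike Python's built-in hash()
        vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class CentroidHead:
    """Toy stand-in for the classifier head: nearest class centroid by dot product."""

    def fit(self, vectors, labels):
        sums = defaultdict(lambda: [0.0] * DIM)
        counts = defaultdict(int)
        for vec, label in zip(vectors, labels):
            counts[label] += 1
            for i, v in enumerate(vec):
                sums[label][i] += v
        # Average the summed vectors to get one centroid per label.
        self.centroids = {
            label: [v / counts[label] for v in total]
            for label, total in sums.items()
        }
        return self

    def predict(self, vector):
        def score(label):
            return sum(a * b for a, b in zip(self.centroids[label], vector))
        return max(self.centroids, key=score)


# A handful of labeled examples per class, in the few-shot spirit.
train = [
    ("patient denies chest pain", "absent"),
    ("no evidence of fever", "absent"),
    ("patient reports severe headache", "present"),
    ("she has persistent cough", "present"),
]
head = CentroidHead().fit([embed(s) for s, _ in train], [l for _, l in train])
prediction = head.predict(embed("patient denies fever"))
```

Keeping the embedding step and the head separate is what lets the head be swapped (for example, a logistic regression instead of centroids) without touching the encoder.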
Powered by FewShotAssertionClassifier
Language | nlp.load() reference | Spark NLP Model reference |
---|---|---|
en | en.few_assert_shot_classifier | assertion_fewshotclassifier |
Partitioning Spark-DFs
Support for configuring partitioning of Spark-DFs via pipe.predict(data, partitioning=1000).
In Spark ML pipelines, which are the backbone of NLU, effective partitioning optimizes parallelism, reduces shuffling, and ensures even data distribution, all of which are crucial for high-performance machine learning tasks.
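A minimal sketch of passing the partitioning setting through predict. The pipe.predict(data, partitioning=1000) call comes from the note above; the predict_partitioned helper, its positive-integer check, and the StubPipe class are hypothetical and exist only so the snippet runs without a Spark session. With a real pipeline, the object returned by nlp.load(...) would take the place of the stub.

```python
def predict_partitioned(pipe, data, partitioning=1000):
    """Forward data to pipe.predict with an explicit partitioning setting.

    Hypothetical helper: it only validates the value and forwards it.
    Treating the value as a positive integer is an assumption.
    """
    is_int = isinstance(partitioning, int) and not isinstance(partitioning, bool)
    if not is_int or partitioning < 1:
        raise ValueError("partitioning must be a positive integer")
    return pipe.predict(data, partitioning=partitioning)


class StubPipe:
    """Stand-in for a loaded pipeline; records the arguments it receives."""

    def predict(self, data, partitioning=None):
        self.last_partitioning = partitioning
        return list(data)


stub = StubPipe()
result = predict_partitioned(stub, ["Patient denies fever."], partitioning=8)
```

A suitable value depends on data volume and cluster size, so it is usually worth trying a few settings rather than relying on a single default.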
Bugfixes
- Fixed a bug that caused DB endpoint environments to fail when predicting on data