TextKNNClassifier

TextKNNClassifier is a k-nearest neighbors classifier for text data. It uses a compression algorithm to compute the distance between texts and predicts the label of a test entry based on the labels of the k-nearest neighbors in the training data.
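
The distance is compression-based, in the spirit of the normalized compression distance (NCD) described in the reference below. The following is a rough sketch of that idea using Python's built-in gzip module; TextKNNClassifier's actual distance computation may differ in its details.

import gzip

def ncd(a: str, b: str) -> float:
    """Illustrative normalized compression distance between two strings."""
    c_a = len(gzip.compress(a.encode("utf-8")))
    c_b = len(gzip.compress(b.encode("utf-8")))
    c_ab = len(gzip.compress((a + " " + b).encode("utf-8")))
    # Texts that compress well together (i.e. share patterns) get a small distance.
    return (c_ab - min(c_a, c_b)) / max(c_a, c_b)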

Installation

You can install TextKNNClassifier using pip:

pip install textknnassifier

Usage

Here's an example of how to use TextKNNClassifier:

from textknnassifier import classifier

training_data = [
    "This is a test",
    "Another test",
    "General Tarkin",
    "General Grievous",
]
training_labels = ["test", "test", "star_wars", "star_wars"]
testing_data = [
    "This is a test",
    "Testing here too!",
    "General Kenobi",
    "General Skywalker",
]

KNN = classifier.TextKNNClassifier(n_neighbors=2)
KNN.fit(training_data, training_labels)
predicted_labels = KNN.predict(testing_data)

print(predicted_labels)
# Output: ['test', 'test', 'star_wars', 'star_wars']

In this example, we create a TextKNNClassifier instance and use it to predict the labels of the test entries. The n_neighbors=2 argument sets how many of the nearest training entries are considered when predicting each test label. The fit method takes two arguments, the training data and the training labels, and simply stores them for later use. The predict method takes the testing data as its argument and returns the predicted labels.
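
Conceptually, predict measures the compression distance from each test entry to every stored training entry, keeps the n_neighbors closest, and takes a majority vote over their labels. Below is a rough sketch of that step for a single test entry, reusing the illustrative ncd helper from the sketch above; it is not the library's actual implementation.

from collections import Counter

def predict_one(test_text, training_data, training_labels, n_neighbors=2):
    # Distance from the test entry to every stored training entry
    # (ncd is the gzip-based helper sketched in the introduction above).
    distances = [ncd(test_text, entry) for entry in training_data]
    # Keep the n_neighbors closest training entries and vote on their labels.
    nearest = sorted(zip(distances, training_labels))[:n_neighbors]
    return Counter(label for _, label in nearest).most_common(1)[0][0]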

References

  • Jiang, Z., Yang, M., Tsirlin, M., Tang, R., Dai, Y., & Lin, J. (2023, July). “Low-Resource” Text Classification: A Parameter-Free Classification Method with Compressors. In Findings of the Association for Computational Linguistics: ACL 2023 (pp. 6810-6828).