Grape

Grape is a collection of document clustering algorithms written in Scala. It uses Apache OpenNLP to extract features from each document and build the vector space that the different clustering approaches operate on. Grape currently contains the following algorithms:

K-Means Clustering
Hierarchical Agglomerative Clustering
Buckshot Clustering
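
To make the idea of a shared vector space concrete, here is a minimal sketch of turning documents into term-frequency vectors and comparing them with cosine similarity. It is not Grape's implementation: Grape relies on Apache OpenNLP for feature extraction, whereas this sketch assumes a naive lowercase tokenizer, and the names `VectorSpaceSketch`, `tokenize`, `vectorize` and `cosine` are hypothetical.

```scala
object VectorSpaceSketch {
  // Hypothetical tokenizer: lowercase the text and split on non-letter characters.
  // Grape itself uses Apache OpenNLP for feature extraction instead.
  def tokenize(text: String): List[String] =
    text.toLowerCase.split("[^a-z]+").filter(_.nonEmpty).toList

  // Build one term-frequency vector per document over a shared, sorted vocabulary.
  def vectorize(docs: List[(String, String)]): (Vector[String], Map[String, Vector[Double]]) = {
    val tokenized = docs.map { case (id, content) => id -> tokenize(content) }
    val vocabulary = tokenized.flatMap(_._2).distinct.sorted.toVector
    val vectors = tokenized.map { case (id, tokens) =>
      val counts = tokens.groupBy(identity).map { case (term, hits) => term -> hits.size.toDouble }
      id -> vocabulary.map(term => counts.getOrElse(term, 0.0))
    }.toMap
    (vocabulary, vectors)
  }

  // Cosine similarity between two vectors of equal length; 0.0 if either is all zeros.
  def cosine(a: Vector[Double], b: Vector[Double]): Double = {
    val dot = a.zip(b).map { case (x, y) => x * y }.sum
    val norm = math.sqrt(a.map(x => x * x).sum) * math.sqrt(b.map(x => x * x).sum)
    if (norm == 0.0) 0.0 else dot / norm
  }
}
```

Every document ends up as a vector of the same length, so any of the clustering algorithms listed above can compare documents without knowing how the features were extracted.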

How to use

An example of how to use k-means clustering on your documents:

import com.jayway.textmining.{NLPFeatureSelection, Cluster, KMeanCluster}

// number of clusters
val k = ...

// A document is a pair of (Document ID, Document Content). ID can be anything.
val docs: List[(String, String)] = ...

val kMeanCluster = new KMeanCluster(docs, k) with NLPFeatureSelection
val clusters: List[Cluster] = kMeanCluster.doCluster()
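
For readers unfamiliar with the algorithm, the following is a minimal sketch of k-means (Lloyd's algorithm) over plain numeric vectors. It only illustrates the general technique and should not be read as what `KMeanCluster.doCluster()` does internally. The object `KMeansSketch`, its method names, the use of squared Euclidean distance and the choice of the first `k` points as seeds are all assumptions made for this example.

```scala
object KMeansSketch {
  type Point = Vector[Double]

  // Squared Euclidean distance (an assumed metric for this sketch).
  def squaredDistance(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Component-wise mean of a non-empty collection of points.
  def mean(points: Seq[Point]): Point =
    points.transpose.map(column => column.sum / points.size).toVector

  // Index of the centroid closest to the given point.
  def nearest(point: Point, centroids: Seq[Point]): Int =
    centroids.indices.minBy(i => squaredDistance(point, centroids(i)))

  // Returns the points grouped by the index of their final centroid.
  def cluster(points: Seq[Point], k: Int, maxIterations: Int = 100): Map[Int, Seq[Point]] = {
    require(k > 0 && k <= points.size, "k must be between 1 and the number of points")
    // Assumption: seed with the first k points; implementations often pick random seeds.
    var centroids: Seq[Point] = points.take(k)
    var assignment = points.groupBy(p => nearest(p, centroids))
    var changed = true
    var iteration = 0
    while (changed && iteration < maxIterations) {
      // Move each centroid to the mean of its members; keep it in place if it has none.
      val updated = centroids.indices.map(i => assignment.get(i).map(mean).getOrElse(centroids(i)))
      changed = updated != centroids
      centroids = updated
      assignment = points.groupBy(p => nearest(p, centroids))
      iteration += 1
    }
    assignment
  }
}
```

A small usage example with two well-separated groups of points:

```scala
val points = Seq(Vector(0.0, 0.0), Vector(10.0, 10.0), Vector(0.0, 1.0), Vector(10.0, 11.0))
val result = KMeansSketch.cluster(points, k = 2)
```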

License

Distributed under the Apache Software License.
