Commit

added description/usage for k-means
omarx5 committed Sep 26, 2011
1 parent 12b6f0e commit 0adda5a
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions README
@@ -1,4 +1,4 @@
maplearn is a library of classifier/regression tools to be used with the Hadoop Streaming utility in Apache Hadoop.
"maplearn" is a library of classifier/regression tools to be used with the Hadoop Streaming utility in Apache Hadoop.
Note: All scripts take space-delimited files as input.

linear
@@ -10,5 +10,5 @@ Description: Naive bayes classifier assumes a non-singular, linearly-independent
Usage: Reducer returns a-priori (such as P(Y = yk)) and conditional (such as P(Xi = a | Y = yk)) probabilities. Predictions can be made by applying Bayes' law together with the naive independence assumption: P(Y = yk | X) is proportional to P(Y = yk) * P(X1 = x1 | Y = yk) * ... * P(Xn = xn | Y = yk). Maximizing this posterior over the classes yk gives the most likely outcome.
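
The prediction step described above can be sketched as follows. This is only an illustration in Python, not part of maplearn: the function name `predict` and the layout of `priors` and `conditionals` are assumptions, and the reducer's actual output format may differ. Log-probabilities are summed rather than multiplying raw probabilities, to avoid underflow when there are many features.

```python
import math

def predict(priors, conditionals, x):
    """Return the class with the largest unnormalized log-posterior.

    priors:       {label: P(Y = label)}                     (assumed layout)
    conditionals: {label: [ {value: P(Xi = value | Y = label)}, ... ]}
    x:            sequence of observed feature values, one per feature
    """
    best_label, best_score = None, -math.inf
    for label, prior in priors.items():
        score = math.log(prior)
        for i, value in enumerate(x):
            p = conditionals[label][i].get(value, 0.0)
            if p == 0.0:
                # An unseen value gives this class zero posterior mass.
                score = -math.inf
                break
            score += math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    # None is returned when every class has zero posterior mass.
    return best_label

# Hypothetical probabilities, for illustration only.
priors = {"spam": 0.4, "ham": 0.6}
conditionals = {
    "spam": [{"a": 0.8, "b": 0.2}, {"a": 0.3, "b": 0.7}],
    "ham":  [{"a": 0.1, "b": 0.9}, {"a": 0.5, "b": 0.5}],
}
label = predict(priors, conditionals, ("a", "b"))
```

In practice a smoothing scheme (for example, Laplace smoothing) is often applied to the conditional probabilities so that a single unseen value does not rule out a class; whether maplearn does so is not stated here.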

kmeans
Description:
Usage:
Description: The k-means clustering algorithm iteratively minimizes the within-cluster sum of squared distances to K cluster centroids. Parallelization is achieved by clustering each subgroup of the data independently and combining the partial sums into weighted averages.
Usage: Reducer returns the centroid features for each of the K clusters. K and the initial centers must be specified in mapper.R.
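
One iteration of the partial-sum scheme described above might look like the sketch below. It is written in Python for illustration and is not the code in mapper.R; the function names and the (sums, count) record layout are assumptions. Each mapper assigns the points in its own split to the nearest current centroid and emits a per-cluster sum and count; the reducer adds these up and divides, which equals the count-weighted average of the per-split means.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
    )

def map_partial_sums(points, centroids):
    """Mapper side: per-cluster (feature sums, point count) for one split."""
    partial = {}
    for point in points:
        k = nearest(point, centroids)
        sums, count = partial.get(k, ([0.0] * len(point), 0))
        partial[k] = ([s + p for s, p in zip(sums, point)], count + 1)
    return partial

def reduce_centroids(partials, centroids):
    """Reducer side: merge partial sums from all splits into new centroids."""
    totals = {}
    for partial in partials:
        for k, (sums, count) in partial.items():
            acc, n = totals.get(k, ([0.0] * len(sums), 0))
            totals[k] = ([a + s for a, s in zip(acc, sums)], n + count)
    # A cluster that received no points keeps its previous centroid.
    return [
        [s / totals[k][1] for s in totals[k][0]] if k in totals else list(c)
        for k, c in enumerate(centroids)
    ]

# Hypothetical data: two input splits and K = 2 starting centers.
centroids = [[0.0, 0.0], [10.0, 10.0]]
split_1 = [[1.0, 1.0], [9.0, 9.0]]
split_2 = [[3.0, 1.0], [11.0, 11.0], [10.0, 13.0]]
partials = [map_partial_sums(s, centroids) for s in (split_1, split_2)]
new_centroids = reduce_centroids(partials, centroids)
```

Repeating the map and reduce steps with `new_centroids` as the starting centers, until the centroids stop moving, gives the usual iterative algorithm.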
