[SPARK-1260]: faster construction of features with intercept
The current implementation uses `Array(1.0, features: _*)` to construct a new array with intercept. This is not efficient for big arrays because `Array.apply` uses a for loop that iterates over the arguments. `Array.+:` is a better choice here.
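As a minimal sketch of the two constructions mentioned above, the following compares the varargs form with the prepend operator. The object and method names (`PrependSketch`, `prependWithApply`, `prependWithOp`) are hypothetical and not part of Spark. The sketch only checks that both forms produce the same array; it does not measure the performance difference described in the commit message.

```scala
object PrependSketch {
  // Old form from the commit: varargs construction through Array.apply.
  def prependWithApply(features: Array[Double]): Array[Double] =
    Array(1.0, features: _*)

  // New form from the commit: the array prepend operator.
  // `features.+:(1.0)` can also be written as `1.0 +: features`.
  def prependWithOp(features: Array[Double]): Array[Double] =
    features.+:(1.0)

  def main(args: Array[String]): Unit = {
    val features = Array(0.5, -2.0, 3.25)
    val viaApply = prependWithApply(features)
    val viaOp = prependWithOp(features)

    println(viaApply.mkString(", ")) // prints "1.0, 0.5, -2.0, 3.25"
    println(viaOp.mkString(", "))    // prints "1.0, 0.5, -2.0, 3.25"

    // Both forms yield a new array with 1.0 in front of the features.
    assert(viaApply.sameElements(viaOp))
  }
}
```

Both methods return a fresh array and leave `features` unchanged, so the swap should not alter behavior for callers.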

Also, I don't see a reason to set initial weights to ones. So I set them to zeros.

JIRA: https://spark-project.atlassian.net/browse/SPARK-1260

Author: Xiangrui Meng <meng@databricks.com>

Closes #161 from mengxr/sgd and squashes the following commits:

b5cfc53 [Xiangrui Meng] set default weights to zeros
a1439c2 [Xiangrui Meng] faster construction of features with intercept
mengxr authored and rxin committed Mar 18, 2014
1 parent 79e547f commit e108b9a
Showing 1 changed file with 4 additions and 4 deletions.
@@ -119,7 +119,7 @@ abstract class GeneralizedLinearAlgorithm[M <: GeneralizedLinearModel]
    */
   def run(input: RDD[LabeledPoint]) : M = {
     val nfeatures: Int = input.first().features.length
-    val initialWeights = Array.fill(nfeatures)(1.0)
+    val initialWeights = new Array[Double](nfeatures)
     run(input, initialWeights)
   }
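The change to the default initial weights can be illustrated with a small sketch. It relies on the JVM guarantee that a freshly allocated `Array[Double]` is zero-filled. The names `InitialWeightsSketch`, `onesInit`, and `zerosInit` are hypothetical and exist only for this illustration.

```scala
object InitialWeightsSketch {
  // Previous default: every weight starts at 1.0.
  def onesInit(nfeatures: Int): Array[Double] =
    Array.fill(nfeatures)(1.0)

  // New default: a freshly allocated Array[Double] holds 0.0 in every slot.
  def zerosInit(nfeatures: Int): Array[Double] =
    new Array[Double](nfeatures)

  def main(args: Array[String]): Unit = {
    println(onesInit(3).mkString(", "))  // prints "1.0, 1.0, 1.0"
    println(zerosInit(3).mkString(", ")) // prints "0.0, 0.0, 0.0"
  }
}
```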

@@ -134,15 +134,15 @@ abstract class GeneralizedLinearAlgorithm[M <: GeneralizedLinearModel]
       throw new SparkException("Input validation failed.")
     }

-    // Add a extra variable consisting of all 1.0's for the intercept.
+    // Prepend an extra variable consisting of all 1.0's for the intercept.
     val data = if (addIntercept) {
-      input.map(labeledPoint => (labeledPoint.label, Array(1.0, labeledPoint.features:_*)))
+      input.map(labeledPoint => (labeledPoint.label, labeledPoint.features.+:(1.0)))
     } else {
       input.map(labeledPoint => (labeledPoint.label, labeledPoint.features))
     }

     val initialWeightsWithIntercept = if (addIntercept) {
-      Array(1.0, initialWeights:_*)
+      initialWeights.+:(1.0)
     } else {
       initialWeights
     }
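To show why a constant 1.0 is prepended to the features, here is a rough sketch of how the first weight then acts as the intercept in a plain dot product. This is not Spark's implementation: `InterceptSketch`, `dot`, and `predict` are hypothetical helpers, and the weight layout (intercept first) is assumed from the diff above.

```scala
object InterceptSketch {
  // Plain dot product of two equal-length arrays.
  def dot(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "length mismatch")
    var sum = 0.0
    var i = 0
    while (i < a.length) {
      sum += a(i) * b(i)
      i += 1
    }
    sum
  }

  // Assumed layout: weightsWithIntercept(0) is the intercept and the
  // remaining entries line up with the original features.
  def predict(weightsWithIntercept: Array[Double], features: Array[Double]): Double =
    dot(weightsWithIntercept, features.+:(1.0))

  def main(args: Array[String]): Unit = {
    val weights = Array(0.5, 2.0, -1.0) // intercept 0.5, then two feature weights
    val features = Array(3.0, 4.0)
    // 0.5 * 1.0 + 2.0 * 3.0 + (-1.0) * 4.0 = 2.5
    println(predict(weights, features)) // prints "2.5"
  }
}
```

Because the prepended feature is always 1.0, the first weight contributes a constant offset to every prediction, which is what makes it an intercept.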
