Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

expose improvementTol inside FirstOrderMinimizer #373

Merged
merged 1 commit into from
Mar 2, 2015

Conversation

coderxiang
Copy link
Contributor

When I run the LBFGS implementation of Spark on a moderate size proprietary data set (~ 5M samples), I see the resultant vector diverges from some standard solver (e.g., liblinear, sci-learn) on around half of the coefficients. The computation usually stops around 20 iterations although I set the iteration number to be 100 with a very strict tolerance variable (1e-15). The formulation is strongly convex so the solution should be unique. Configing the improvementTol variable inside FirstOrderMinimizer.scala can solve this issue.

Two ways of doing this:

  1. Exposing the improvementTol variable to LBFGS, as is done in the PR
  2. Change the default value of improvementTol in FirstOrderMinimizer.scala

I chose the first one here.

@dlwh
Copy link
Member

dlwh commented Mar 2, 2015

thanks. Yeah, I need to figure out the convergence checks.

dlwh added a commit that referenced this pull request Mar 2, 2015
expose improvementTol inside FirstOrderMinimizer
@dlwh dlwh merged commit 7d42e37 into scalanlp:master Mar 2, 2015
@coderxiang coderxiang deleted the lbfgs-tol branch March 2, 2015 22:11
@dbtsai
Copy link
Contributor

dbtsai commented Mar 3, 2015

@dlwh, are you going to publish this so I can upgrade Spark 1.3 dependency before it's released? thanks.

@dlwh
Copy link
Member

dlwh commented Mar 3, 2015

sure. what's the timeline?

On Mon, Mar 2, 2015 at 6:17 PM, DB Tsai notifications@github.com wrote:

@dlwh https://github.com/dlwh, are you going to publish this so I can
upgrade Spark 1.3 dependency before it's released? thanks.


Reply to this email directly or view it on GitHub
#373 (comment).

@dbtsai
Copy link
Contributor

dbtsai commented Mar 3, 2015

Is OWLQN affected by this issue? Spark 1.3 is RC2, so if we want to change, we need to be fast. Thanks.

@dlwh
Copy link
Member

dlwh commented Mar 3, 2015

ok, Debasish has been working on his new optimization algorithm. I imagine
that that's post Spark 1.3 in any case?

On Mon, Mar 2, 2015 at 6:49 PM, DB Tsai notifications@github.com wrote:

Is OWLQN affected by this issue? Spark 1.3 is RC2, so if we want to
change, we need to be fast. Thanks.


Reply to this email directly or view it on GitHub
#373 (comment).

@dlwh
Copy link
Member

dlwh commented Mar 3, 2015

owlqn's convergence conditions are also fixed by this.

On Mon, Mar 2, 2015 at 7:17 PM, David Hall david.lw.hall@gmail.com wrote:

ok, Debasish has been working on his new optimization algorithm. I imagine
that that's post Spark 1.3 in any case?

On Mon, Mar 2, 2015 at 6:49 PM, DB Tsai notifications@github.com wrote:

Is OWLQN affected by this issue? Spark 1.3 is RC2, so if we want to
change, we need to be fast. Thanks.


Reply to this email directly or view it on GitHub
#373 (comment).

@dbtsai
Copy link
Contributor

dbtsai commented Mar 3, 2015

Sounds great. Then let's publish the breeze with this fix, and I'll open another PR to Spark.

@dlwh
Copy link
Member

dlwh commented Mar 3, 2015

0.11 is released. Will take a bit to propagate to maven central.

asfgit pushed a commit to apache/spark that referenced this pull request Mar 4, 2015
…ce bug

LBFGS and OWLQN in Breeze 0.10 has convergence check bug.
This is fixed in 0.11, see the description in Breeze project for detail:

scalanlp/breeze#373 (comment)

Author: Xiangrui Meng <meng@databricks.com>
Author: DB Tsai <dbtsai@alpinenow.com>
Author: DB Tsai <dbtsai@dbtsai.com>

Closes #4879 from dbtsai/breeze and squashes the following commits:

d848f65 [DB Tsai] Merge pull request #1 from mengxr/AlpineNow-breeze
c2ca6ac [Xiangrui Meng] upgrade to breeze-0.11.1
35c2f26 [Xiangrui Meng] fix LRSuite
397a208 [DB Tsai] upgrade breeze

(cherry picked from commit 76e20a0)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
asfgit pushed a commit to apache/spark that referenced this pull request Mar 4, 2015
…ce bug

LBFGS and OWLQN in Breeze 0.10 has convergence check bug.
This is fixed in 0.11, see the description in Breeze project for detail:

scalanlp/breeze#373 (comment)

Author: Xiangrui Meng <meng@databricks.com>
Author: DB Tsai <dbtsai@alpinenow.com>
Author: DB Tsai <dbtsai@dbtsai.com>

Closes #4879 from dbtsai/breeze and squashes the following commits:

d848f65 [DB Tsai] Merge pull request #1 from mengxr/AlpineNow-breeze
c2ca6ac [Xiangrui Meng] upgrade to breeze-0.11.1
35c2f26 [Xiangrui Meng] fix LRSuite
397a208 [DB Tsai] upgrade breeze
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants