Lasso.jl for big-ish data #9

Open
CorySimon opened this issue Mar 3, 2017 · 2 comments

CorySimon commented Mar 3, 2017

My entire design matrix cannot fit in memory.
Much like SGDRegressor.partial_fit() in scikit-learn, can I use Lasso.jl to fit in epochs, feeding it one batch of data at a time? I realize that this will likely not converge to the same parameters as a fit on the full data set held in memory.

Maybe one way to train in batches would be to modify criterion in fit() to stop after a certain number of iterations?
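For comparison, here is a minimal sketch of a partial_fit-style loop in plain Julia. It uses proximal stochastic gradient steps, which is a different algorithm from the one Lasso.jl runs inside fit(), and `partial_fit!`, `soft_threshold`, and the toy data are hypothetical names made up for illustration, not part of Lasso.jl. The batch slices stand in for data that would be read from disk one piece at a time.

```julia
using LinearAlgebra, Random

# Soft-thresholding: the proximal operator of t * |v|
soft_threshold(v, t) = sign(v) * max(abs(v) - t, zero(v))

# Hypothetical helper (not part of Lasso.jl): one proximal gradient step on
# the batch objective 1/(2m) * ||Xb*β - yb||^2 + λ * ||β||_1.
function partial_fit!(β, Xb, yb; λ=0.01, step=0.1)
    m = size(Xb, 1)
    grad = Xb' * (Xb * β .- yb) ./ m                   # gradient of the least-squares term
    β .= soft_threshold.(β .- step .* grad, step * λ)  # gradient step, then L1 shrinkage
    return β
end

# Toy stand-in for streamed data: in practice each batch would be loaded
# on demand instead of sliced from an in-memory matrix.
Random.seed!(1)
n, p, batch = 2_000, 10, 100
β_true = zeros(p); β_true[[2, 7]] .= [1.5, -2.0]
X = randn(n, p)
y = X * β_true

β = zeros(p)
for epoch in 1:20
    for start in 1:batch:n
        rows = start:start + batch - 1
        partial_fit!(β, X[rows, :], y[rows])
    end
end
```

With a fixed step size the iterates hover around the lasso solution rather than settling on it exactly, which matches the caveat above about not reaching the same parameters as a full in-memory fit.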

@rakeshvar

Did you solve your problem?
I do not know if this still helps, as it has been quite some time.
But you can parallelize the fit using ADMM:
you refit Lasso on parts of the data iteratively.
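A rough sketch of what that could look like, assuming the standard consensus form of ADMM described in Boyd et al.'s ADMM monograph (splitting across examples). Note that in this form each chunk solves a ridge-like subproblem rather than a full lasso, and the L1 shrinkage is applied when the chunk estimates are averaged. `consensus_lasso`, its keyword arguments, and the toy data are hypothetical, not part of Lasso.jl.

```julia
using LinearAlgebra, Random

# Soft-thresholding: the proximal operator of t * |v|
soft_threshold(v, t) = sign(v) * max(abs(v) - t, zero(v))

# Hypothetical helper, not part of Lasso.jl.
# Minimizes 1/2 * sum_i ||A_i*x - b_i||^2 + λ * ||x||_1 over chunks (A_i, b_i).
function consensus_lasso(chunks, λ; ρ=1.0, iters=200)
    N = length(chunks)
    p = size(first(chunks)[1], 2)
    # Each chunk is reduced once to a p×p factorization and a p-vector,
    # so the raw rows never need to be in memory together.
    facts = [cholesky(Symmetric(A' * A + ρ * I)) for (A, b) in chunks]
    Atb = [A' * b for (A, b) in chunks]
    x = [zeros(p) for _ in 1:N]   # local estimates, one per chunk
    u = [zeros(p) for _ in 1:N]   # scaled dual variables
    z = zeros(p)                  # shared (consensus) estimate
    for _ in 1:iters
        for i in 1:N
            x[i] = facts[i] \ (Atb[i] .+ ρ .* (z .- u[i]))
        end
        z = soft_threshold.((sum(x) .+ sum(u)) ./ N, λ / (ρ * N))
        for i in 1:N
            u[i] .+= x[i] .- z
        end
    end
    return z
end

# Toy usage: four chunks standing in for pieces of a larger data set.
Random.seed!(1)
x_true = zeros(20); x_true[[3, 11]] .= [2.0, -1.5]
A = randn(200, 20)
b = A * x_true
chunks = [(A[r, :], b[r]) for r in (1:50, 51:100, 101:150, 151:200)]
z = consensus_lasso(chunks, 0.1)
```

The loop over chunks in the x-update is independent per chunk, so that is the part that could be run in parallel.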

baggepinnen commented Feb 19, 2020

Here's an implementation of ADMM to get you started, in case you still need one.
https://github.com/baggepinnen/LPVSpectral.jl/blob/724561469a483aa1ffae6fa76b73c67ed2becce7/src/lasso.jl#L118

The functions above specify the proximal operators that are passed to ADMM as inputs to solve the LASSO problem.

It is used like this:

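using LPVSpectral, ProximalOperators
A = randn(70,100);                      # design matrix: 70 observations, 100 features
x = randn(100) .* (rand(100) .< 0.05);  # sparse ground truth, about 5% nonzero entries
y = A*x;                                # noiseless observations
proxF = LeastSquares(A,y)               # least-squares data-fit term
xh,zh = LPVSpectral.ADMM(randn(100), proxF, NormL1(3))  # NormL1(3): L1 penalty with weight 3
[x zh]                                  # true coefficients next to the ADMM estimate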
