This package provides PyTorch implementations to solve the group elastic net problem. Let Aj (j = 1 … p) be feature matrices of sizes m × nj (m is the number of samples, and nj is the number of features in the jth group), and let y be an m × 1 vector of the responses. Group elastic net finds coefficients βj, and a bias β0 that solve the optimization problem
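$$\min_{\beta_0, \ldots, \beta_p} \; \frac{1}{2} \Big\| y - \beta_0 - \sum_{j=1}^{p} A_j \beta_j \Big\|^2 + m \sum_{j=1}^{p} \sqrt{n_j} \left( \lambda_1 \|\beta_j\| + \lambda_2 \|\beta_j\|^2 \right).$$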
Here λ1 and λ2 are scalar coefficients that control the amount of 2-norm and squared 2-norm regularization, respectively. The 2-norm regularization encourages sparsity at the group level: an entire βj may become 0. The squared 2-norm regularization is similar in spirit to the elastic net, and addresses some of the issues of the lasso. Note that group elastic net includes as special cases group lasso (λ2 = 0), ridge regression (λ1 = 0), elastic net (each nj = 1), and lasso (each nj = 1 and λ2 = 0). The optimization problem is convex and can be solved efficiently. This package provides two implementations: one based on proximal gradient descent, and one based on coordinate descent.
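To make the objective concrete, here is a minimal plain-Python sketch that evaluates it for given coefficients. It is independent of the package: the function name gel_objective and the list-of-lists data layout are assumptions chosen for illustration (the package itself works with PyTorch tensors), and the sketch only evaluates the objective, it does not solve the problem.

```python
import math


def gel_objective(As, y, b_0, bs, l_1, l_2):
    """Evaluate the group elastic net objective (hypothetical helper).

    As: list of p feature matrices, each a list of m rows with n_j entries.
    y: list of m responses.
    b_0: bias (float).
    bs: list of p coefficient lists; bs[j] has n_j entries.
    """
    m = len(y)

    # Squared-error term: 1/2 * ||y - b_0 - sum_j A_j b_j||^2.
    loss = 0.0
    for i in range(m):
        pred = b_0
        for A_j, b_j in zip(As, bs):
            pred += sum(a * b for a, b in zip(A_j[i], b_j))
        loss += 0.5 * (y[i] - pred) ** 2

    # Penalty: m * sum_j sqrt(n_j) * (l_1 * ||b_j|| + l_2 * ||b_j||^2).
    penalty = 0.0
    for b_j in bs:
        sq_norm = sum(b * b for b in b_j)
        penalty += math.sqrt(len(b_j)) * (l_1 * math.sqrt(sq_norm) + l_2 * sq_norm)

    return loss + m * penalty


# Two samples, two groups with sizes n_1 = 2 and n_2 = 1.
As = [[[1.0, 0.0], [0.0, 1.0]], [[1.0], [1.0]]]
y = [1.0, 2.0]
bs = [[3.0, 4.0], [0.0]]
value = gel_objective(As, y, b_0=0.0, bs=bs, l_1=0.1, l_2=0.0)
```

With l_2 = 0 this is the group lasso objective. The second group has all-zero coefficients, so it contributes nothing to the penalty.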
Install with pip
pip install torchgel
tqdm (for progress bars) and numpy are pulled in as dependencies. PyTorch (v1.0+) is also needed, and must be installed manually. Refer to the PyTorch website for instructions.
examples/main.ipynb is a Jupyter notebook that walks through using the package for a typical use case. A more formal description of the functions follows; for details about the algorithms, refer to the docstrings of the files in the gel directory.
The modules gel.gelfista and gel.gelcd provide implementations based on proximal gradient descent and coordinate descent respectively. Both have similar interfaces, and expose two main public functions: make_A and gel_solve. The feature matrices should be stored in a list (say As) as PyTorch tensor matrices, and the responses should be stored in a PyTorch vector (say y). Additionally, the sizes of the groups (nj) should be stored in a vector (say ns). First use the make_A function to convert the feature matrices into a suitable format:
A = make_A(As, ns)
Then pass A, y and other required arguments to gel_solve. The general interface is:
b_0, B = gel_solve(A, y, l_1, l_2, ns, **kwargs)
l_1 and l_2 are floats representing λ1 and λ2 respectively. The method returns a float b_0 representing the bias and a PyTorch matrix B holding the other coefficients. B has size p × max_j nj, with suitable zero padding. The following sections cover additional details for the specific implementations.
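As an illustration of the padded layout of B, the sketch below recovers the per-group coefficient vectors from a padded matrix. It uses plain Python lists instead of PyTorch tensors, the helper name split_padded_rows is hypothetical, and it assumes that row j of B holds βj in its first nj entries with the padding zeros at the end; check the gel_solve docstring before relying on that layout.

```python
def split_padded_rows(B, ns):
    """Keep the first ns[j] entries of row j (assumes trailing zero padding)."""
    return [list(row[:n_j]) for row, n_j in zip(B, ns)]


# p = 3 groups with sizes 2, 1 and 3, so B is 3 x 3 (the largest group has 3 features).
ns = [2, 1, 3]
B = [
    [0.5, -1.0, 0.0],  # group 1: two coefficients, one padding zero
    [2.0, 0.0, 0.0],   # group 2: one coefficient, two padding zeros
    [0.0, 0.0, 0.0],   # group 3: three coefficients, all zero (group not selected)
]
bs = split_padded_rows(B, ns)
```

Under the same layout assumption, the equivalent operation on a tensor would be slicing each row to its group size.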
The gel.gelfista module contains a proximal gradient descent implementation. Its usage is just as described in the template above. Refer to the docstring for gel.gelfista.gel_solve for details about the other arguments.
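For intuition about why proximal methods can set an entire group to zero, the sketch below implements the standard proximal operator of the 2-norm, often called block soft-thresholding. It is a plain-Python illustration of the general technique, not the code in gel.gelfista; the function name block_soft_threshold is hypothetical, and the threshold t stands in for the step size times the group's 2-norm penalty weight.

```python
import math


def block_soft_threshold(v, t):
    """Proximal operator of t * ||.||_2 applied to the vector v."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm <= t:
        # The whole block is set to zero: this is the group-level sparsity.
        return [0.0] * len(v)
    # Otherwise the block keeps its direction and its norm shrinks by t.
    scale = 1.0 - t / norm
    return [scale * x for x in v]


shrunk = block_soft_threshold([3.0, 4.0], 1.0)  # norm 5 is reduced to norm 4
zeroed = block_soft_threshold([0.3, 0.4], 1.0)  # norm 0.5 <= 1, so all zeros
```

A proximal gradient iteration alternates a gradient step on the smooth part of the objective with an operator of this kind applied to each group.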
The gel.gelcd module contains a coordinate descent implementation. Its usage is a bit more involved than the FISTA implementation. Coordinate descent iteratively solves single blocks (each corresponding to a single βj). There are multiple solvers provided to solve the individual blocks. These are the gel.gelcd.block_solve_* functions. Refer to their docstrings for details about their arguments. gel.gelcd.gel_solve requires passing a block solve function and its arguments (as a dictionary). Refer to its docstring for further details.
gel.gelpaths provides a wrapper function gel_paths to solve the group elastic net problem for multiple values of the regularization coefficients. It implements a two-stage process. For a given λ1 and λ2, first the group elastic net problem is solved and the feature blocks with non-zero coefficients are extracted (the support). Then ridge regression models are learned on the support for each of several provided regularization values. The final model is summarized using an arbitrary provided summary function, and the summary for each combination of the regularization values is returned as a dictionary. The docstring contains more details. gel.ridgepaths contains another useful function, ridge_paths, which can efficiently solve ridge regression for multiple regularization values.
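The support extraction in the first stage can be pictured with the plain-Python sketch below. The helper name support and the tol parameter are hypothetical and not part of gel.gelpaths, and B is modelled as a list of rows (one per group) instead of a PyTorch matrix.

```python
def support(B, tol=0.0):
    """Indices of the groups whose coefficient row has an entry above tol in magnitude."""
    return [j for j, row in enumerate(B) if any(abs(b) > tol for b in row)]


# Three groups; the second one was zeroed out by the group-level penalty.
B = [
    [0.5, -1.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 2.0],
]
selected = support(B)
```

Only the feature blocks listed in selected would then be used in the second-stage ridge regression fits.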
If you find this code useful in your research, please cite
@misc{koushik2017torchgel,
author = {Koushik, Jayanth},
title = {torch-gel},
year = {2017},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/jayanthkoushik/torch-gel}},
}