# Algorithms (l1l2py.algorithms)

In order to describe the functions implemented in this module, we first fix some notation.

Assume a centered data matrix $X \in \mathbb{R}^{N \times P}$ and a column vector $Y \in \mathbb{R}^{N}$ of regression values or binary labels. We want to minimize the regression/classification error by solving a Regularized Least Squares (RLS) problem.

This module implements two main algorithms. The first solves a classical RLS problem with a penalty on the $\ell_2$-norm of the vector $\beta$ (also called `ridge_regression`)

$$\beta^{ridge} = \operatorname*{argmin}_{\beta} \; \frac{1}{N} \| Y - X\beta \|_2^2 + \mu \| \beta \|_2^2 \qquad (1)$$

with $\mu \ge 0$.

The second minimizes a functional with a linear combination of two penalties, on the $\ell_1$- and $\ell_2$-norms of the vector $\beta$ (also called `l1l2_regularization`)

$$\beta^{*} = \operatorname*{argmin}_{\beta} \; \frac{1}{N} \| Y - X\beta \|_2^2 + \mu \| \beta \|_2^2 + \tau \| \beta \|_1 \qquad (2)$$

with $\mu \ge 0$ and $\tau > 0$.

## Implementation details

While (1) has a closed-form solution, for (2) there are many different approaches. In this module we provide an Iterative Shrinkage-Thresholding Algorithm (ISTA) proposed in [DeMol09a], exploiting a faster variant (called FISTA) proposed in [Beck09].

Starting from the null vector $\beta^0 = 0$, each step updates the value of $\beta$ until convergence:

$$\beta^{k+1} = \mathbf{S}_{\sigma\tau}\!\left[ (1 - 2\sigma\mu)\,\beta^k + \frac{2\sigma}{N}\, X^T (Y - X\beta^k) \right]$$

where $\mathbf{S}_t$ is the component-wise soft-thresholding function

$$[\mathbf{S}_t(w)]_i = \operatorname{sign}(w_i)\,\max(|w_i| - t,\, 0).$$

The constant $\sigma$ is a (theoretically optimal) step size which depends on the data:

$$\sigma = \frac{1}{2\,(a/N + \mu)}$$

where $a$ is the maximum eigenvalue of the matrix $X^T X$.

Convergence is reached when, for each $i \in \{1, \dots, P\}$,

$$|\beta_i^{k} - \beta_i^{k-1}| \le \text{tolerance} \cdot |\beta_i^{k-1}|$$

where $\beta^k$ and $\beta^{k-1}$ are the solutions at iterations $k$ and $k-1$, provided $k$ does not reach a fixed maximum number of iterations.
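The iteration described above can be sketched with plain NumPy. This is a minimal illustrative ISTA loop, not l1l2py's actual implementation (FISTA additionally applies a momentum term over successive iterates, omitted here); the names `soft_threshold` and `ista_l1l2` are hypothetical:

```python
import numpy as np

def soft_threshold(w, t):
    """Component-wise soft-thresholding: sign(w) * max(|w| - t, 0)."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def ista_l1l2(X, Y, mu, tau, kmax=10000, tolerance=1e-5):
    """Minimal ISTA sketch for (1/N)||Y - X b||^2 + mu ||b||^2 + tau ||b||_1."""
    N, P = X.shape
    # Fixed step size from the Lipschitz constant of the smooth part:
    # sigma = 1 / (2 * (a/N + mu)), with a the max eigenvalue of X^T X.
    a = np.linalg.eigvalsh(X.T @ X).max()
    sigma = 1.0 / (2.0 * (a / N + mu))
    beta = np.zeros(P)
    for _ in range(kmax):
        # Gradient step on the smooth part, then soft-thresholding.
        grad_step = (1.0 - 2.0 * sigma * mu) * beta \
            + (2.0 * sigma / N) * (X.T @ (Y - X @ beta))
        beta_new = soft_threshold(grad_step, sigma * tau)
        # Component-wise relative stopping rule (tolerance-padded at zero).
        if np.all(np.abs(beta_new - beta) <= tolerance * (np.abs(beta) + tolerance)):
            beta = beta_new
            break
        beta = beta_new
    return beta
```

The soft-thresholding step is what produces exact zeros in the solution, and hence the sparsity driven by $\tau$.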

## Regularization Algorithms

`l1l2py.algorithms.ridge_regression(data, labels, mu=0.0)`

Implementation of the Regularized Least Squares solver.

It solves the ridge regression problem with parameter mu on the l2-norm.

Parameters:

- **data** : (N, P) ndarray
  Data matrix.
- **labels** : (N,) or (N, 1) ndarray
  Labels vector.
- **mu** : float, optional (default is 0.0)
  l2-norm penalty.

Returns:

- **beta** : (P, 1) ndarray
  Ridge regression solution.

Examples

>>> import numpy
>>> import l1l2py.algorithms
>>> X = numpy.array([[0.1, 1.1, 0.3], [0.2, 1.2, 1.6], [0.3, 1.3, -0.6]])
>>> beta = numpy.array([0.1, 0.1, 0.0])
>>> Y = numpy.dot(X, beta)
>>> beta = l1l2py.algorithms.ridge_regression(X, Y, 1e3).T
>>> len(numpy.flatnonzero(beta))
3
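
For reference, the closed-form solution of (1) can be written directly with NumPy. This is an illustrative sketch of the standard normal-equations form under the functional above, with the hypothetical name `ridge_closed_form`; it is not l1l2py's exact implementation:

```python
import numpy as np

def ridge_closed_form(X, Y, mu=0.0):
    """Closed-form minimizer of (1/N)||Y - X b||^2 + mu ||b||^2.

    Solves the regularized normal equations (X^T X + N*mu*I) b = X^T Y;
    for mu = 0 this reduces to ordinary least squares.
    """
    N, P = X.shape
    return np.linalg.solve(X.T @ X + N * mu * np.eye(P), X.T @ Y)
```

Note that the l2 penalty shrinks the coefficients toward zero but, unlike the l1 penalty, never sets them exactly to zero, which is why all three coefficients are non-zero in the example above.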

`l1l2py.algorithms.l1l2_regularization(data, labels, mu, tau, beta=None, kmax=100000, tolerance=1e-05, return_iterations=False, adaptive=False)`

Implementation of the Fast Iterative Shrinkage-Thresholding Algorithm to solve a least squares problem with l1l2 penalty.

It solves the l1l2 regularization problem with parameter mu on the l2-norm and parameter tau on the l1-norm.

Parameters:

- **data** : (N, P) ndarray
  Data matrix.
- **labels** : (N,) or (N, 1) ndarray
  Labels vector.
- **mu** : float
  l2-norm penalty.
- **tau** : float
  l1-norm penalty.
- **beta** : (P,) or (P, 1) ndarray, optional (default is None)
  Starting value for the iterations. If None, the iterations start from the empty model.
- **kmax** : int, optional (default is 1e5)
  Maximum number of iterations.
- **tolerance** : float, optional (default is 1e-5)
  Convergence tolerance.
- **return_iterations** : bool, optional (default is False)
  If True, returns the number of iterations performed. The algorithm has a predefined minimum number of iterations equal to 10.
- **adaptive** : bool, optional (default is False)
  If True, minimization is performed calculating an adaptive step size for each iteration.

Returns:

- **beta** : (P, 1) ndarray
  l1l2 solution.
- **k** : int, optional
  Number of iterations performed.

Examples

>>> X = numpy.array([[0.1, 1.1, 0.3], [0.2, 1.2, 1.6], [0.3, 1.3, -0.6]])
>>> beta = numpy.array([0.1, 0.1, 0.0])
>>> Y = numpy.dot(X, beta)
>>> beta = l1l2py.algorithms.l1l2_regularization(X, Y, 0.1, 0.1)
>>> len(numpy.flatnonzero(beta))
1


## Utility Functions¶

`l1l2py.algorithms.l1_bound(data, labels)`

Estimates a useful upper bound for the l1 penalty term.

Fixing `mu` close to 0.0 and using the maximum value calculated by this function as `tau` in the l1l2 regularization, the solution vector contains only zero elements.

For each value of `tau` smaller than this bound, the solution vector contains at least one non-zero element.

Parameters:

- **data** : (N, P) ndarray
  Data matrix.
- **labels** : (N,) or (N, 1) ndarray
  Labels vector.

Returns:

- **tau_max** : float
  Maximum tau.

Examples

>>> X = numpy.array([[0.1, 1.1, 0.3], [0.2, 1.2, 1.6], [0.3, 1.3, -0.6]])
>>> beta = numpy.array([0.1, 0.1, 0.0])
>>> Y = numpy.dot(X, beta)
>>> tau_max = l1l2py.algorithms.l1_bound(X, Y)
>>> l1l2py.algorithms.l1l2_regularization(X, Y, 0.0, tau_max).T
array([[ 0.,  0.,  0.]])
>>> beta = l1l2py.algorithms.l1l2_regularization(X, Y, 0.0, tau_max - 1e-5)
>>> len(numpy.flatnonzero(beta))
1
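
The bound can be understood from the subgradient optimality condition of (2) at $\beta = 0$ with $\mu = 0$: the null vector is a minimizer exactly when $\tau \ge \frac{2}{N}\,\|X^T Y\|_\infty$. A minimal sketch of that computation (the name `l1_bound_sketch` is hypothetical, and this is an illustration under the functional above, not necessarily the library's exact formula):

```python
import numpy as np

def l1_bound_sketch(X, Y):
    """Smallest tau for which beta = 0 minimizes
    (1/N)||Y - X b||^2 + tau ||b||_1, via the subgradient
    condition |(2/N) X^T Y|_j <= tau for every component j."""
    N = X.shape[0]
    return (2.0 / N) * np.max(np.abs(X.T @ Y))
```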

`l1l2py.algorithms.l1l2_path(data, labels, mu, tau_range, beta=None, kmax=100000, tolerance=1e-05, adaptive=False)`

Efficient solution of different l1l2 regularization problems on increasing values of the l1-norm parameter.

Finds the l1l2 regularization path for each value in tau_range and fixed value of mu.

The values in `tau_range` are used during the computation in reverse order, while the output path has the same ordering as the `tau` values.

Note

For efficiency, if `mu = 0.0` and, for a given value of `tau`, the number of non-zero values is higher than N (which means the algorithm has reached the limit of allowed iterations), the subsequent solutions (for smaller values of `tau`) are simply the least squares solutions.

Warning

The number of solutions can differ from `len(tau_range)`: the function returns only the solutions with at least one non-zero element. For values of `tau` higher than the maximum bound, the solution has all zero values.

Parameters:

- **data** : (N, P) ndarray
  Data matrix.
- **labels** : (N,) or (N, 1) ndarray
  Labels vector.
- **mu** : float
  l2-norm penalty.
- **tau_range** : array_like of float
  l1-norm penalties in increasing order.
- **beta** : (P,) or (P, 1) ndarray, optional (default is None)
  Starting value of the iterations. If None, the iterations start from the empty model.
- **kmax** : int, optional (default is 1e5)
  Maximum number of iterations.
- **tolerance** : float, optional (default is 1e-5)
  Convergence tolerance.
- **adaptive** : bool, optional (default is False)
  If True, minimization is performed calculating an adaptive step size for each iteration.

Returns:

- **beta_path** : list of (P,) or (P, 1) ndarrays
  l1l2 solutions with at least one non-zero element.

Note

The acceleration method based on warm starts, implemented in this function, has been theoretically proven in [Hale08].
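
The warm-start scheme can be sketched generically: solve for the `tau` values from largest (sparsest solution) to smallest, seeding each run with the previous solution. This is an illustrative sketch, not l1l2py's `l1l2_path`; `l1l2_path_sketch` is a hypothetical name, and the `solver` argument stands in for a call such as `l1l2_regularization`:

```python
import numpy as np

def l1l2_path_sketch(solver, X, Y, mu, tau_range):
    """Warm-start regularization path sketch.

    `solver` is any function with the signature
    solver(X, Y, mu, tau, beta=None) -> (P,) ndarray.
    The tau values are processed from largest to smallest so
    each run starts from the previous (sparser) solution.
    """
    solutions = {}
    beta = None
    for tau in sorted(tau_range, reverse=True):
        beta = solver(X, Y, mu, tau, beta=beta)
        solutions[tau] = beta
    # Output follows the increasing tau order; keep only the
    # solutions with at least one non-zero coefficient.
    return [solutions[tau] for tau in sorted(tau_range) if np.any(solutions[tau])]
```

Since a solution for a larger `tau` is typically close to (and sparser than) the one for the next smaller `tau`, each inner run needs far fewer iterations than a cold start from the null vector.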

 [Hale08] E. T. Hale, W. Yin, Y. Zhang, “Fixed-point continuation for $\ell_1$-minimization: Methodology and convergence”, SIAM J. Optim., Volume 19, Issue 3, pp. 1107-1130, 2008