To describe the functions implemented in this module, we first have to fix some notation.
Given a centered data matrix X \in \mathbb{R}^{N \times P} and a column vector Y \in \mathbb{R}^{N} of regression values or binary labels, we want to minimize the regression/classification error by solving a Regularized Least Squares (RLS) problem.
In this module two main algorithms are implemented. The first one solves a classical RLS problem with a penalty on the \ell_2-norm of the vector \beta (also called ridge_regression)

    (1)  \beta^* = \operatorname{argmin}_{\beta} \frac{1}{N}\|Y - X\beta\|_2^2 + \mu\|\beta\|_2^2

with \mu \geq 0.
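Setting the gradient of (1) to zero gives the closed-form solution \beta = (X^T X + N\mu I)^{-1} X^T Y. A minimal NumPy sketch of this formula (the helper name is hypothetical; the library's ridge_regression may use a different, numerically safer implementation, e.g. one based on the SVD or the pseudo-inverse):

```python
import numpy as np

def ridge_closed_form(X, Y, mu=0.0):
    """Closed-form solution of (1): beta = (X^T X + N*mu*I)^{-1} X^T Y.

    Illustrative sketch only (hypothetical helper); not the library's
    actual implementation.
    """
    n, p = X.shape
    return np.linalg.solve(X.T @ X + n * mu * np.eye(p), X.T @ Y)
```

For mu = 0.0 this reduces to the ordinary least squares solution; increasing mu shrinks the norm of the returned coefficient vector.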
The second one minimizes a functional with a linear combination of two penalties, on the \ell_1-norm and on the \ell_2-norm of the vector \beta (also called l1l2_regularization)

    (2)  \beta^* = \operatorname{argmin}_{\beta} \frac{1}{N}\|Y - X\beta\|_2^2 + \mu\|\beta\|_2^2 + \tau\|\beta\|_1

with \mu \geq 0 and \tau > 0.
While (1) has a closed-form solution, (2) can be approached in many different ways. This module provides an Iterative Shrinkage-Thresholding Algorithm (ISTA) proposed in [DeMol09a], exploiting a faster variant (called FISTA) proposed in [Beck09].
Starting from a null vector \beta^0, each step updates the value of \beta^k until convergence:

    \beta^{k+1} = S_{\tau/\sigma}\Big( \big(1 - \frac{\mu}{\sigma}\big)\beta^k + \frac{1}{N\sigma} X^T (Y - X\beta^k) \Big)

where S_t is the soft-thresholding function

    S_t(x) = \operatorname{sign}(x)\,\max\big(|x| - \frac{t}{2},\, 0\big).

The constant \sigma is a (theoretically optimal) step size which depends on the data:

    \sigma = \frac{\|X^T X\|}{N} + \mu,

where \|X^T X\| is the maximum eigenvalue of the matrix X^T X. Convergence is reached, before k hits a fixed maximum number of iterations, when for each j \in \{1, \ldots, P\}:

    |\beta_j^k - \beta_j^{k-1}| \leq |\beta_j^{k-1}| \cdot \mathrm{tolerance}.
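The iteration above can be sketched in NumPy as follows. This is a simplified, non-accelerated version: the function names are hypothetical and the FISTA momentum step of [Beck09] is omitted for clarity.

```python
import numpy as np

def soft_threshold(x, t):
    # S_t(x) = sign(x) * max(|x| - t/2, 0), applied elementwise
    return np.sign(x) * np.maximum(np.abs(x) - t / 2.0, 0.0)

def ista_l1l2(X, Y, mu, tau, kmax=100000, tolerance=1e-5):
    """Plain ISTA iteration for functional (2); illustrative sketch only.

    The library's l1l2_regularization additionally applies FISTA
    acceleration, which this sketch omits.
    """
    n, p = X.shape
    # step-size constant: max eigenvalue of X^T X, scaled by N, plus mu
    sigma = np.linalg.eigvalsh(X.T @ X).max() / n + mu
    beta = np.zeros(p)
    for k in range(1, kmax + 1):
        value = (1.0 - mu / sigma) * beta + (X.T @ (Y - X @ beta)) / (n * sigma)
        beta_next = soft_threshold(value, tau / sigma)
        # stop when every coefficient has stabilized in relative terms
        if np.all(np.abs(beta_next - beta) <= np.abs(beta) * tolerance):
            return beta_next
        beta = beta_next
    return beta
```

For large enough tau the very first soft-thresholding step maps the gradient update back to the null vector, so the all-zero solution is returned immediately.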
Implementation of the Regularized Least Squares solver.
It solves the ridge regression problem with parameter mu on the l2-norm.
Parameters :  data : (N, P) ndarray
labels : (N,) or (N, 1) ndarray
mu : float, optional (default is 0.0)


Returns :  beta : (P, 1) ndarray

Examples
>>> X = numpy.array([[0.1, 1.1, 0.3], [0.2, 1.2, 1.6], [0.3, 1.3, 0.6]])
>>> beta = numpy.array([0.1, 0.1, 0.0])
>>> Y = numpy.dot(X, beta)
>>> beta = l1l2py.algorithms.ridge_regression(X, Y, 1e3).T
>>> len(numpy.flatnonzero(beta))
3
Implementation of the Fast Iterative Shrinkage-Thresholding Algorithm to solve a least squares problem with an l1l2 penalty.
It solves the l1l2 regularization problem with parameter mu on the l2-norm and parameter tau on the l1-norm.
Parameters :  data : (N, P) ndarray
labels : (N,) or (N, 1) ndarray
mu : float
tau : float
beta : (P,) or (P, 1) ndarray, optional (default is None)
kmax : int, optional (default is 1e5)
tolerance : float, optional (default is 1e-5)
return_iterations : bool, optional (default is False)
adaptive : bool, optional (default is False)


Returns :  beta : (P, 1) ndarray
k : int, optional

Examples
>>> X = numpy.array([[0.1, 1.1, 0.3], [0.2, 1.2, 1.6], [0.3, 1.3, 0.6]])
>>> beta = numpy.array([0.1, 0.1, 0.0])
>>> Y = numpy.dot(X, beta)
>>> beta = l1l2py.algorithms.l1l2_regularization(X, Y, 0.1, 0.1)
>>> len(numpy.flatnonzero(beta))
1
Estimation of a useful maximum bound for the l1 penalty term.
Fixing mu close to 0.0 and using the maximum value calculated by this function as tau in the l1l2 regularization, the solution vector contains only zero elements.
For each value of tau smaller than this maximum bound, the solution vector contains at least one nonzero element.
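Such a bound follows from the subgradient optimality condition of (2) at \beta = 0: the zero vector is a solution whenever tau \geq \frac{2}{N} \max_j |(X^T Y)_j|. A sketch under that assumption (the function name is hypothetical; the library may compute the bound differently):

```python
import numpy as np

def l1_bound_sketch(X, Y):
    # Zero is optimal for (2) at beta = 0 when the gradient of the
    # quadratic term, (2/N) * X^T Y, fits inside tau * [-1, 1]^P.
    n = X.shape[0]
    return (2.0 / n) * np.abs(X.T @ Y).max()
```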
Parameters :  data : (N, P) ndarray
labels : (N,) or (N, 1) ndarray


Returns :  tau_max : float

Examples
>>> X = numpy.array([[0.1, 1.1, 0.3], [0.2, 1.2, 1.6], [0.3, 1.3, 0.6]])
>>> beta = numpy.array([0.1, 0.1, 0.0])
>>> Y = numpy.dot(X, beta)
>>> tau_max = l1l2py.algorithms.l1_bound(X, Y)
>>> l1l2py.algorithms.l1l2_regularization(X, Y, 0.0, tau_max).T
array([[ 0., 0., 0.]])
>>> beta = l1l2py.algorithms.l1l2_regularization(X, Y, 0.0, tau_max - 1e-5)
>>> len(numpy.flatnonzero(beta))
1
Efficient solution of different l1l2 regularization problems for increasing values of the l1-norm parameter.
Finds the l1l2 regularization path for each value in tau_range and a fixed value of mu.
The values in tau_range are used during the computation in reverse order, while the output path preserves the ordering of the tau values.
Note
For efficiency, if mu = 0.0 and, for a given value of tau, the number of nonzero values is higher than N (which means the algorithm has reached the limit of allowed iterations), the following solutions (for smaller values of tau) are simply the least squares solutions.
Warning
The number of solutions can differ from len(tau_range): the function returns only the solutions with at least one nonzero element. For values higher than tau_max the solution has all zero values.
Parameters :  data : (N, P) ndarray
labels : (N,) or (N, 1) ndarray
mu : float
tau_range : array_like of float
beta : (P,) or (P, 1) ndarray, optional (default is None)
kmax : int, optional (default is 1e5)
tolerance : float, optional (default is 1e-5)
adaptive : bool, optional (default is False)


Returns :  beta_path : list of (P,) or (P, 1) ndarrays

Note
The acceleration method based on warm starts, implemented in this function, has been theoretically proven in [Hale08].
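The warm-start strategy can be sketched as follows. All names here are hypothetical: the inner solver is a minimal, non-accelerated ISTA, and unlike the library function this sketch keeps the all-zero solutions instead of dropping them.

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t / 2.0, 0.0)

def l1l2_solve(X, Y, mu, tau, beta0, kmax=10000, tol=1e-8):
    # Minimal ISTA solver, used only to illustrate the warm-start loop.
    n, p = X.shape
    sigma = np.linalg.eigvalsh(X.T @ X).max() / n + mu
    beta = beta0.copy()
    for _ in range(kmax):
        value = (1.0 - mu / sigma) * beta + (X.T @ (Y - X @ beta)) / (n * sigma)
        beta_next = soft_threshold(value, tau / sigma)
        if np.abs(beta_next - beta).max() <= tol:
            return beta_next
        beta = beta_next
    return beta

def l1l2_path_sketch(X, Y, mu, tau_range):
    """Solve for the taus in decreasing order, reusing each solution as
    the warm start for the next, then return the path in the original
    ordering of tau_range."""
    order = np.argsort(tau_range)[::-1]  # largest tau first
    beta = np.zeros(X.shape[1])
    path = [None] * len(tau_range)
    for i in order:
        beta = l1l2_solve(X, Y, mu, tau_range[i], beta)
        path[i] = beta.copy()
    return path
```

Large values of tau yield sparse (here, all-zero) solutions quickly, and each warm start places the next, less-penalized problem close to its optimum, which is what makes the decreasing-tau sweep cheaper than solving each problem from scratch.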
[Hale08]  E. T. Hale, W. Yin, Y. Zhang, "Fixed-point continuation for l1-minimization: Methodology and convergence", SIAM J. Optim., Volume 19, Issue 3, pp. 1107-1130, 2008