Proximal gradient method

From Infogalactic: the planetary knowledge core

Proximal gradient methods are a generalized form of projection used to solve non-differentiable convex optimization problems. Many interesting problems can be formulated as convex optimization problems of the form


\operatorname{minimize}_{x \in \mathbb{R}^N} \qquad f_1(x) +f_2(x) + \cdots+ f_{n-1}(x) +f_n(x)

where f_1, f_2, ..., f_n are convex functions from \mathbb{R}^N to \mathbb{R}, some of which may be non-differentiable. Non-differentiability rules out conventional smooth optimization techniques such as the steepest descent method and the conjugate gradient method, but there is a specific class of algorithms that can solve the above optimization problem. These methods proceed by splitting, in that the functions f_1, . . . , f_n are used individually so as to yield an easily implementable algorithm. They are called proximal because each non-smooth function among f_1, . . . , f_n is involved via its proximity operator. The iterative shrinkage-thresholding algorithm, projected Landweber, projected gradient, alternating projections, the alternating-direction method of multipliers, and alternating split Bregman are special instances of proximal algorithms. Details of proximal methods are discussed in Combettes and Pesquet.[1] For the theory of proximal gradient methods from the perspective of, and with applications to, statistical learning theory, see proximal gradient methods for learning.
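As an illustrative sketch (not part of the original text), the iterative shrinkage-thresholding algorithm mentioned above can be written in a few lines for the sum of a smooth least-squares term and a non-smooth \ell_1 term; the data A, b, and the weight lam below are synthetic placeholders:

```python
import numpy as np

# Illustrative sketch: proximal gradient (ISTA) iterations for
#   minimize  f_1(x) + f_2(x)
# with f_1(x) = 0.5*||A x - b||^2 (smooth, handled via its gradient)
# and  f_2(x) = lam*||x||_1 (non-smooth, handled via its proximity operator).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))     # synthetic data (assumption)
b = rng.standard_normal(20)
lam = 0.1

def soft_threshold(v, t):
    # Proximity operator of t*||.||_1: shrink each coordinate toward 0 by t.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

step = 1.0 / np.linalg.norm(A, 2) ** 2   # step <= 1/L, L = ||A||_2^2
x = np.zeros(5)
for _ in range(500):
    grad = A.T @ (A @ x - b)             # gradient of the smooth part
    x = soft_threshold(x - step * grad, step * lam)
```

At convergence the iterate is a fixed point of the update, which is equivalent to the optimality condition of the composite problem.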

Notations and terminology

Let \mathbb{R}^N, the N-dimensional Euclidean space, be the domain of the function  f: \mathbb{R}^N  \rightarrow (-\infty,+\infty]. Suppose C is a non-empty convex subset of \mathbb{R}^N. Then the indicator function of C is defined as

i_C : x \mapsto \begin{cases} 0 & \text{if } x \in C \\ +\infty & \text{if } x \notin C \end{cases}


The p-norm ( \| \cdot \|_p ) is defined as

\|x\|_p = ( |x_1|^p + |x_2|^p + \cdots + |x_N|^p )^{1/p}

The distance from x \in \mathbb{R}^N to C is defined as


D_C(x) =  \min_{y \in C} \|x - y\|

If C is closed and convex, the projection of x \in \mathbb{R}^N onto C is the unique point P_Cx \in C such that D_C(x) = \| x - P_Cx \|_2 .
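As a small numeric illustration (an assumption-free example for the closed unit ball, not from the original text), the projection onto C = \{ y : \|y\|_2 \leq 1 \} has a simple closed form, and the distance D_C(x) falls out of it:

```python
import numpy as np

def project_ball(x, radius=1.0):
    # Projection onto the closed ball C = {y : ||y||_2 <= radius}:
    # points inside C are fixed; points outside are scaled to the boundary.
    n = np.linalg.norm(x)
    return x if n <= radius else (radius / n) * x

x = np.array([3.0, 4.0])        # ||x||_2 = 5, outside the unit ball
p = project_ball(x)             # the unique nearest point, on the boundary
d = np.linalg.norm(x - p)      # D_C(x) = 5 - 1 = 4
```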

The subdifferential of f at x is given by


 \partial f(x) = \{ u \in \mathbb{R}^N \mid \forall y \in \mathbb{R}^N, (y-x)^\mathrm{T}u+f(x) \leq f(y) \}
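For intuition (an illustrative check, not from the original text): for f(y) = |y| the subdifferential at x = 0 is the interval [-1, 1], and any u in that interval satisfies the defining inequality on all of \mathbb{R}:

```python
import numpy as np

# Check the subgradient inequality (y - x)*u + f(x) <= f(y) for all y
# with f(y) = |y|, x = 0, and the candidate subgradient u = 0.5.
f = np.abs
x0, u = 0.0, 0.5                      # 0.5 lies in [-1, 1] = subdiff of |.| at 0
ys = np.linspace(-10.0, 10.0, 2001)   # grid of test points
ok = bool(np.all((ys - x0) * u + f(x0) <= f(ys) + 1e-12))
```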

Projection onto convex sets (POCS)

One of the widely used convex optimization algorithms is POCS (Projection Onto Convex Sets). This algorithm is employed to recover/synthesize a signal that simultaneously satisfies several convex constraints. Let f_i be the indicator function of the non-empty closed convex set C_i modeling a constraint. This reduces to the convex feasibility problem, which requires us to find a solution lying in the intersection of all the convex sets C_i. In the POCS method each set C_i is incorporated via its projection operator P_{C_i}, so in each iteration x is updated as


x_{k+1} = P_{C_1}  P_{C_2} \cdots P_{C_n} x_k
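As a hypothetical worked example (the sets and starting point below are chosen for illustration), the cyclic-projection update can be run for the intersection of the non-negative orthant with a hyperplane:

```python
import numpy as np

# Feasibility problem: find x in C1 ∩ C2, where
#   C1 = {x : x >= 0}            (non-negative orthant)
#   C2 = {x : a^T x = 1}         (hyperplane)
a = np.array([1.0, 2.0, 3.0])

def P1(x):
    # Projection onto the orthant: clip negative coordinates to zero.
    return np.maximum(x, 0.0)

def P2(x):
    # Projection onto the hyperplane: move along a to satisfy a^T x = 1.
    return x + (1.0 - a @ x) / (a @ a) * a

x = np.array([-1.0, 5.0, -2.0])   # infeasible starting point
for _ in range(100):
    x = P1(P2(x))                 # x_{k+1} = P_{C1} P_{C2} x_k
```

After the loop the iterate satisfies both constraints to within numerical precision.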

However, beyond such feasibility problems projection operators are not appropriate, and more general operators are required to tackle them. Among the various generalizations of the notion of a convex projection operator that exist, proximity operators are best suited for this purpose.

Definition

The proximity operator of the function f at x is defined as follows.

For every x \in \mathbb{R}^N , the minimization problem

\operatorname{minimize}_{y \in \mathbb{R}^N} \qquad f(y) + \frac{1}{2} \left\| x-y \right\|_2^2

admits a unique solution, which is denoted by \operatorname{prox}_f(x).


 \operatorname{prox}_f :\mathbb{R}^N \rightarrow \mathbb{R}^N
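As an illustrative check (not from the original text): for the absolute value f(y) = |y| the proximity operator has the well-known closed form \operatorname{prox}_f(x) = \operatorname{sign}(x)\max(|x|-1, 0) (soft thresholding), which can be verified against a direct grid minimization of the defining problem:

```python
import numpy as np

def prox_abs(x):
    # Closed-form proximity operator of f(y) = |y| (soft thresholding).
    return np.sign(x) * np.maximum(np.abs(x) - 1.0, 0.0)

# Compare with a brute-force minimization of f(y) + 0.5*(x - y)^2 on a grid.
x = 2.3
ys = np.linspace(-5.0, 5.0, 100001)           # grid spacing 1e-4
objective = np.abs(ys) + 0.5 * (x - ys) ** 2
y_star = ys[np.argmin(objective)]             # grid minimizer, ≈ prox_abs(x)
```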

The proximity operator of f is characterized by the inclusion


p=\operatorname{prox}_f(x) \Leftrightarrow x-p \in \partial f(p) \qquad (\forall(x,p) \in \mathbb{R}^N \times \mathbb{R}^N)

If f is differentiable, then the above equation reduces to


p=\operatorname{prox}_f(x) \Leftrightarrow x-p = \nabla f(p) \quad (\forall(x,p) \in \mathbb{R}^N \times \mathbb{R}^N)
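This can be checked numerically on a simple differentiable example (an assumed illustration, not from the original text): for f(y) = \frac{\lambda}{2}\|y\|_2^2 the gradient is \nabla f(y) = \lambda y and the proximity operator has the closed form x/(1+\lambda):

```python
import numpy as np

# Verify x - p = grad f(p) for f(y) = 0.5*lam*||y||_2^2, whose
# proximity operator is prox_f(x) = x / (1 + lam).
lam = 3.0
x = np.array([1.0, -2.0, 4.0])
p = x / (1.0 + lam)      # prox_f(x) for this quadratic f
residual = x - p         # should equal grad f(p) = lam * p
```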

Examples

Special instances of proximal gradient methods include the iterative shrinkage-thresholding algorithm, the projected Landweber method, the projected gradient method, alternating projections, the alternating-direction method of multipliers, and the alternating split Bregman method.


Notes

  1. Combettes, Patrick L.; Pesquet, Jean-Christophe (2011). "Proximal Splitting Methods in Signal Processing". In: Fixed-Point Algorithms for Inverse Problems in Science and Engineering. New York: Springer. pp. 185–212.
