Proximal gradient method

From Infogalactic: the planetary knowledge core

Proximal gradient methods are a generalized form of projection used to solve non-differentiable convex optimization problems. Many interesting problems can be formulated as convex optimization problems of the form


\operatorname{minimize}_{x \in \mathbb{R}^N} \qquad f_1(x) +f_2(x) + \cdots+ f_{n-1}(x) +f_n(x)

where f_1, f_2, ..., f_n are convex functions from \mathbb{R}^N to \mathbb{R}, some of which may be non-differentiable. Non-differentiability rules out conventional smooth optimization techniques such as the steepest descent method and the conjugate gradient method, but there is a specific class of algorithms that can solve the above optimization problem. These methods proceed by splitting, in that the functions f_1, . . . , f_n are used individually so as to yield an easily implementable algorithm. They are called proximal because each non-smooth function among f_1, . . . , f_n is involved via its proximity operator. The iterative shrinkage-thresholding algorithm, projected Landweber, projected gradient, alternating projections, the alternating-direction method of multipliers, and alternating split Bregman are special instances of proximal algorithms. Details of proximal methods are discussed in Combettes and Pesquet.[1] For the theory of proximal gradient methods from the perspective of, and with applications to, statistical learning theory, see proximal gradient methods for learning.
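As an illustrative sketch (not part of the original text), the iterative shrinkage-thresholding algorithm mentioned above can be written in a few lines for the sum of a smooth least-squares term and a non-smooth \ell_1 term; the data A, b, and the weight lam below are synthetic placeholders:

```python
import numpy as np

# Illustrative sketch: proximal gradient (ISTA) iterations for
#   minimize  f_1(x) + f_2(x)
# with f_1(x) = 0.5*||A x - b||^2 (smooth, handled via its gradient)
# and  f_2(x) = lam*||x||_1 (non-smooth, handled via its proximity operator).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))     # synthetic data (assumption)
b = rng.standard_normal(20)
lam = 0.1

def soft_threshold(v, t):
    # Proximity operator of t*||.||_1: shrink each coordinate toward 0 by t.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

step = 1.0 / np.linalg.norm(A, 2) ** 2   # step <= 1/L, L = ||A||_2^2
x = np.zeros(5)
for _ in range(500):
    grad = A.T @ (A @ x - b)             # gradient of the smooth part
    x = soft_threshold(x - step * grad, step * lam)
```

At convergence the iterate is a fixed point of the update, which is equivalent to the optimality condition of the composite problem.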

Notations and terminology

Let \mathbb{R}^N, the N-dimensional Euclidean space, be the domain of the function  f: \mathbb{R}^N  \rightarrow (-\infty,+\infty]. Suppose C is a non-empty convex subset of \mathbb{R}^N. Then the indicator function of C is defined as

i_C : x \mapsto \begin{cases} 0 & \text{if } x \in C \\ +\infty & \text{if } x \notin C \end{cases}


The p-norm ( \| \cdot \|_p ) is defined as

\|x\|_p = ( |x_1|^p + |x_2|^p + \cdots + |x_N|^p )^{1/p}

The distance from x \in \mathbb{R}^N to C is defined as


D_C(x) =  \min_{y \in C} \|x - y\|

If C is closed and convex, the projection of x \in \mathbb{R}^N onto C is the unique point P_Cx \in C such that D_C(x) = \| x - P_Cx \|_2 .
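As a small numeric illustration (an assumption-free example for the closed unit ball, not from the original text), the projection onto C = \{ y : \|y\|_2 \leq 1 \} has a simple closed form, and the distance D_C(x) falls out of it:

```python
import numpy as np

def project_ball(x, radius=1.0):
    # Projection onto the closed ball C = {y : ||y||_2 <= radius}:
    # points inside C are fixed; points outside are scaled to the boundary.
    n = np.linalg.norm(x)
    return x if n <= radius else (radius / n) * x

x = np.array([3.0, 4.0])        # ||x||_2 = 5, outside the unit ball
p = project_ball(x)             # the unique nearest point, on the boundary
d = np.linalg.norm(x - p)      # D_C(x) = 5 - 1 = 4
```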

The subdifferential of f at x is given by


 \partial f(x) = \{ u \in \mathbb{R}^N \mid \forall y \in \mathbb{R}^N, (y-x)^\mathrm{T}u+f(x) \leq f(y) \}
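For intuition (an illustrative check, not from the original text): for f(y) = |y| the subdifferential at x = 0 is the interval [-1, 1], and any u in that interval satisfies the defining inequality on all of \mathbb{R}:

```python
import numpy as np

# Check the subgradient inequality (y - x)*u + f(x) <= f(y) for all y
# with f(y) = |y|, x = 0, and the candidate subgradient u = 0.5.
f = np.abs
x0, u = 0.0, 0.5                      # 0.5 lies in [-1, 1] = subdiff of |.| at 0
ys = np.linspace(-10.0, 10.0, 2001)   # grid of test points
ok = bool(np.all((ys - x0) * u + f(x0) <= f(ys) + 1e-12))
```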

Projection onto convex sets (POCS)

One of the widely used convex optimization algorithms is POCS (Projection Onto Convex Sets). This algorithm is employed to recover/synthesize a signal that simultaneously satisfies several convex constraints. Let f_i be the indicator function of the non-empty closed convex set C_i modeling a constraint. This reduces to the convex feasibility problem, which requires us to find a solution lying in the intersection of all the convex sets C_i. In the POCS method each set C_i is incorporated via its projection operator P_{C_i}, so in each iteration x is updated as


x_{k+1} = P_{C_1}  P_{C_2} \cdots P_{C_n} x_k
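As a hypothetical worked example (the sets and starting point below are chosen for illustration), the cyclic-projection update can be run for the intersection of the non-negative orthant with a hyperplane:

```python
import numpy as np

# Feasibility problem: find x in C1 ∩ C2, where
#   C1 = {x : x >= 0}            (non-negative orthant)
#   C2 = {x : a^T x = 1}         (hyperplane)
a = np.array([1.0, 2.0, 3.0])

def P1(x):
    # Projection onto the orthant: clip negative coordinates to zero.
    return np.maximum(x, 0.0)

def P2(x):
    # Projection onto the hyperplane: move along a to satisfy a^T x = 1.
    return x + (1.0 - a @ x) / (a @ a) * a

x = np.array([-1.0, 5.0, -2.0])   # infeasible starting point
for _ in range(100):
    x = P1(P2(x))                 # x_{k+1} = P_{C1} P_{C2} x_k
```

After the loop the iterate satisfies both constraints to within numerical precision.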

However, beyond such feasibility problems projection operators are not appropriate, and more general operators are required to tackle them. Among the various generalizations of the notion of a convex projection operator that exist, proximity operators are best suited for this purpose.

Definition

The proximity operator of the function f at x is defined as follows.

For every x \in \mathbb{R}^N , the minimization problem

\operatorname{minimize}_{y \in \mathbb{R}^N} \qquad f(y) + \frac{1}{2} \left\| x-y \right\|_2^2

admits a unique solution, which is denoted by \operatorname{prox}_f(x).


 \operatorname{prox}_f :\mathbb{R}^N \rightarrow \mathbb{R}^N
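As an illustrative check (not from the original text): for the absolute value f(y) = |y| the proximity operator has the well-known closed form \operatorname{prox}_f(x) = \operatorname{sign}(x)\max(|x|-1, 0) (soft thresholding), which can be verified against a direct grid minimization of the defining problem:

```python
import numpy as np

def prox_abs(x):
    # Closed-form proximity operator of f(y) = |y| (soft thresholding).
    return np.sign(x) * np.maximum(np.abs(x) - 1.0, 0.0)

# Compare with a brute-force minimization of f(y) + 0.5*(x - y)^2 on a grid.
x = 2.3
ys = np.linspace(-5.0, 5.0, 100001)           # grid spacing 1e-4
objective = np.abs(ys) + 0.5 * (x - ys) ** 2
y_star = ys[np.argmin(objective)]             # grid minimizer, ≈ prox_abs(x)
```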

The proximity operator of f is characterized by the inclusion


p=\operatorname{prox}_f(x) \Leftrightarrow x-p \in \partial f(p) \qquad (\forall(x,p) \in \mathbb{R}^N \times \mathbb{R}^N)

If f is differentiable, then the above equation reduces to


p=\operatorname{prox}_f(x) \Leftrightarrow x-p = \nabla f(p) \quad (\forall(x,p) \in \mathbb{R}^N \times \mathbb{R}^N)
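This can be checked numerically on a simple differentiable example (an assumed illustration, not from the original text): for f(y) = \frac{\lambda}{2}\|y\|_2^2 the gradient is \nabla f(y) = \lambda y and the proximity operator has the closed form x/(1+\lambda):

```python
import numpy as np

# Verify x - p = grad f(p) for f(y) = 0.5*lam*||y||_2^2, whose
# proximity operator is prox_f(x) = x / (1 + lam).
lam = 3.0
x = np.array([1.0, -2.0, 4.0])
p = x / (1.0 + lam)      # prox_f(x) for this quadratic f
residual = x - p         # should equal grad f(p) = lam * p
```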

Examples

Special instances of proximal gradient methods include the iterative shrinkage-thresholding algorithm, the projected Landweber method, the projected gradient method, alternating projections, the alternating-direction method of multipliers, and the alternating split Bregman method.


Notes

  1. Combettes, Patrick L.; Pesquet, Jean-Christophe (2011). "Proximal Splitting Methods in Signal Processing". In: Fixed-Point Algorithms for Inverse Problems in Science and Engineering. New York: Springer. pp. 185–212.
