Convergence Rate of Proximal-Gradient with a General Step-Size (2014)
| Field | Value |
|---|---|
| Content Provider | CiteSeerX |
| Author | Schmidt, Mark |
| Abstract | We extend the previous analysis of Schmidt et al. [2011] to derive the linear convergence rate obtained by the proximal-gradient method under a general step-size scheme, for the problem of optimizing the sum of a smooth strongly-convex function and a simple (but potentially non-smooth) convex function. 1 Overview and Assumptions: We consider minimization problems of the form min_{x ∈ R^d} f(x) := g(x) + h(x) (1.1), where g is a strongly-convex function with parameter µ, g′ is Lipschitz-continuous with parameter L, and h is only required to be a lower semi-continuous proper convex function. This class includes the elastic-net regularized least-squares problem min_{x ∈ R^d} (1/2)‖Ax − b‖² + (λ₂/2)‖x‖² + λ₁‖x‖₁, with g(x) = (1/2)‖Ax − b‖² + (λ₂/2)‖x‖² and h(x) = λ₁‖x‖₁. In this case, L = σ_max(AᵀA) + λ₂ and µ = σ_min(AᵀA) + λ₂. In this work we analyze the proximal-gradient algorithm, which uses iterations of the form x_{k+1} = prox[x_k − αg′(x_k)] (1.2), where α > 0 is the step-size and the proximal operator is prox(x) = argmin_{y ∈ R^d} (1/2)‖x − y‖² + αh(y) (1.3). Our prior results in Schmidt et al. [2011, Proposition 3] show that with a step-size of α = 1/L the iterates of this algorithm have a linear convergence rate, ‖x_k − x*‖ ≤ (1 − µ/L)^k ‖x₀ − x*‖. |
| Publisher Date | 2014-01-01 |
| Access Restriction | Open |
| Subject Keyword | Convergence Rate; General Step-size; Linear Convergence Rate; Convex Function; Prior Result; Prox Xk; Smooth Strongly-convex Function; Minimization Problem; Previous Analysis; Argmin Rd; Proximal-gradient Algorithm; Semi-continuous Proper Convex Function; General Step-size Scheme; Proximal-gradient Method; Proximal Operator; Strongly-convex Function; Form Min Rd |
| Content Type | Text |
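
For concreteness, below is a minimal NumPy sketch of the proximal-gradient iteration (1.2)–(1.3) described in the abstract, applied to the elastic-net example; the function names, random test data, iteration count, and regularization values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t*||.||_1 (elementwise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def proximal_gradient_elastic_net(A, b, lam1, lam2, alpha=None, iters=500):
    """Run x_{k+1} = prox[x_k - alpha*g'(x_k)] for the elastic-net example:
    g(x) = (1/2)||Ax - b||^2 + (lam2/2)||x||^2,  h(x) = lam1*||x||_1."""
    d = A.shape[1]
    AtA, Atb = A.T @ A, A.T @ b
    # Constants from the abstract: L = sigma_max(A^T A) + lam2, mu = sigma_min(A^T A) + lam2.
    eigs = np.linalg.eigvalsh(AtA)          # ascending eigenvalues of A^T A
    L, mu = eigs[-1] + lam2, eigs[0] + lam2
    if alpha is None:
        alpha = 1.0 / L                     # the step-size covered by Schmidt et al. [2011, Prop. 3]
    x = np.zeros(d)
    for _ in range(iters):
        grad = AtA @ x - Atb + lam2 * x     # g'(x)
        x = soft_threshold(x - alpha * grad, alpha * lam1)   # prox of alpha*h
    return x, L, mu

# Illustrative usage on random data; with alpha = 1/L the distance to the
# minimizer contracts by a factor of at least (1 - mu/L) per iteration.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
b = rng.standard_normal(50)
x_hat, L, mu = proximal_gradient_elastic_net(A, b, lam1=0.1, lam2=1.0)
print("contraction factor 1 - mu/L =", 1 - mu / L)
```

Soft-thresholding is the standard closed form of the proximal operator for the ℓ₁ term, which is what makes h "simple" in the sense used in the abstract.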