
Commit fca3a30

Apply suggestions from code review
Co-authored-by: Maxence Gollier <134112149+MaxenceGollier@users.noreply.github.com>
1 parent 5b9e859

2 files changed: 9 additions & 15 deletions

paper/paper.bib

Lines changed: 7 additions & 13 deletions
@@ -7,8 +7,6 @@ @Article{ aravkin-baraldi-orban-2022
 Number = 2,
 Pages = {900--929},
 doi = {10.1137/21M1409536},
-abstract = { We develop a trust-region method for minimizing the sum of a smooth term \(f\) and a nonsmooth term \(h\), both of which can be nonconvex. Each iteration of our method minimizes a possibly nonconvex model of \(f + h\) in a trust region. The model coincides with \(f + h\) in value and subdifferential at the center. We establish global convergence to a first-order stationary point when \(f\) satisfies a smoothness condition that holds, in particular, when it has a Lipschitz-continuous gradient, and \(h\) is proper and lower semicontinuous. The model of \(h\) is required to be proper, lower semi-continuous and prox-bounded. Under these weak assumptions, we establish a worst-case \(O(1/\epsilon^2)\) iteration complexity bound that matches the best known complexity bound of standard trust-region methods for smooth optimization. We detail a special instance, named TR-PG, in which we use a limited-memory quasi-Newton model of \(f\) and compute a step with the proximal gradient method,
-resulting in a practical proximal quasi-Newton method. We establish similar convergence properties and complexity bound for a quadratic regularization variant, named R2, and provide an interpretation as a proximal gradient method with adaptive step size for nonconvex problems. R2 may also be used to compute steps inside the trust-region method, resulting in an implementation named TR-R2. We describe our Julia implementations and report numerical results on inverse problems from sparse optimization and signal processing. Both TR-PG and TR-R2 exhibit promising performance and compare favorably with two linesearch proximal quasi-Newton methods based on convex models. },
 }

 @Article{ aravkin-baraldi-orban-2024,
@@ -20,10 +18,6 @@ @Article{ aravkin-baraldi-orban-2024
 Number = 4,
 Pages = {A2557--A2581},
 doi = {10.1137/22M1538971},
-preprint = {https://www.gerad.ca/en/papers/G-2022-58/view},
-grant = nserc,
-abstract = { Abstract. We develop a Levenberg–Marquardt method for minimizing the sum of a smooth nonlinear least-squares term \(f(x) = \frac{1}{2} \|F(x)\|_2^2\) and a nonsmooth term \(h\). Both \(f\) and \(h\) may be nonconvex. Steps are computed by minimizing the sum of a regularized linear least-squares model and a model of \(h\) using a first-order method such as the proximal gradient method. We establish global convergence to a first-order stationary point under the assumptions that \(F\) and its Jacobian are Lipschitz continuous and \(h\) is proper and lower semicontinuous. In the worst case, our method performs \(O(\epsilon^{-2})\) iterations to bring a measure of stationarity below \(\epsilon \in (0, 1)\). We also derive a trust-region variant that enjoys similar asymptotic worst-case iteration complexity as a special case of the trust-region algorithm of Aravkin, Baraldi, and Orban [SIAM J. Optim., 32 (2022), pp. 900–929]. We report numerical results on three
-examples: a group-lasso basis-pursuit denoise example, a nonlinear support vector machine, and parameter estimation in a neuroscience application. To implement those examples, we describe in detail how to evaluate proximal operators for separable \(h\) and for the group lasso with trust-region constraint. In all cases, the Levenberg–Marquardt methods perform fewer outer iterations than either a proximal gradient method with adaptive step length or a quasi-Newton trust-region method, neither of which exploit the least-squares structure of the problem. Our results also highlight the need for more sophisticated subproblem solvers than simple first-order methods. },
 }

 @Software{ leconte_linearoperators_jl_linear_operators_2023,
@@ -35,15 +29,15 @@ @Software{ leconte_linearoperators_jl_linear_operators_2023
 Year = 2023,
 }

-@TechReport{ leconte-orban-2023,
+@Article{ leconte-orban-2023,
 Author = {G. Leconte and D. Orban},
 Title = {The Indefinite Proximal Gradient Method},
-Institution = gerad,
-Year = 2023,
-Type = {Cahier},
-Number = {G-2023-37},
-Address = gerad-address,
-doi = {10.13140/RG.2.2.11836.41606},
+Journal = coap,
+Year = 2025,
+Volume = 91,
+Number = 2,
+Pages = {861--903},
+doi = {10.1007/s10589-024-00604-5},
 }

 @TechReport{ leconte-orban-2023-2,
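
The abstracts removed above all revolve around methods that take steps on f + h with the proximal gradient method (TR-PG, R2, and the Levenberg–Marquardt variants). As background, here is what one such step looks like in Julia. This is a textbook sketch with an illustrative choice of f and h, not code from any of these packages; R2-type methods additionally adapt the step size at each iteration.

```julia
using ProximalOperators

# Illustrative smooth term f(x) = ½‖x − 1‖² with gradient ∇f(x) = x − 1.
f(x) = 0.5 * sum(abs2, x .- 1)
∇f(x) = x .- 1
h = NormL1(0.1)                  # nonsmooth term: 0.1‖x‖₁

x = zeros(3)
ν = 0.5                          # step size (fixed here for simplicity)
# One proximal gradient step: x ← prox_{νh}(x − ν∇f(x)).
x, _ = prox(h, x .- ν .* ∇f(x), ν)
obj = f(x) + h(x)                # objective value after the step
```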

paper/paper.md

Lines changed: 2 additions & 2 deletions
@@ -55,7 +55,7 @@ Moreover, they can handle cases where Hessian approximations are unbounded[@diou

 There exists a way to solve \eqref{eq:nlp} in Julia via [ProximalAlgorithms.jl](https://github.com/JuliaFirstOrder/ProximalAlgorithms.jl).
 It implements several proximal algorithms for nonsmooth optimization.
-However, the available examples only consider convex instances of $h$, nmaely the $\ell_1$ norm and there are no tests for memory allocations.
+However, the available examples only consider convex instances of $h$, namely the $\ell_1$ norm, and there are no tests for memory allocations.
 Moreover, it implements only one quasi-Newton method (L-BFGS) and does not support Hessian approximations via linear operators.
 In contrast, **RegularizedOptimization.jl** leverages [LinearOperators.jl](https://github.com/JuliaSmoothOptimizers/LinearOperators.jl)[@leconte_linearoperators_jl_linear_operators_2023] to represent a variety of Hessian approximations, such as L-SR1, L-BFGS, and diagonal approximations.

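The contrast drawn in the hunk above, Hessian approximations represented as linear operators, is easy to demonstrate. A minimal sketch of the LinearOperators.jl API (dimensions and update pairs are illustrative; this code is not part of the commit):

```julia
using LinearOperators

n = 100
B = LBFGSOperator(n; mem = 5)  # limited-memory BFGS approximation as a linear operator

# Update with a correction pair (s, y); the pair is used only when dot(s, y) > 0.
s = ones(n)
y = 2 .* ones(n)
push!(B, s, y)

v = randn(n)
Bv = B * v                     # Hessian–vector product; no dense n×n matrix is formed
```

L-SR1 approximations (`LSR1Operator`) and the diagonal approximations mentioned in the hunk are built and applied the same way.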
@@ -77,7 +77,7 @@ The design of the package is motivated by recent advances in the complexity anal
 - **Model Hessians (quasi-Newton, diagonal approximations)** via [LinearOperators.jl](https://github.com/JuliaSmoothOptimizers/LinearOperators.jl), which represents Hessians as linear operators and implements efficient Hessian–vector products.
 - **Definition of $h$** via [ProximalOperators.jl](https://github.com/JuliaSmoothOptimizers/ProximalOperators.jl), which offers a large collection of nonsmooth terms $h$, and [ShiftedProximalOperators.jl](https://github.com/JuliaSmoothOptimizers/ShiftedProximalOperators.jl), which provides shifted proximal mappings.

-This modularity makes it easy to prototype, benchmark, and extend regularization-based methods [@diouane-habiboullah-orban-2024],[@aravkin-baraldi-orban-2022],[@aravkin-baraldi-orban-2024],[@leconte-orban-2023-2] and [@diouane-gollier-orban-2024].
+This modularity makes it easy to prototype, benchmark, and extend regularization-based methods [@diouane-habiboullah-orban-2024],[@aravkin-baraldi-orban-2022],[@aravkin-baraldi-orban-2024] and [@leconte-orban-2023-2].

 ## Support for inexact subproblem solves

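For readers unfamiliar with the packages cited in the second bullet of this hunk, here is a minimal sketch of evaluating a proximal mapping with ProximalOperators.jl (values are illustrative, not from the commit); ShiftedProximalOperators.jl layers the shifted variants used by the solvers on top of the same idea:

```julia
using ProximalOperators

h = NormL1(1.0)           # h(x) = ‖x‖₁
x = [0.3, -1.5, 0.02]
γ = 0.1                   # proximal step size

# y minimizes h(u) + ‖u − x‖²/(2γ); for the ℓ₁ norm this is soft-thresholding.
y, hy = prox(h, x, γ)     # returns the proximal point y and the value h(y)
```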