
Commit fca3a30

Apply suggestions from code review
Co-authored-by: Maxence Gollier <134112149+MaxenceGollier@users.noreply.github.com>
1 parent 5b9e859

2 files changed: 9 additions & 15 deletions

paper/paper.bib

Lines changed: 7 additions & 13 deletions
@@ -7,8 +7,6 @@ @Article{ aravkin-baraldi-orban-2022
 Number = 2,
 Pages = {900--929},
 doi = {10.1137/21M1409536},
-abstract = { We develop a trust-region method for minimizing the sum of a smooth term \(f\) and a nonsmooth term \(h\), both of which can be nonconvex. Each iteration of our method minimizes a possibly nonconvex model of \(f + h\) in a trust region. The model coincides with \(f + h\) in value and subdifferential at the center. We establish global convergence to a first-order stationary point when \(f\) satisfies a smoothness condition that holds, in particular, when it has a Lipschitz-continuous gradient, and \(h\) is proper and lower semicontinuous. The model of \(h\) is required to be proper, lower semi-continuous and prox-bounded. Under these weak assumptions, we establish a worst-case \(O(1/\epsilon^2)\) iteration complexity bound that matches the best known complexity bound of standard trust-region methods for smooth optimization. We detail a special instance, named TR-PG, in which we use a limited-memory quasi-Newton model of \(f\) and compute a step with the proximal gradient method,
-resulting in a practical proximal quasi-Newton method. We establish similar convergence properties and complexity bound for a quadratic regularization variant, named R2, and provide an interpretation as a proximal gradient method with adaptive step size for nonconvex problems. R2 may also be used to compute steps inside the trust-region method, resulting in an implementation named TR-R2. We describe our Julia implementations and report numerical results on inverse problems from sparse optimization and signal processing. Both TR-PG and TR-R2 exhibit promising performance and compare favorably with two linesearch proximal quasi-Newton methods based on convex models. },
 }

 @Article{ aravkin-baraldi-orban-2024,
@@ -20,10 +18,6 @@ @Article{ aravkin-baraldi-orban-2024
 Number = 4,
 Pages = {A2557--A2581},
 doi = {10.1137/22M1538971},
-preprint = {https://www.gerad.ca/en/papers/G-2022-58/view},
-grant = nserc,
-abstract = { Abstract. We develop a Levenberg–Marquardt method for minimizing the sum of a smooth nonlinear least-squares term \(f(x) = \frac{1}{2} \|F(x)\|_2^2\) and a nonsmooth term \(h\). Both \(f\) and \(h\) may be nonconvex. Steps are computed by minimizing the sum of a regularized linear least-squares model and a model of \(h\) using a first-order method such as the proximal gradient method. We establish global convergence to a first-order stationary point under the assumptions that \(F\) and its Jacobian are Lipschitz continuous and \(h\) is proper and lower semicontinuous. In the worst case, our method performs \(O(\epsilon^{-2})\) iterations to bring a measure of stationarity below \(\epsilon \in (0, 1)\). We also derive a trust-region variant that enjoys similar asymptotic worst-case iteration complexity as a special case of the trust-region algorithm of Aravkin, Baraldi, and Orban [SIAM J. Optim., 32 (2022), pp. 900–929]. We report numerical results on three
-examples: a group-lasso basis-pursuit denoise example, a nonlinear support vector machine, and parameter estimation in a neuroscience application. To implement those examples, we describe in detail how to evaluate proximal operators for separable \(h\) and for the group lasso with trust-region constraint. In all cases, the Levenberg–Marquardt methods perform fewer outer iterations than either a proximal gradient method with adaptive step length or a quasi-Newton trust-region method, neither of which exploit the least-squares structure of the problem. Our results also highlight the need for more sophisticated subproblem solvers than simple first-order methods. },
 }

 @Software{ leconte_linearoperators_jl_linear_operators_2023,
@@ -35,15 +29,15 @@ @Software{ leconte_linearoperators_jl_linear_operators_2023
 Year = 2023,
 }

-@TechReport{ leconte-orban-2023,
+@Article{ leconte-orban-2023,
 Author = {G. Leconte and D. Orban},
 Title = {The Indefinite Proximal Gradient Method},
-Institution = gerad,
-Year = 2023,
-Type = {Cahier},
-Number = {G-2023-37},
-Address = gerad-address,
-doi = {10.13140/RG.2.2.11836.41606},
+Journal = coap,
+Year = 2025,
+Volume = 91,
+Number = 2,
+Pages = {861--903},
+doi = {10.1007/s10589-024-00604-5},
 }

 @TechReport{ leconte-orban-2023-2,
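
The abstracts removed above all revolve around methods that take steps on f + h with the proximal gradient method (TR-PG, R2, and the Levenberg–Marquardt variants). As background, here is what one such step looks like in Julia. This is a textbook sketch with an illustrative choice of f and h, not code from any of these packages; R2-type methods additionally adapt the step size at each iteration.

```julia
using ProximalOperators

# Illustrative smooth term f(x) = ½‖x − 1‖² with gradient ∇f(x) = x − 1.
f(x) = 0.5 * sum(abs2, x .- 1)
∇f(x) = x .- 1
h = NormL1(0.1)                  # nonsmooth term: 0.1‖x‖₁

x = zeros(3)
ν = 0.5                          # step size (fixed here for simplicity)
# One proximal gradient step: x ← prox_{νh}(x − ν∇f(x)).
x, _ = prox(h, x .- ν .* ∇f(x), ν)
obj = f(x) + h(x)                # objective value after the step
```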

paper/paper.md

Lines changed: 2 additions & 2 deletions
@@ -55,7 +55,7 @@ Moreover, they can handle cases where Hessian approximations are unbounded[@diou

 There exists a way to solve \eqref{eq:nlp} in Julia via [ProximalAlgorithms.jl](https://github.com/JuliaFirstOrder/ProximalAlgorithms.jl).
 It implements several proximal algorithms for nonsmooth optimization.
-However, the available examples only consider convex instances of $h$, nmaely the $\ell_1$ norm and there are no tests for memory allocations.
+However, the available examples only consider convex instances of $h$, namely the $\ell_1$ norm, and there are no tests for memory allocations.
 Moreover, it implements only one quasi-Newton method (L-BFGS) and does not support Hessian approximations via linear operators.
 In contrast, **RegularizedOptimization.jl** leverages [LinearOperators.jl](https://github.com/JuliaSmoothOptimizers/LinearOperators.jl)[@leconte_linearoperators_jl_linear_operators_2023] to represent a variety of Hessian approximations, such as L-SR1, L-BFGS, and diagonal approximations.

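The contrast drawn in the hunk above, Hessian approximations represented as linear operators, is easy to demonstrate. A minimal sketch of the LinearOperators.jl API (dimensions and update pairs are illustrative; this code is not part of the commit):

```julia
using LinearOperators

n = 100
B = LBFGSOperator(n; mem = 5)  # limited-memory BFGS approximation as a linear operator

# Update with a correction pair (s, y); the pair is used only when dot(s, y) > 0.
s = ones(n)
y = 2 .* ones(n)
push!(B, s, y)

v = randn(n)
Bv = B * v                     # Hessian–vector product; no dense n×n matrix is formed
```

L-SR1 approximations (`LSR1Operator`) and the diagonal approximations mentioned in the hunk are built and applied the same way.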
@@ -77,7 +77,7 @@ The design of the package is motivated by recent advances in the complexity anal
 - **Model Hessians (quasi-Newton, diagonal approximations)** via [LinearOperators.jl](https://github.com/JuliaSmoothOptimizers/LinearOperators.jl), which represents Hessians as linear operators and implements efficient Hessian–vector products.
 - **Definition of $h$** via [ProximalOperators.jl](https://github.com/JuliaSmoothOptimizers/ProximalOperators.jl), which offers a large collection of nonsmooth terms $h$, and [ShiftedProximalOperators.jl](https://github.com/JuliaSmoothOptimizers/ShiftedProximalOperators.jl), which provides shifted proximal mappings.

-This modularity makes it easy to prototype, benchmark, and extend regularization-based methods [@diouane-habiboullah-orban-2024],[@aravkin-baraldi-orban-2022],[@aravkin-baraldi-orban-2024],[@leconte-orban-2023-2] and [@diouane-gollier-orban-2024].
+This modularity makes it easy to prototype, benchmark, and extend regularization-based methods [@diouane-habiboullah-orban-2024],[@aravkin-baraldi-orban-2022],[@aravkin-baraldi-orban-2024] and [@leconte-orban-2023-2].

 ## Support for inexact subproblem solves

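For readers unfamiliar with the packages cited in the second bullet of this hunk, here is a minimal sketch of evaluating a proximal mapping with ProximalOperators.jl (values are illustrative, not from the commit); ShiftedProximalOperators.jl layers the shifted variants used by the solvers on top of the same idea:

```julia
using ProximalOperators

h = NormL1(1.0)           # h(x) = ‖x‖₁
x = [0.3, -1.5, 0.02]
γ = 0.1                   # proximal step size

# y minimizes h(u) + ‖u − x‖²/(2γ); for the ℓ₁ norm this is soft-thresholding.
y, hy = prox(h, x, γ)     # returns the proximal point y and the value h(y)
```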