Skip to content

Commit 9c8296e

Browse files
klamikegdalle
andauthored
fix: formalize notations and change signs in dual variables (#21)
* update dual objective to use `lᵀ|y|⁺ - uᵀ|y|⁻` convention * [skip ci] typo * store `c - Aᵀy` in `scratch.x` * rename `scratch.r` to `scratch.z` * finish rename * Add mathematical formulation * Update src/components/errors.jl * Rename variables * Fix mul --------- Co-authored-by: Guillaume Dalle <22795598+gdalle@users.noreply.github.com>
1 parent 84d05cf commit 9c8296e

9 files changed

Lines changed: 163 additions & 87 deletions

File tree

README.md

Lines changed: 6 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -67,9 +67,12 @@ That's what makes it so cool.
6767

6868
## References
6969

70-
> [PDLP: A Practical First-Order Method for Large-Scale Linear Programming](https://arxiv.org/abs/2501.07018), Applegate et al. (2025)
71-
72-
> [An Overview of GPU-based First-Order Methods for Linear Programming and Extensions](https://arxiv.org/abs/2506.02174v1), Lu & Yang (2025)
70+
- [Practical Large-Scale Linear Programming using Primal-Dual Hybrid Gradient](https://arxiv.org/abs/2106.04756), Applegate et al. (2022)
71+
- [cuPDLP.jl: A GPU Implementation of Restarted Primal-Dual Hybrid Gradient for Linear Programming in Julia](https://arxiv.org/abs/2311.12180), Lu et al. (2024)
72+
- [cuPDLP-C: A Strengthened Implementation of cuPDLP for Linear Programming by C language](https://arxiv.org/abs/2312.14832), Lu et al. (2024)
73+
- [cuPDLPx: A Further Enhanced GPU-Based First-Order Solver for Linear Programming](https://arxiv.org/abs/2507.14051), Lu et al. (2025)
74+
- [PDLP: A Practical First-Order Method for Large-Scale Linear Programming](https://arxiv.org/abs/2501.07018), Applegate et al. (2025)
75+
- [An Overview of GPU-based First-Order Methods for Linear Programming and Extensions](https://arxiv.org/abs/2506.02174v1), Lu & Yang (2025)
7376

7477
## Roadmap
7578

docs/make.jl

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -22,8 +22,8 @@ makedocs(;
2222
"tutorial.md",
2323
"api.md",
2424
"Dev docs" => [
25+
"math.md",
2526
"internals.md",
26-
"preconditioning.md",
2727
],
2828
],
2929
)

docs/src/math.md

Lines changed: 135 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,135 @@
1+
# Math
2+
3+
> See the references in the README for details, but beware that we adopt slightly different notational conventions.
4+
5+
## Primal and dual problems
6+
7+
PDLP solves Linear Programs (LPs) formulated as follows:
8+
9+
```math
10+
\min_x \quad c^\top x \quad \text{s.t.} \quad \begin{cases}
11+
\ell_c \leq A x \leq u_c \\
12+
\ell_v \leq x \leq u_v
13+
\end{cases}
14+
```
15+
16+
We associate non-negative multipliers $y_\ell, y_u, z_\ell, z_u \geq 0$ with all four inequality constraints, leading to the following Lagrangian:
17+
18+
```math
19+
\begin{align*}
20+
\mathcal{L}(x, y_\ell, y_u, z_\ell, z_u)
21+
& = c^\top x + y_\ell^\top (\ell_c - A x) + y_u^\top (A x - u_c) + z_\ell^\top (\ell_v - x) + z_u^\top (x - u_v) \\
22+
& = (c - A^\top y_\ell + A^\top y_u - z_\ell + z_u)^\top x + (y_\ell^\top \ell_c - y_u^\top u_c) + (z_\ell^\top \ell_v - z_u^\top u_v)
23+
\end{align*}
24+
```
25+
26+
We interpret signed multipliers $y_\ell, z_\ell$ and $y_u, z_u$ as the positive and negative parts of unsigned multipliers $y$ and $z$, associated with the constraints and the variable bounds respectively:
27+
28+
```math
29+
y = y_\ell - y_u \quad \text{and} \quad z = z_\ell - z_u
30+
```
31+
32+
which amounts to
33+
34+
```math
35+
\begin{align*}
36+
y_\ell & = y^+ & z_\ell & = z^+ \\
37+
y_u & = y^- & z_u & = z^-
38+
\end{align*}
39+
```
40+
41+
Note that if any of the bounds is infinite, the corresponding signed multiplier is constrained to be zero.
42+
We sum up these elementwise constraints by writing $y \in \mathcal{Y}$ and $z \in \mathcal{Z}$.
43+
44+
We also define the shortcut
45+
46+
```math
47+
p(y, \ell, u) = \ell^\top y^+ - u^\top y^-
48+
```
49+
50+
which leaves us with
51+
52+
```math
53+
\mathcal{L}(x, y, z) = (c - A^\top y - z)^\top x + p(y; \ell_c, u_c) + p(z; \ell_v, u_v)
54+
```
55+
56+
From there, we deduce the dual problem:
57+
58+
```math
59+
\max_{y, z} \quad p(y; \ell_c, u_c) + p(z; \ell_v, u_v) \quad \text{s.t.} \quad \begin{cases}
60+
0 = c - A^\top y - z \\
61+
y \in \mathcal{Y} \\
62+
z \in \mathcal{Z}
63+
\end{cases}
64+
```
65+
66+
The primal-dual gap (one of our stopping criteria) thus writes as
67+
68+
```math
69+
g = c^\top x - \left(p(y; \ell_c, u_c) + p(z; \ell_v, u_v)\right)
70+
```
71+
72+
## Preconditioning
73+
74+
The original problem $P$ and preconditioned problem $\tilde{P}$ are linked by:
75+
76+
- Constraint matrix $\tilde{A} = D_1 A D_2$ so $A = D_1^{-1} \tilde{A} D_2^{-1}$
77+
- Transposed constraint matrix $\tilde{A}^\top = D_2 A^\top D_1$ so $A^\top = D_2^{-1} \tilde{A}^\top D_1^{-1}$
78+
- Primal variable $\tilde{x} = D_2^{-1} x$ so $x = D_2 \tilde{x}$
79+
- Dual variable for constraints $\tilde{y} = D_1^{-1} y$ so $y = D_1 \tilde{y}$, but $\tilde{\mathcal{Y}} = \mathcal{Y}$
80+
- Dual variable for bounds $\tilde{z} = D_2 z$ so $z = D_2^{-1} \tilde{z}$, but $\tilde{\mathcal{Z}} = \mathcal{Z}$
81+
- Cost $\tilde{c} = D_2 c$ so $c = D_2^{-1} \tilde{c}$
82+
- Bounds $(\tilde{\ell}_v, \tilde{u}_v) = D_2^{-1} (\ell_v, u_v)$ so $(\ell_v, u_v) = D_2 (\tilde{\ell}_v, \tilde{u}_v)$
83+
- Constraints $(\tilde{\ell}_c, \tilde{u}_c) = D_1 (\ell_c, u_c)$ so $(\ell_c, u_c) = D_1^{-1} (\tilde{\ell}_c, \tilde{u}_c)$
84+
85+
Then we have the following terms in the KKT errors:
86+
87+
```math
88+
\begin{align*}
89+
c - A^\top y - z
90+
& = D_2^{-1} \tilde{c} - D_2^{-1} \tilde{A}^\top D_1^{-1} D_1 \tilde{y} - D_2^{-1} \tilde{z} \\
91+
& = D_2^{-1}(\tilde{c} - \tilde{A}^\top \tilde{y} - \tilde{z})
92+
\end{align*}
93+
```
94+
95+
```math
96+
\begin{align*}
97+
Ax - \mathrm{proj}_{[\ell_c,u_c]}(Ax)
98+
& = D_1^{-1} \tilde{A} D_2^{-1} D_2 \tilde{x} - \mathrm{proj}_{[D_1^{-1} \tilde{\ell}_c, D_1^{-1} \tilde{u}_c]} (D_1^{-1} \tilde{A} D_2^{-1} D_2 \tilde{x}) \\
99+
& = D_1^{-1} \tilde{A} \tilde{x} - \mathrm{proj}_{[D_1^{-1} \tilde{\ell}_c, D_1^{-1} \tilde{u}_c]} (D_1^{-1} \tilde{A} \tilde{x}) \\
100+
& = D_1^{-1} \left[\tilde{A} \tilde{x} - \mathrm{proj}_{[\tilde{\ell}_c, \tilde{u}_c]} (\tilde{A} \tilde{x})\right] \\
101+
\end{align*}
102+
```
103+
104+
```math
105+
z - \mathrm{proj}_{\mathcal{Z}}(z) = D_2^{-1} \tilde{z} - \mathrm{proj}_{\tilde{\mathcal{Z}}}(D_2^{-1} \tilde{z}) = D_2^{-1} (\tilde{z} - \mathrm{proj}_{\tilde{\mathcal{Z}}}(\tilde{z}))
106+
```
107+
108+
```math
109+
c^\top x = (D_2^{-1} \tilde{c})^\top (D_2 \tilde{x}) = \tilde{c}^\top D_2^{-1} D_2 \tilde{x} = \tilde{c}^\top \tilde{x}
110+
```
111+
112+
```math
113+
\begin{align*}
114+
p(y; \ell_c, u_c)
115+
& = \ell_c^\top y^+ - u_c^\top y^- \\
116+
& = (D_1^{-1} \tilde{\ell}_c)^\top (D_1 \tilde{y})^+ - (D_1^{-1} \tilde{u}_c)^\top (D_1 \tilde{y})^- \\
117+
& = \tilde{\ell}_c^\top D_1^{-1} D_1 \tilde{y}^+ - \tilde{u}_c^\top D_1^{-1} D_1 \tilde{y}^- \\
118+
& = \tilde{\ell}_c^\top \tilde{y}^+ - \tilde{u}_c \tilde{y}^-
119+
\end{align*}
120+
```
121+
122+
```math
123+
\begin{align*}
124+
p(z; \ell_v, u_v)
125+
& = \ell_v^\top z^+ - u_v^\top z^- \\
126+
& = (D_2 \tilde{\ell}_v)^\top (D_2^{-1} \tilde{z})^+ - (D_2 \tilde{u}_v)^\top (D_2^{-1} \tilde{z})^- \\
127+
& = \tilde{\ell}_v^\top D_2 D_2^{-1} \tilde{z}^+ - \tilde{u}_v^\top D_2 D_2^{-1} \tilde{z}^- \\
128+
& = \tilde{\ell}_v^\top \tilde{z}^+ - \tilde{u}_v^\top \tilde{z}^-
129+
\end{align*}
130+
```
131+
132+
We make use of a few key observations:
133+
134+
- Projection on $\mathcal{Z}$ commutes with scaling
135+
- Projection on an interval commutes with scaling if scaling is also applied to the interval in question

docs/src/preconditioning.md

Lines changed: 0 additions & 69 deletions
This file was deleted.

src/algorithms/pdhg.jl

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -47,7 +47,7 @@ function initialize(
4747
η = fixed_stepsize(milp, algo.step_size)
4848
ω = one(η)
4949
step_sizes = StepSizes(; η, ω)
50-
scratch = Scratch(; x = similar(sol.x), y = similar(sol.y), r = similar(sol.x))
50+
scratch = Scratch(sol)
5151
stats = ConvergenceStats(T; starting_time)
5252
state = PDHGState(; sol, sol_last, step_sizes, scratch, stats)
5353
return state

src/algorithms/pdlp.jl

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -55,7 +55,7 @@ function initialize(
5555
η = fixed_stepsize(milp, algo.step_size)
5656
ω = primal_weight_init(milp, algo.step_size)
5757
step_sizes = StepSizes(; η, ω)
58-
scratch = Scratch(; x = similar(sol.x), y = similar(sol.y), r = similar(sol.x))
58+
scratch = Scratch(sol)
5959
iteration = IterationCounter(0, 0, 0)
6060
restart_stats = RestartStats(T)
6161
stats = ConvergenceStats(T; starting_time)

src/components/errors.jl

Lines changed: 15 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -71,31 +71,38 @@ function kkt_errors!(
7171
(; c, lv, uv, A, At, lc, uc, D1, D2) = milp
7272

7373
A_x = mul!(scratch.y, A, x)
74-
At_y = mul!(scratch.x, At, y)
75-
r = @. scratch.r = proj_multiplier(c - At_y, lv, uv)
74+
c_At_y = mul!(scratch.x, At, y, -one(T), zero(T))
75+
c_At_y .+= c
76+
z = @. scratch.z = proj_multiplier(c_At_y, lv, uv)
7677

7778
primal_diff = @. scratch.y = inv(D1.diag) * (A_x - clamp(A_x, lc, uc))
7879
primal = norm(primal_diff)
80+
7981
rescaled_combined_bounds = @. scratch.y = inv(D1.diag) * combine(lc, uc)
8082
primal_scale = one(T) + norm(rescaled_combined_bounds)
8183

82-
dual_diff = @. scratch.x = inv(D2.diag) * (c - At_y - r)
84+
dual_diff = @. scratch.x = inv(D2.diag) * (c_At_y - z)
8385
dual = norm(dual_diff)
86+
8487
rescaled_obj = @. scratch.x = inv(D2.diag) * c
8588
dual_scale = one(T) + norm(rescaled_obj)
8689

90+
# dual objective: lᵀ|y|⁺ - uᵀ|y|⁻ + lᵥᵀ|z|⁺ - uᵥᵀ|z|⁻
91+
# We reformulate to ∑ⱼ (l⋅|y|⁺ - u⋅|y|⁻)ⱼ + ∑ᵢ (lᵥ⋅|z|⁺ - uᵥ⋅|z|⁻)ᵢ
92+
# where pc = (l⋅|y|⁺ - u⋅|y|⁻) and pv = (lᵥ⋅|z|⁺ - uᵥ⋅|z|⁻)
8793
pc = @. scratch.y = (
88-
safeprod_left(uc, positive_part(-y)) - safeprod_left(lc, negative_part(-y))
94+
safeprod_left(lc, positive_part(y)) - safeprod_left(uc, negative_part(y))
8995
)
90-
pv = @. scratch.r = (
91-
safeprod_left(uv, positive_part(-r)) - safeprod_left(lv, negative_part(-r))
96+
pv = @. scratch.z = (
97+
safeprod_left(lv, positive_part(z)) - safeprod_left(uv, negative_part(z))
9298
)
9399
pc_sum = sum(pc)
94100
pv_sum = sum(pv)
95101
cx = dot(c, x)
102+
dobj = pc_sum + pv_sum
96103

97-
gap = abs(cx + pc_sum + pv_sum)
98-
gap_scale = one(T) + abs(pc_sum + pv_sum) + abs(cx)
104+
gap = abs(cx - dobj)
105+
gap_scale = one(T) + abs(dobj) + abs(cx)
99106

100107
err = KKTErrors(;
101108
primal,

src/components/scratch.jl

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@
44
"dual scratch (length `ncons`)"
55
y::V
66
"dual scratch (length `nvar`)"
7-
r::V
7+
z::V
88
end
99

1010
Scratch(sol::PrimalDualSolution) = Scratch(similar(sol.x), similar(sol.y), similar(sol.x))

test/components/errors.jl

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ function p(y, l, u)
77
y⁻ = CoolPDLP.negative_part.(y)
88
u_noinf = CoolPDLP.safe.(u)
99
l_noinf = CoolPDLP.safe.(l)
10-
return dot(y⁺, u_noinf) - dot(y⁻, l_noinf)
10+
return dot(y⁺, l_noinf) - dot(y⁻, u_noinf)
1111
end
1212

1313
milp, sol = CoolPDLP.random_milp_and_sol(100, 200, 0.4)
@@ -27,10 +27,10 @@ err_p = CoolPDLP.kkt_errors!(scratch, sol_p, milp_p)
2727
@testset "Correct KKT errors" begin
2828
@test err.primal norm(A * x - CoolPDLP.clamp.(A * x, lc, uc))
2929
@test err.dual norm(c - At * y - r)
30-
@test err.gap abs(dot(c, x) + p(-y, lc, uc) + p(-r, lv, uv))
30+
@test err.gap abs(dot(c, x) - (p(y, lc, uc) + p(r, lv, uv)))
3131
@test err.primal_scale 1 + norm(CoolPDLP.combine.(lc, uc))
3232
@test err.dual_scale 1 + norm(c)
33-
@test err.gap_scale 1 + abs(dot(c, x)) + abs(p(-y, lc, uc) + p(-r, lv, uv))
33+
@test err.gap_scale 1 + abs(dot(c, x)) + abs(p(y, lc, uc) + p(r, lv, uv))
3434
end
3535

3636
@testset "Invariance by preconditioning" begin

0 commit comments

Comments
 (0)