| 1 | +# Math |
| 2 | + |
| 3 | +> See the references in the README for details, but beware that we adopt slightly different notational conventions. |
| 4 | + |
| 5 | +## Primal and dual problems |
| 6 | + |
| 7 | +PDLP solves Linear Programs (LPs) formulated as follows: |
| 8 | + |
| 9 | +```math |
| 10 | +\min_x \quad c^\top x \quad \text{s.t.} \quad \begin{cases} |
| 11 | +\ell_c \leq A x \leq u_c \\ |
| 12 | +\ell_v \leq x \leq u_v |
| 13 | +\end{cases} |
| 14 | +``` |
| 15 | + |
| 16 | +We associate non-negative multipliers $y_\ell, y_u, z_\ell, z_u \geq 0$ with all four inequality constraints, leading to the following Lagrangian: |
| 17 | + |
| 18 | +```math |
| 19 | +\begin{align*} |
| 20 | +\mathcal{L}(x, y_\ell, y_u, z_\ell, z_u) |
| 21 | +& = c^\top x + y_\ell^\top (\ell_c - A x) + y_u^\top (A x - u_c) + z_\ell^\top (\ell_v - x) + z_u^\top (x - u_v) \\ |
| 22 | +& = (c - A^\top y_\ell + A^\top y_u - z_\ell + z_u)^\top x + (y_\ell^\top \ell_c - y_u^\top u_c) + (z_\ell^\top \ell_v - z_u^\top u_v) |
| 23 | +\end{align*} |
| 24 | +``` |
| 25 | + |
| 26 | +We interpret the non-negative multipliers $y_\ell, z_\ell$ and $y_u, z_u$ as the positive and negative parts of signed multipliers $y$ and $z$, associated with the constraints and the variable bounds respectively: |
| 27 | + |
| 28 | +```math |
| 29 | +y = y_\ell - y_u \quad \text{and} \quad z = z_\ell - z_u |
| 30 | +``` |
| 31 | + |
| 32 | +which amounts to |
| 33 | + |
| 34 | +```math |
| 35 | +\begin{align*} |
| 36 | +y_\ell & = y^+ & z_\ell & = z^+ \\ |
| 37 | +y_u & = y^- & z_u & = z^- |
| 38 | +\end{align*} |
| 39 | +``` |
| 40 | + |
| 41 | +Note that if any of the bounds is infinite, the corresponding non-negative multiplier is constrained to be zero. |
| 42 | +We summarize these elementwise constraints by writing $y \in \mathcal{Y}$ and $z \in \mathcal{Z}$. |
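Spelled out elementwise, and assuming the sign convention $y = y_\ell - y_u$ used above, the set $\mathcal{Y}$ is a product of the sets below (the index $i$ is introduced here for illustration; $\mathcal{Z}$ is obtained in the same way from $\ell_v$ and $u_v$):

```math
\mathcal{Y}_i = \begin{cases}
\mathbb{R} & \text{if } \ell_{c,i} > -\infty \text{ and } u_{c,i} < +\infty \\
\mathbb{R}_+ & \text{if } \ell_{c,i} > -\infty \text{ and } u_{c,i} = +\infty \\
\mathbb{R}_- & \text{if } \ell_{c,i} = -\infty \text{ and } u_{c,i} < +\infty \\
\{0\} & \text{if } \ell_{c,i} = -\infty \text{ and } u_{c,i} = +\infty
\end{cases}
```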
| 43 | + |
| 44 | +We also define the shorthand |
| 45 | + |
| 46 | +```math |
| 47 | +p(y; \ell, u) = \ell^\top y^+ - u^\top y^- |
| 48 | +``` |
| 49 | + |
| 50 | +which leaves us with |
| 51 | + |
| 52 | +```math |
| 53 | +\mathcal{L}(x, y, z) = (c - A^\top y - z)^\top x + p(y; \ell_c, u_c) + p(z; \ell_v, u_v) |
| 54 | +``` |
| 55 | + |
| 56 | +From there, we deduce the dual problem: |
| 57 | + |
| 58 | +```math |
| 59 | +\max_{y, z} \quad p(y; \ell_c, u_c) + p(z; \ell_v, u_v) \quad \text{s.t.} \quad \begin{cases} |
| 60 | +0 = c - A^\top y - z \\ |
| 61 | +y \in \mathcal{Y} \\ |
| 62 | +z \in \mathcal{Z} |
| 63 | +\end{cases} |
| 64 | +``` |
| 65 | + |
| 66 | +The primal-dual gap (one of our stopping criteria) is therefore |
| 67 | + |
| 68 | +```math |
| 69 | +g = c^\top x - \left(p(y; \ell_c, u_c) + p(z; \ell_v, u_v)\right) |
| 70 | +``` |
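As a rough illustration, the gap can be evaluated as in the sketch below. This is plain Python written for this note, not the solver's code: the function names `dual_term` and `primal_dual_gap` and the toy LP are assumptions made for the example. Zero multipliers are skipped so that an infinite bound never produces `0 * inf`.

```python
import math

def dual_term(y, lower, upper):
    """Evaluate p(y; l, u) = l^T y^+ - u^T y^-, skipping zero multipliers."""
    total = 0.0
    for yi, lo, up in zip(y, lower, upper):
        if yi > 0:
            total += lo * yi       # positive part pairs with the lower bound
        elif yi < 0:
            total -= up * (-yi)    # negative part pairs with the upper bound
    return total

def primal_dual_gap(c, x, y, z, l_c, u_c, l_v, u_v):
    """Gap g = c^T x - (p(y; l_c, u_c) + p(z; l_v, u_v))."""
    primal = sum(ci * xi for ci, xi in zip(c, x))
    dual = dual_term(y, l_c, u_c) + dual_term(z, l_v, u_v)
    return primal - dual

# Toy LP (hypothetical): min x1 + 2 x2  s.t.  1 <= x1 + x2,  x >= 0.
c = [1.0, 2.0]
l_c, u_c = [1.0], [math.inf]
l_v, u_v = [0.0, 0.0], [math.inf, math.inf]

# Candidate primal-dual pair satisfying c - A^T y - z = 0 with A = [[1, 1]].
x = [1.0, 0.0]
y = [1.0]
z = [0.0, 1.0]

gap = primal_dual_gap(c, x, y, z, l_c, u_c, l_v, u_v)
```

If a multiplier has the wrong sign for an infinite bound (that is, $y \notin \mathcal{Y}$), `dual_term` returns `-inf`, which matches the convention that such a point has an unbounded dual objective.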
| 71 | + |
| 72 | +## Preconditioning |
| 73 | + |
| 74 | +The original problem $P$ and preconditioned problem $\tilde{P}$ are linked by two diagonal matrices with positive entries, $D_1$ (row scaling) and $D_2$ (column scaling): |
| 75 | + |
| 76 | +- Constraint matrix $\tilde{A} = D_1 A D_2$ so $A = D_1^{-1} \tilde{A} D_2^{-1}$ |
| 77 | +- Transposed constraint matrix $\tilde{A}^\top = D_2 A^\top D_1$ so $A^\top = D_2^{-1} \tilde{A}^\top D_1^{-1}$ |
| 78 | +- Primal variable $\tilde{x} = D_2^{-1} x$ so $x = D_2 \tilde{x}$ |
| 79 | +- Dual variable for constraints $\tilde{y} = D_1^{-1} y$ so $y = D_1 \tilde{y}$, but $\tilde{\mathcal{Y}} = \mathcal{Y}$ |
| 80 | +- Dual variable for bounds $\tilde{z} = D_2 z$ so $z = D_2^{-1} \tilde{z}$, but $\tilde{\mathcal{Z}} = \mathcal{Z}$ |
| 81 | +- Cost $\tilde{c} = D_2 c$ so $c = D_2^{-1} \tilde{c}$ |
| 82 | +- Bounds $(\tilde{\ell}_v, \tilde{u}_v) = D_2^{-1} (\ell_v, u_v)$ so $(\ell_v, u_v) = D_2 (\tilde{\ell}_v, \tilde{u}_v)$ |
| 83 | +- Constraints $(\tilde{\ell}_c, \tilde{u}_c) = D_1 (\ell_c, u_c)$ so $(\ell_c, u_c) = D_1^{-1} (\tilde{\ell}_c, \tilde{u}_c)$ |
| 84 | + |
| 85 | +Then we have the following terms in the KKT errors: |
| 86 | + |
| 87 | +```math |
| 88 | +\begin{align*} |
| 89 | +c - A^\top y - z |
| 90 | +& = D_2^{-1} \tilde{c} - D_2^{-1} \tilde{A}^\top D_1^{-1} D_1 \tilde{y} - D_2^{-1} \tilde{z} \\ |
| 91 | +& = D_2^{-1}(\tilde{c} - \tilde{A}^\top \tilde{y} - \tilde{z}) |
| 92 | +\end{align*} |
| 93 | +``` |
| 94 | + |
| 95 | +```math |
| 96 | +\begin{align*} |
| 97 | +Ax - \mathrm{proj}_{[\ell_c,u_c]}(Ax) |
| 98 | +& = D_1^{-1} \tilde{A} D_2^{-1} D_2 \tilde{x} - \mathrm{proj}_{[D_1^{-1} \tilde{\ell}_c, D_1^{-1} \tilde{u}_c]} (D_1^{-1} \tilde{A} D_2^{-1} D_2 \tilde{x}) \\ |
| 99 | +& = D_1^{-1} \tilde{A} \tilde{x} - \mathrm{proj}_{[D_1^{-1} \tilde{\ell}_c, D_1^{-1} \tilde{u}_c]} (D_1^{-1} \tilde{A} \tilde{x}) \\ |
| 100 | +& = D_1^{-1} \left[\tilde{A} \tilde{x} - \mathrm{proj}_{[\tilde{\ell}_c, \tilde{u}_c]} (\tilde{A} \tilde{x})\right] |
| 101 | +\end{align*} |
| 102 | +``` |
| 103 | + |
| 104 | +```math |
| 105 | +z - \mathrm{proj}_{\mathcal{Z}}(z) = D_2^{-1} \tilde{z} - \mathrm{proj}_{\tilde{\mathcal{Z}}}(D_2^{-1} \tilde{z}) = D_2^{-1} (\tilde{z} - \mathrm{proj}_{\tilde{\mathcal{Z}}}(\tilde{z})) |
| 106 | +``` |
| 107 | + |
| 108 | +```math |
| 109 | +c^\top x = (D_2^{-1} \tilde{c})^\top (D_2 \tilde{x}) = \tilde{c}^\top D_2^{-1} D_2 \tilde{x} = \tilde{c}^\top \tilde{x} |
| 110 | +``` |
| 111 | + |
| 112 | +```math |
| 113 | +\begin{align*} |
| 114 | +p(y; \ell_c, u_c) |
| 115 | +& = \ell_c^\top y^+ - u_c^\top y^- \\ |
| 116 | +& = (D_1^{-1} \tilde{\ell}_c)^\top (D_1 \tilde{y})^+ - (D_1^{-1} \tilde{u}_c)^\top (D_1 \tilde{y})^- \\ |
| 117 | +& = \tilde{\ell}_c^\top D_1^{-1} D_1 \tilde{y}^+ - \tilde{u}_c^\top D_1^{-1} D_1 \tilde{y}^- \\ |
| 118 | +& = \tilde{\ell}_c^\top \tilde{y}^+ - \tilde{u}_c^\top \tilde{y}^- |
| 119 | +\end{align*} |
| 120 | +``` |
| 121 | + |
| 122 | +```math |
| 123 | +\begin{align*} |
| 124 | +p(z; \ell_v, u_v) |
| 125 | +& = \ell_v^\top z^+ - u_v^\top z^- \\ |
| 126 | +& = (D_2 \tilde{\ell}_v)^\top (D_2^{-1} \tilde{z})^+ - (D_2 \tilde{u}_v)^\top (D_2^{-1} \tilde{z})^- \\ |
| 127 | +& = \tilde{\ell}_v^\top D_2 D_2^{-1} \tilde{z}^+ - \tilde{u}_v^\top D_2 D_2^{-1} \tilde{z}^- \\ |
| 128 | +& = \tilde{\ell}_v^\top \tilde{z}^+ - \tilde{u}_v^\top \tilde{z}^- |
| 129 | +\end{align*} |
| 130 | +``` |
| 131 | + |
| 132 | +We make use of a few key observations: |
| 133 | + |
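| 134 | +Projection onto $\mathcal{Z}$ commutes with positive diagonal scaling, because each component of $\mathcal{Z}$ is a cone ($\{0\}$, a half-line, or $\mathbb{R}$) |
| 135 | +Projection onto an interval commutes with positive diagonal scaling if the same scaling is also applied to the endpoints of the interval |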
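Both observations can be checked numerically on a small example. The sketch below is plain Python with hypothetical helper names (`project_interval`, `project_dual`, `scale`) and made-up data. The diagonal entries are powers of two so that the floating-point comparisons are exact.

```python
import math

def project_interval(v, lo, up):
    """Elementwise projection of v onto the box [lo, up]."""
    return [min(max(vi, li), ui) for vi, li, ui in zip(v, lo, up)]

def project_dual(z, lo, up):
    """Elementwise projection onto Z: z_i > 0 needs a finite lower bound,
    z_i < 0 needs a finite upper bound; otherwise the entry is zeroed."""
    out = []
    for zi, li, ui in zip(z, lo, up):
        if (zi > 0 and li == -math.inf) or (zi < 0 and ui == math.inf):
            out.append(0.0)
        else:
            out.append(zi)
    return out

def scale(d, v):
    """Apply the diagonal scaling diag(d) to the vector v."""
    return [di * vi for di, vi in zip(d, v)]

d = [0.5, 4.0, 2.0]             # hypothetical positive diagonal
v = [-3.0, 1.0, 7.0]            # vector to project
lo = [-1.0, -math.inf, 0.0]     # lower bounds, one of them infinite
up = [2.0, 0.5, math.inf]       # upper bounds, one of them infinite

# Scaling then projecting onto the scaled interval equals projecting then scaling.
interval_commutes = (
    scale(d, project_interval(v, lo, up))
    == project_interval(scale(d, v), scale(d, lo), scale(d, up))
)

# The set Z only depends on which bounds are infinite, so it is unchanged by scaling.
cone_commutes = scale(d, project_dual(v, lo, up)) == project_dual(scale(d, v), lo, up)
```

Both flags evaluate to `True` on this data. With a negative diagonal entry the checks would fail in general, which is why positivity of the scaling matters.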