SciMLBase.jl/src/problems/optimization_problems.jl at 6f7e8c63f379bec94de147e6ec78dff5dbccead0 · SciML/SciMLBase.jl · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
@enum ObjSense MinSense MaxSense

@doc doc"""

Defines an optimization problem.
Documentation Page: [https://docs.sciml.ai/Optimization/stable/API/optimization_problem/](https://docs.sciml.ai/Optimization/stable/API/optimization_problem/)

## Mathematical Specification of an Optimization Problem

To define an optimization problem, you need the objective function ``f``
which is minimized over the domain of ``u``, the collection of optimization variables:

```math
min_u f(u,p)
```

``u₀`` is an initial guess for the minimizer. `f` should be specified as `f(u,p)`
and `u₀` should be an `AbstractArray` whose geometry matches the
desired geometry of `u`. Note that we are not limited to vectors
for `u₀`; one is allowed to provide `u₀` as arbitrary matrices /
higher-dimension tensors as well.

## Problem Type

### Constructors

```julia
OptimizationProblem{iip}(f, u0, p = SciMLBase.NullParameters(),;
                        lb = nothing,
                        ub = nothing,
                        lcons = nothing,
                        ucons = nothing,
                        sense = nothing,
                        kwargs...)
```

`isinplace` optionally sets whether the function is in-place or not.
This is determined automatically, but not inferred. Note that for OptimizationProblem,
in-place refers to the objective's derivative functions, the constraint function
and its derivatives. `OptimizationProblem` currently only supports in-place.

Parameters `p` are optional, and if not given, then a `NullParameters()` singleton
will be used, which will throw nice errors if you try to index non-existent
parameters.

`lb` and `ub` are the upper and lower bounds for box constraints on the
optimization variables. They should be an `AbstractArray` matching the geometry of `u`,
where `(lb[i],ub[i])` is the box constraint (lower and upper bounds) for `u[i]`.

`lcons` and `ucons` are the upper and lower bounds in case of inequality constraints on the
optimization and if they are set to be equal then it represents an equality constraint.
They should be an `AbstractArray`, where `(lcons[i],ucons[i])`
are the lower and upper bounds for `cons[i]`.

The `f` in the `OptimizationProblem` should typically be an instance of [`OptimizationFunction`](https://docs.sciml.ai/Optimization/stable/API/optimization_function/#optfunction)
to specify the objective function and its derivatives either by passing
predefined functions for them or automatically generated using the [ADType](https://github.com/SciML/ADTypes.jl).

If `f` is a standard Julia function, it is automatically transformed into an
`OptimizationFunction` with `NoAD()`, meaning the derivative functions are not
automatically generated.

Any extra keyword arguments are captured to be sent to the optimizers.

### Fields

* `f`: the function in the problem.
* `u0`: the initial guess for the optimization variables.
* `p`: Either the constant parameters or fixed data (full batch) used in the objective or a [MLUtils](https://github.com/JuliaML/MLUtils.jl) [`DataLoader`](https://juliaml.github.io/MLUtils.jl/stable/api/#MLUtils.DataLoader) for minibatching with stochastic optimization solvers. Defaults to `NullParameters`.
* `lb`: the lower bounds for the optimization variables `u`.
* `ub`: the upper bounds for the optimization variables `u`.
* `int`: integrality indicator for `u`. If `int[i] == true`, then `u[i]` is an integer variable.
    Defaults to `nothing`, implying no integrality constraints.
* `lcons`: the vector of lower bounds for the constraints passed to [OptimizationFunction](https://docs.sciml.ai/Optimization/stable/API/optimization_function/#optfunction).
    Defaults to `nothing`, implying no lower bounds for the constraints (i.e. the constraint bound is `-Inf`)
* `ucons`: the vector of upper bounds for the constraints passed to [`OptimizationFunction`](https://docs.sciml.ai/Optimization/stable/API/optimization_function/#optfunction).
    Defaults to `nothing`, implying no upper bounds for the constraints (i.e. the constraint bound is `Inf`)
* `sense`: the objective sense, can take `MaxSense` or `MinSense` from Optimization.jl.
* `kwargs`: the keyword arguments passed on to the solvers.

## Inequality and Equality Constraints

Both inequality and equality constraints are defined by the `f.cons` function in the [`OptimizationFunction`](https://docs.sciml.ai/Optimization/stable/API/optimization_function/#optfunction)
description of the problem structure. This `f.cons` is given as a function `f.cons(u,p)` which computes
the value of the constraints at `u`. For example, take `f.cons(u,p) = u[1] - u[2]`.
With these definitions, `lcons` and `ucons` define the bounds on the constraint that the solvers try to satisfy.
If `lcons` and `ucons` are `nothing`, then there are no constraints bounds, meaning that the constraint is satisfied when `-Inf < f.cons < Inf` (which of course is always!). If `lcons[i] = ucons[i] = 0`, then the constraint is satisfied when `f.cons(u,p)[i] = 0`, and so this implies the equality constraint `u[1] = u[2]`. If `lcons[i] = ucons[i] = a`, then ``u[1] - u[2] = a`` is the equality constraint.

Inequality constraints are then given by making `lcons[i] != ucons[i]`. For example, `lcons[i] = -Inf` and `ucons[i] = 0` would imply the inequality constraint ``u[1] <= u[2]`` since any `f.cons[i] <= 0` satisfies the constraint. Similarly, `lcons[i] = -1` and `ucons[i] = 1` would imply that `-1 <= f.cons[i] <= 1` is required or ``-1 <= u[1] - u[2] <= 1``.

Note that these vectors must be sized to match the number of constraints, with one set of conditions for each constraint.

## Data handling

As described above the second argument of the objective definition can take a full batch or a [`DataLoader`](https://juliaml.github.io/MLUtils.jl/stable/api/#MLUtils.DataLoader) object for mini-batching which is useful for stochastic optimization solvers. Thus the data either as an Array or a `DataLoader` object should be passed as the third argument of the `OptimizationProblem` constructor.
For an example of how to use this data handling, see the `Sophia` example in the [Optimization.jl documentation](https://docs.sciml.ai/Optimization/dev/optimization_packages/optimization) or the [mini-batching tutorial](https://docs.sciml.ai/Optimization/dev/tutorials/minibatch/).
"""
struct OptimizationProblem{iip, F, uType, P, LB, UB, I, LC, UC, S, K} <:
       AbstractOptimizationProblem{iip}
    f::F
    u0::uType
    p::P
    lb::LB
    ub::UB
    int::I
    lcons::LC
    ucons::UC
    sense::S
    kwargs::K
    @add_kwonly function OptimizationProblem{iip}(
            f::Union{OptimizationFunction{iip}, MultiObjectiveOptimizationFunction{iip}},
            u0,
            p = NullParameters();
            lb = nothing, ub = nothing, int = nothing,
            lcons = nothing, ucons = nothing,
            sense = nothing, kwargs...) where {iip}
        if xor(lb === nothing, ub === nothing)
            error("If any of `lb` or `ub` is provided, both must be provided.")
        end
        warn_paramtype(p)
        new{iip, typeof(f), typeof(u0), typeof(p),
            typeof(lb), typeof(ub), typeof(int), typeof(lcons), typeof(ucons),
            typeof(sense), typeof(kwargs)}(f, u0, p, lb, ub, int, lcons, ucons, sense,
            kwargs)
    end
end

function OptimizationProblem(
        f::Union{OptimizationFunction, MultiObjectiveOptimizationFunction},
        args...; kwargs...)
    OptimizationProblem{isinplace(f)}(f, args...; kwargs...)
end
function OptimizationProblem(f, args...; kwargs...)
    OptimizationProblem(OptimizationFunction(f), args...; kwargs...)
end

function OptimizationFunction(
        f::NonlinearFunction, adtype::AbstractADType = NoAD(); kwargs...)
    if isinplace(f)
        throw(ArgumentError("Converting NonlinearFunction to OptimizationFunction is not supported with in-place functions yet."))
    end
    OptimizationFunction((u, p) -> sum(abs2, f(u, p)), adtype; kwargs...)
end

function OptimizationProblem(
        prob::NonlinearLeastSquaresProblem, adtype::AbstractADType = NoAD(); kwargs...)
    if isinplace(prob)
        throw(ArgumentError("Converting NonlinearLeastSquaresProblem to OptimizationProblem is not supported with in-place functions yet."))
    end
    optf = OptimizationFunction(prob.f, adtype; kwargs...)
    return OptimizationProblem(optf, prob.u0, prob.p; prob.kwargs..., kwargs...)
end

isinplace(f::OptimizationFunction{iip}) where {iip} = iip
isinplace(f::OptimizationProblem{iip}) where {iip} = iip

@doc doc"""

Holds information on what variables to alias
when solving an OptimizationProblem. Conforms to the AbstractAliasSpecifier interface.
    `OptimizationAliasSpecifier(;alias_p = nothing, alias_f = nothing, alias_u0 = false, alias = nothing)`

### Keywords
* `alias_p::Union{Bool, Nothing}`
* `alias_f::Union{Bool, Nothing}`
* `alias_u0::Union{Bool, Nothing}`: alias the u0 array. Defaults to false .
* `alias::Union{Bool, Nothing}`: sets all fields of the `OptimizationAliasSpecifier` to `alias`

"""
struct OptimizationAliasSpecifier <: AbstractAliasSpecifier
    alias_p::Union{Bool, Nothing}
    alias_f::Union{Bool, Nothing}
    alias_u0::Union{Bool, Nothing}

    function OptimizationAliasSpecifier(;
            alias_p = nothing, alias_f = nothing, alias_u0 = nothing, alias = nothing)
        if alias == true
            new(true, true, true)
        elseif alias == false
            new(false, false, false)
        elseif isnothing(alias)
            new(alias_p, alias_f, alias_u0)
        end
    end
end