Lecture 12
October 7, 2024
Linear programming refers to optimization of linear models.
As we will see over the next few weeks, linear models come up frequently, both in their own right and as approximations of nonlinear models.
“Program” originally referred to military logistics planning, which is the origin of this field of research. For this historical reason, “mathematical programming” is often used instead of “mathematical optimization.”
We typically reserve “program” for optimization problems that are formulated completely mathematically, as opposed to problems where a computer model is used to simulate the relationship between decision variables and outputs.
Recall that a function \(f(x_1, \ldots, x_n)\) is linear if \[f(x_1, \ldots, x_n) = a_1x_1 + a_2x_2 + \ldots + a_n x_n.\]
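For example, the objective \(f(x_1, x_2) = 230x_1 + 120x_2\) that appears later in this lecture is linear, while \(f(x_1, x_2) = x_1 x_2\) and \(f(x_1) = x_1^2\) are not.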
The key is that linear models are very simple geometrically: each constraint defines a half-space, so the feasible region is a polytope, and the contours of the objective function are parallel hyperplanes.
A linear program (LP), or a linear optimization model, has the following characteristics:
- the objective function is linear in the decision variables;
- all constraints are linear equalities or inequalities;
- the decision variables are continuous.
Optimal solutions must exist on the boundary of the feasible region (which must be a polytope).
More specifically, an optimal solution can always be found at a corner point (vertex) of that polytope.
This is the basis of all simplex methods for solving LPs.
These methods go back to George Dantzig in the 1940s and are still widely used today.
Can a solution be in the interior?
What about along an edge but not a corner?
\[\begin{aligned} \max_{x_1, x_2} \quad & 230x_1 + 120x_2 \\ \text{subject to:} \quad & 0.9x_1 + 0.5x_2 \leq 600 \\ & x_1 + x_2 \leq 1000 \\ & x_1, x_2 \geq 0 \end{aligned}\]
using Plots, Plots.PlotMeasures, LaTeXStrings

# constraint boundaries, solved for x2 as a function of x1
f1(x) = (600 .- 0.9 .* x) ./ 0.5  # 0.9x1 + 0.5x2 = 600
f2(x) = 1000 .- x                 # x1 + x2 = 1000

# shade the feasible region, bounded above by the tighter of the two constraints
p = plot(0:667, min.(f1(0:667), f2(0:667)), fillrange=0, color=:lightblue, grid=true, label="Feasible Region", xlabel=L"x_1", ylabel=L"x_2", xlims=(-50, 1200), ylims=(-50, 1400), framestyle=:origin, minorticks=4, right_margin=4mm, left_margin=4mm)
# draw and label the two constraint boundary lines
plot!(0:667, f1.(0:667), color=:brown, linewidth=3, label=false)
plot!(0:1000, f2.(0:1000), color=:red, linewidth=3, label=false)
annotate!(400, 1100, text(L"0.9x_1 + 0.5x_2 = 600", color=:brown, pointsize=18))
annotate!(1000, 300, text(L"x_1 + x_2 = 1000", color=:red, pointsize=18))
plot!(size=(600, 500))
Manually checking the corner points is all well and good for this simple example, but does it scale well?
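As a sketch of that manual check in Julia for the example above (the corner points follow from the constraint intersections, e.g. \(0.9x_1 + 0.5x_2 = 600\) and \(x_1 + x_2 = 1000\) cross at \((250, 750)\)):

# evaluate the objective at each corner of the feasible region
obj(x1, x2) = 230 * x1 + 120 * x2
corners = [(0.0, 0.0), (0.0, 1000.0), (2000 / 3, 0.0), (250.0, 750.0)]
vals = [obj(c...) for c in corners]
corners[argmax(vals)], maximum(vals)  # (666.67, 0.0), value ≈ 153,333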
LP solvers (often based on the simplex method) automate this process.
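As one illustrative sketch (the JuMP syntax is standard, but the solver choice and the names model, c1, and c2 are mine, not from the lecture), the example LP can be set up and solved like this:

using JuMP, HiGHS

model = Model(HiGHS.Optimizer)
@variable(model, x1 >= 0)
@variable(model, x2 >= 0)
@objective(model, Max, 230x1 + 120x2)
# the two constraints from the formulation above
@constraint(model, c1, 0.9x1 + 0.5x2 <= 600)
@constraint(model, c2, x1 + x2 <= 1000)
optimize!(model)
value(x1), value(x2), objective_value(model)  # ≈ (666.67, 0.0, 153333.3)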
Linear models come up frequently because we can linearize nonlinear functions.
When we linearize components of a mathematical program, the result is called the linear relaxation of the original problem.
# nonlinear cost as a function of efficiency
E = 0:0.01:1
plot(E, E.^2, legend=false, grid=false, xlabel="Efficiency", ylabel="Cost", color=:black, yticks=false, xlims=(0, 1), ylims=(0, 1), left_margin=8mm, linewidth=3)
xticks!([0.65, 0.95])
xlims!((0, 1.05))
# points where the linear approximation agrees with the curve
scatter!([0.65, 0.95], [0.65, 0.95].^2, markersize=10, color=:blue)
# secant line through those two points: slope 1.6, intercept -0.6175
plot!(E, 1.6 .* E .- 0.6175, color=:blue, linestyle=:dash, linewidth=3)
plot!(size=(600, 500))
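To spell out the dashed line in the plot: the nonlinear cost \(C(E) = E^2\) is replaced by the secant through \(E = 0.65\) and \(E = 0.95\),
\[C(E) \approx \frac{0.95^2 - 0.65^2}{0.95 - 0.65}(E - 0.65) + 0.65^2 = 1.6E - 0.6175.\]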
A solution will be found at one of the corner points of the feasible polytope.
This means that at this solution, one or more constraints are binding: if we relaxed a binding constraint by weakening it, we could improve the solution.
The marginal cost of a constraint is the amount by which the optimal objective would improve if that constraint's capacity were relaxed by one unit.
This is also referred to as the shadow price (shadow prices are also called the dual variables of the constraints).
A non-zero shadow price tells us that a constraint is binding, and comparing shadow prices ranks which constraints are most influential.
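Continuing the JuMP sketch from earlier (using the same hypothetical constraint names c1 and c2), shadow prices can be queried after solving:

# change in the optimal objective per unit relaxation of each constraint
shadow_price(c1)  # ≈ 255.6 (= 230/0.9): this constraint is binding
shadow_price(c2)  # 0.0: this constraint is not binding at the optimum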
Lagrange multipliers are a way to incorporate equality constraints into an “unconstrained” form of an optimization problem.
\[\begin{align*} \max_{x,y}\qquad &f(x,y) \\ \text{subject to:}\qquad &g(x,y) = 0 \end{align*}\]
Then: \[\mathcal{L}(x, y, \lambda) = f(x,y) - \lambda g(x,y)\]
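As a small worked example (mine, not from the lecture): maximize \(f(x, y) = xy\) subject to \(g(x, y) = x + y - 10 = 0\). Then
\[\mathcal{L}(x, y, \lambda) = xy - \lambda(x + y - 10),\]
and setting the partial derivatives to zero,
\[\frac{\partial \mathcal{L}}{\partial x} = y - \lambda = 0, \qquad \frac{\partial \mathcal{L}}{\partial y} = x - \lambda = 0, \qquad \frac{\partial \mathcal{L}}{\partial \lambda} = -(x + y - 10) = 0,\]
gives \(x = y = \lambda = 5\). The constrained maximum is \(f(5, 5) = 25\), and \(\lambda = 5\) is the rate at which that maximum grows if the constraint bound 10 is relaxed.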
Original Problem
\[\begin{aligned} \min_{x_1, x_2} \quad & f(x_1, x_2) \\ \text{subject to:} \quad & x_1 \geq A \\ & x_2 \leq B \end{aligned}\]
With Dummy Variables
\[\begin{aligned} \min_{x_1, x_2} \quad & f(x_1, x_2) \\ \text{subject to:} \quad & x_1 - S_1^2 = A \\ & x_2 + S_2^2 = B \end{aligned}\]
Then the Lagrangian function becomes:
\[ \mathcal{L}(\mathbf{x}, S_1, S_2, \lambda_1, \lambda_2) = f(\mathbf{x}) - \lambda_1(x_1 - S_1^2 - A) - \lambda_2(x_2 + S_2^2 - B) \]
where \(\lambda_1\), \(\lambda_2\) are penalties for violating the constraints.
The \(\lambda_i\) are the eponymous Lagrange multipliers.
Next step: locate possible optima where the partial derivatives of the Lagrangian are zero.
\[\frac{\partial \mathcal{L}}{\partial x_i} = 0, \qquad \frac{\partial \mathcal{L}}{\partial S_i} = 0, \qquad \frac{\partial \mathcal{L}}{\partial \lambda_i} = 0\]
This is actually many equations, even though our original problem was low-dimensional, and can be slow to solve.
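Written out for the dummy-variable Lagrangian above, these stationarity conditions are:
\[\frac{\partial \mathcal{L}}{\partial x_1} = \frac{\partial f}{\partial x_1} - \lambda_1 = 0, \qquad \frac{\partial \mathcal{L}}{\partial x_2} = \frac{\partial f}{\partial x_2} - \lambda_2 = 0,\]
\[\frac{\partial \mathcal{L}}{\partial S_1} = 2\lambda_1 S_1 = 0, \qquad \frac{\partial \mathcal{L}}{\partial S_2} = -2\lambda_2 S_2 = 0,\]
\[\frac{\partial \mathcal{L}}{\partial \lambda_1} = -(x_1 - S_1^2 - A) = 0, \qquad \frac{\partial \mathcal{L}}{\partial \lambda_2} = -(x_2 + S_2^2 - B) = 0.\]
The conditions \(\lambda_i S_i = 0\) say that either a constraint is binding (\(S_i = 0\)) or its multiplier is zero; that is already six equations for a two-variable problem.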
The shadow prices are the Lagrange multipliers of the optimization problem.
If our inequality constraints \(X \geq A\) and \(X \leq B\) are written as \(X - S_1^2 = A\) and \(X + S_2^2 = B\):
\[\mathcal{L}(X, S_1, S_2, \lambda_1, \lambda_2) = Z(X) - \lambda_1(X - S_1^2 - A) - \lambda_2(X + S_2^2 - B),\]
\[\Rightarrow \qquad \frac{\partial \mathcal{L}}{\partial A} = \lambda_1, \qquad \frac{\partial \mathcal{L}}{\partial B} = \lambda_2.\]
So at the optimum, \(\lambda_1\) and \(\lambda_2\) measure how the optimal value changes as the bounds \(A\) and \(B\) are relaxed: exactly the shadow prices.
Wednesday: Prelim 1!
Next Monday: Fall Break, no class
Next Wednesday: Optimization Lab (come prepared, link will be posted on Ed)