CLASSICAL THEORY OF MAXIMA AND MINIMA
To this point independent variables could take on any value. In actuality, the values of the independent variables are limited since they usually represent physical quantities such a flow rates, temperatures, pressures, process unit capacities and available resources. Consequently, there are constraints on variables, if nothing more than the fact that they must be non-negative. In many cases they are bounded within limits as dictated by the process equipment and related by equations such as material balances. The constraints on the variables can be of the form of equations and inequalities.
Methods to locate the stationary points of functions (economic models) subject to equality constraints (e.g., material and energy balance equations) will be developed, and examples illustrating the techniques will be given. Inequality constraints can be converted to equality constraints, and then these procedures for equality constraints can be applied with some additional considerations.
Let us illustrate the conversion of an inequality constraint to an equality constraint using a simple example to help visualize the concept of slack variables. Figure 2-3 is an example of an equality and an inequality constraint for a distillation column. The material balance which says that the feed rate to the column must equal the sum of the overhead and bottom products at steady state is the quality constraint. The upperlimit on the capacity of the distillation column, which was set when the equiptment was designed, is the inequality constraint. This inequality constraint can be converted to an equality constraint by adding a slack variable S as S2 to ensure a positive number has been added to the equation.
The term slack is used to represent the difference between the optimal and upper limit on the capacity. It represents the unused, excess, or slack in capacity of the process unit. For example, if Fopt= 30,000 barrels per day; then S2 = 20,000 barrels per day, a slack of 20,000 barrels per day; and the constraint is said to be loose, i.e., the inequality holds. If Fopt = 50,000 barrels per day then there is no slack, and the constraint is said to be tight, i.e., the equality holds. This will be discussed in more detail later in the chapter. Also, if there was a lower limit on F, e.g., F > 10,000, the same procedure would apply except S2 would be subtracted from F. The equation would be F - S2 = 10,000, and S is called a surplus variable.
We can now state a general optimization problem with n-independent variables and m equality constraints where the objective is to optimize (maximize or minimize) the economic model y(x) subject to m constraint equations fi (x).
There must be fewer equality constraints than independent variables to be able to optimize y(x), i.e., n > m. If m = n the values of the xj's are uniquely determined, and there is no optimization problem. Also if m > n, the problem is said to be overdetermined, because there are more equations than unknowns. There is no optimization problem for this case either.
There are three methods of locating the optimum points of the function y(x1,x2,...,xn) of n independent variables subject to m constraint equations fi(x1,x2,...,xn) = 0. These are: direct substitution, solution be constrained variation and method of Lagrange multipliers. We will find that direct substitution can not always be used, and the method of Lagrange multipliers will be the one most frequently employed.
This simply means to solve the constraint equations for the independent variables and to substitute the constraint equations directly into the function to be optimized. This will give an equation (economic model) with (n-m) unknowns, and the previous techniques for unconstrained optimization are applicable.
Unfortunately, it is not always possible to perform the algebraic manipulations required for these substitutions when the constraint equations are somewhat complicated. Consequently, it is necessary to resort to the following methods.
This method (3,14) is used infrequently, but furnishes a theoretical basis for important multivariable numerical search methods, such as the generalized reduced gradient. It is best illustrated for the case of two independent variables by considering the example shown in Figure 2-4. There is a local minimum of the constrained system at point a and a local maximum at point B. The maximum of the unconstrained system is at C.
At point A the curve y(x1,x2) = 1 and the curve f(x1,x2) = 0 are tangent and have the same slope. This means that differential changes, dx1 and dx2, produce the same change in the dependent variables y(x1,x2) and f(x1,x2). This can be expressed as:
We will need the total derivatives of y and f to combine with equation (2-16) to obtain the final result. Using the first terms in a Taylor series expansion for y and f gives:
At the minimum, point A, and the maximum, point B, dy is equal to zero; and the constraint is satisfied, i.e., f = 0 and df = 0.
Combining equations (2-16) and (2-17) gives the following result.
This is an algebraic equation, and it is to be solved in combination with the constraint equation to locate the stationary points. It should be remembered in this case that dy/dx1 and dy/dx2 are not necessarily zero. They are zero in the unconstrained case at point C, however.
This technique is illustrated with the following example. Then the extension to the general case for n independent variables will be given.
Find the stationary points of the following function using the method of constrained variation
The first partial derivatives are:
Substituting into equation (2-18) gives
for the values of the independent variables at the stationary points.
In general we are interested in finding the stationary points of a function y(x1, x2,..., xn) subject to m constraint equations fi(x1, x2, ..., xn) = 0 where i = 1, ...m, and n > m. The same reasoning applied in (n +1) dimensional space as applied to the three-dimensional space above, and this results in the following equations:
The Jacobian determinant for the first equation above is:
A total of n equations are solved for the stationary points, i.e., the (n-m) equation generated by (2-20) above and the m constraint equations. A derivation of these results is given by Beveridge and Schechter(6). This involves using Cramer's rule and eliminating the dxi's. Also similar results are given for this general case in the text by Wilde and Beightler (4). However, a different nomenclature is used, and the results are extended to include Lagrange Multipliers.
To illustrate the use of the Jacobian determinants, consider the following example, which obtains equation (2-18).
For this problem there are 2 independent variables (n=2) and one constraint (m=1), so the evaluation of one Jacobian determinant is required.
Expanding gives the following equation, gives:
This is the same as equation (2-18) which was solved with the constraint equation for the stationary point values of x1 and x2 in Example 2-3.
The most frequently used method for constraints is to employ Lagrange multipliers. The technique will be presented using two independent variables and one constraint equation to illustrate the concepts. Then the procedure will be extended to the general case of n independent variables and m constraint equation. For the case of two independent variables, we have:
We want to show how the Lagrange Multiplier arises and that the constrained problem can be converted into an unconstrained problem. The profit function and the constraint equation are expanded in a Taylor series. Then, using the first order terms gives:
This form of the constraint equation will be used to eliminate dx2 in the profit function. Solving for dx2 gives:
This equation is substituted into the equation for dy to obtain:
and rearranging gives:
Now we can define l as the value of [-dy / dx2 / df / dx2] at the stationary point of the constrained function. This ratio of partial derivatives l is a constant at the stationary point, and the above equation can be written as
At the stationary point dy = 0, and this leads to:
Now if L is defined as L = y + lf, the above gives:
This is one of the necessary conditions to locate the stationary points of an unconstrained function L which is constructed from the profit function y(x1,x2) and the constraint equation f(x1,x2) = 0. Now the same manipulations can be repeated to obtain the other necessary conditions:
Therefore, the constrained problem can be converted to an unconstrained problem by forming the Lagrangian, or augmented, function and solving this problem by the previously developed methods of setting the first partial derivatives equal to zero. This will give two equations to solve for the three unknowns x1, x2 and l at the stationary point. The third equation to be used is the constraint equation. In fact the Lagrange multiplier is sometimes treated as another variable since dL / dl = 0 gives the constraint equation. The example used for the method of constrained variation will be used to illustrate these ideas.
Find the stationary points for the following constrained problem using the method of Lagrange multipliers.
Solving the previous equations simultaneously gives the following stationary points:
The type of stationary points, i.e., maxima, minima or saddle points were determined by inspection for this problem. Sufficient conditions for constrained problems will be discussed subsequently in this chapter.
The development of the Lagrangian function of the case of n independent variables and m constraint equations is a direct extension from that of two independent variables and one constraint equation, and Avriel (10) gives a concise derivation of this result. (See problem 2-14) The Lagrangian, or augmented, function is formed as previously, and for every constraint equation there is a Lagrange multiplier. This is shown below:
The Lagrangian, or augmented, function is formed from the constrained problem as follows:
To locate the stationary points of the constrained problem, the first partial derivatives of the Lagrangian function with respect to the xj's and l i 's are set equal to zero (necessary conditions). There are (n + m) equations to be solved for the (n + m) unknowns: n - xj's and m - l i's.
It is sometimes said that the method of Lagrange multipliers requires more work than the method of constrained variation, since an additional m equations have to be solved for the values of the Lagrange multipliers. However, additonal and valuable information is obtained from knowing the values of the Lagrange multipliers, as will be seen. The following simple example gives a comparison among the three techniques.
For the process in Example 2-1 (2), it is necessary to maintain the product of the pressure and recycle ratio equal to 9000 psi. Determine the optimal values of the pressure and recycle ration and minimum cost within this constraint by direct substitution, constrained variation and Lagrange multipliers.
Again, the problem is to minimize C.
However, C is subject to the following constraint equation.
Direct Substitution: Solving the constraint above for P and substituting into the objective function gives:
Setting dC/dR = 0 and solving gives:
The corresponding cost is:
which is greater than the unconstrained system, as would be expected.
The first equation simplifies to:
which, when solved simultaneously with the second equation gives the same results as direct substitution.
Lagrange Multipliers: The Lagrangian, or augmented, function is:
Setting partial derivatives of L with respect to P, R, and l equal to zero gives:
Solving the above simultaneously gives the same results as the two previous methods and a value for the Lagrange Multiplier.
A further application of the method of Lagrange Multipliers is developing the method of steepest ascent (descent) for a function to be optimized. This result will be valuable when search methods are discussed.
To illustrate the direction of steepest ascent, a geometric representation is shown in Figure 2-5. To obtain the direction of steepest ascent, we wish to obtain the maximum value of dy, and y(x1,x2,...xn) is a function of n variables. Also, there is a constraint equation relating dx1, dx2, ... dxn and ds as shown in Figure 2-5 for two independent variables.
The problem is:
To obtain the maximum value of dy, the Lagrangian function is formed as follows:
Differentiating L with respect to the independent variables dxj and equating to zero gives:
and solving for dxj gives:
If a constant k is used to represent the term in the brackets in equation (2-29), this equation can be written as:
If a finite difference approximation is used for dxj = (xj - xjo) and ¶y/¶xj is evaluated at xo, then the following equation gives the gradient line.
This equation can be written vector notation in terms of the gradient of y evaluated at xo,y (xo), as:
Find the minimum along the direction of steepest descent of the function given below starting at the point xo = (1,1).
Gradient line (steepest descent):
or for two independent variables
Evaluating the partial derivatives at the starting point (1,1)
The gradient line is
Substituting the gradient line into the function to be minimized gives:
Computing dy/dk will locate the minimum along the gradient line, i.e.,
The corresponding values of x1 and x2 are
It turns out that the minimum along the gradient line is also the minimum for the function in this problem, because it is the sum of squares.
The method of steepest ascent is the basis for several search techniques which are described in Chapter 6, e.g., Steep Ascent Partan. It should be noted that when dealing with physical systems, the direction of steepest ascent (descent) may be only a direction of steep ascent (descent) depending on the scales used to represent the independent variables. This is discussed and illustrated by Wilde (5), and Wilde and Beightler (4).
The values of the Lagrange Multipliers at the optimum provide additional and important information. If the constraint equations are written with parameters bi on the right-hand side, the Lagrange multipliers give the change in the profit function with respect to these parameters, i.e., ¶y/¶bi. Many times, the right-hand sides of the constraint equations represent the availability of raw materials, demand for products, or capacities of process units. Consequently, it is frequently important to know how the optimal solution is affected by changes in availability, demand, and capacities. As we shall see, the Lagrange Multipliers are given the names "shadow prices" and "dual activity" in linear programming where these changes are analyzed by sensitivity analysis.
The following brief derivation obtains the result that ¶y/¶b = - l for the case of one constraint and two independent variables, and the extension to m constraint equations with n independent variables is comparable.
First, we can obtain the following equation from the profit function by the chain rule.
Also, we can obtain the next equation from the constraint equation written as f - b = 0 by the chain rule.
Then the equation from the constraint is multiplied by the Lagrange Multiplier and added to the equation from the profit function to give:
The values of ¶L/¶x1 and ¶L/¶x2 are zero at the stationary point (necessary conditions), and consequently ¶y/¶b = -l. Thus, the change in the profit function y with respect to the right-hand side of the constraint b is equal to the negative of the Lagrange Multiplier. Also, comparable results can be obtained for the case on n independent variables and m constraint equations to obtain the following result using a similar procedure and arguments(7).
In the next section, we will see that the Lagrange Multiplier is also a key factor in the analysis of problems with inequality constraints.
An additional complication arises when seeking the optimum value of a profit or cost function if inequality constraints are included. Although the same procedures are used, it will be necessary to consider two cases for each inequality constraint equation. One case is when the Lagrange Multiplier is zero, and the other is when the Lagrange Multiplier is not zero. This is best illustrated by the following example with one inequality constraint equation as shown below.
As described previously, the procedure is to add a slack variable xs as xs2 and form the Lagrangian function:
The result is two cases, i.e., either l = 0 and xs ¹ 0, or l ¹ 0 and xs = 0. If l = 0 and xs ¹ 0 the inequality holds, and the constraint is said to be "loose", "passive" or "inactive". If l ¹ 0 and xs = 0 then the equality holds, and the constraint is said to be "tight" or "active". The following example illustrates this situation using a modification of the previous simple process.
For the process the cost function is:
However, C is subject to the inequality constraint equation.
Adding the slack variable S, as S2, and forming the Lagrangian function gives:
Setting the first partial derivatives of L with respect to P, R, S, and l equal to zero gives the following four equations:
The two cases are l ¹ 0, S = 0 and l = 0, S ¹ 0. For the case of l ¹ 0, S = 0 the equality PR = 9000 holds, i.e., the constraint is active. This was the solution obtained in Example 2-6, and the results were:
For the case of l = 0, S ¹ 0, the constraint is an inequality, i.e., inactive. This was the solution obtained in Example 2-2 and the results were:
The example above had only one inequality constraint and two cases to consider. However, with several inequality constraints locating the stationary points can become time-consuming, for the possibilities must be searched exhaustively. A procedure for this evaluation has been given by Cooper (7) and Walsh (8) as follows:
The above procedure applies, assuming that the stationary point located is a maximum or a minimum of the constrained problem. However, there is a possibility that several stationary points will be located; some could be maxima, some minima, and others saddle points. In Problem 2.6, four stationary points were found; two are maxima, one a minimum, and one a saddle point. Also, from equation (2-40) for each inequality constraint where the strict inequality holds, the slack variable is positive, and the Lagrange Multiplier is zero. For each inequality constraint where the equality holds, the slack variable is zero, and the Lagrange Multiplier is not zero.
In the next section necessary and sufficient conditions for constrained problems are described to determine the character of stationary points. This will be similar to and an extension of the previous discussion for unconstrained problems.