Chapter 2

CLASSICAL THEORY OF MAXIMA AND MINIMA

Analytical Methods Applicable for Constraints

To this point the independent variables could take on any value. In actuality, the values of the independent variables are limited, since they usually represent physical quantities such as flow rates, temperatures, pressures, process unit capacities, and available resources. Consequently, there are constraints on the variables, if nothing more than the fact that they must be non-negative. In many cases they are bounded within limits dictated by the process equipment and related by equations such as material balances. The constraints on the variables can take the form of equations and inequalities.

Methods to locate the stationary points of functions (economic models) subject to equality constraints (e.g., material and energy balance equations) will be developed, and examples illustrating the techniques will be given. Inequality constraints can be converted to equality constraints, and then these procedures for equality constraints can be applied with some additional considerations.

Let us illustrate the conversion of an inequality constraint to an equality constraint using a simple example to help visualize the concept of slack variables. Figure 2-3 is an example of an equality and an inequality constraint for a distillation column. The material balance, which says that the feed rate to the column must equal the sum of the overhead and bottom products at steady state, is the equality constraint. The upper limit on the capacity of the distillation column, which was set when the equipment was designed, is the inequality constraint. This inequality constraint can be converted to an equality constraint by adding a slack variable S, entered as S² to ensure that a positive number is added to the equation:

F + S² = 50,000                                                                 (2-13)

The term slack is used to represent the difference between the optimum and the upper limit on the capacity. It represents the unused, excess, or slack capacity of the process unit. For example, if Fopt = 30,000 barrels per day, then S² = 20,000 barrels per day, a slack of 20,000 barrels per day, and the constraint is said to be loose, i.e., the inequality holds. If Fopt = 50,000 barrels per day, then there is no slack, and the constraint is said to be tight, i.e., the equality holds. This will be discussed in more detail later in the chapter. Also, if there were a lower limit on F, e.g., F ≥ 10,000, the same procedure would apply except that S² would be subtracted from F. The equation would be F - S² = 10,000, and S is called a surplus variable.
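The slack arithmetic above can be sketched in a few lines; the function name and default capacity are illustrative only, taken from equation (2-13):

```python
def slack(F, capacity=50_000):
    """Slack variable S for the constraint F + S**2 = capacity (from F <= capacity)."""
    S_squared = capacity - F
    assert S_squared >= 0, "constraint violated: F exceeds capacity"
    return S_squared ** 0.5

# F_opt = 30,000 bbl/day: S**2 = 20,000, so the constraint is loose.
print(round(slack(30_000) ** 2))   # 20000
# F_opt = 50,000 bbl/day: S**2 = 0, so the constraint is tight.
print(round(slack(50_000) ** 2))   # 0
```

For a lower limit such as F ≥ 10,000, the same function applies with the sign reversed, giving a surplus variable instead.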

We can now state a general optimization problem with n-independent variables and m equality constraints where the objective is to optimize (maximize or minimize) the economic model y(x) subject to m constraint equations fi (x).

optimize:      y(x1,x2,...,xn)                                                    (2-14)

subject to:   fi(x1,x2,...,xn) = 0                                             (2-15)

for i = 1,2,...,m

There must be fewer equality constraints than independent variables to be able to optimize y(x), i.e., n > m. If m = n the values of the xj's are uniquely determined, and there is no optimization problem. Also if m > n, the problem is said to be overdetermined, because there are more equations than unknowns. There is no optimization problem for this case either.

There are three methods of locating the optimum points of the function y(x1,x2,...,xn) of n independent variables subject to m constraint equations fi(x1,x2,...,xn) = 0. These are: direct substitution, solution by constrained variation, and the method of Lagrange multipliers. We will find that direct substitution cannot always be used, and that the method of Lagrange multipliers will be the one most frequently employed.

Direct Substitution

This simply means solving the constraint equations for m of the variables and substituting the results directly into the function to be optimized. This gives an equation (economic model) with (n-m) unknowns, and the previous techniques for unconstrained optimization are applicable.

Unfortunately, it is not always possible to perform the algebraic manipulations required for these substitutions when the constraint equations are somewhat complicated. Consequently, it is necessary to resort to the following methods.

Constrained Variation

This method (3,14) is used infrequently, but it furnishes a theoretical basis for important multivariable numerical search methods, such as the generalized reduced gradient. It is best illustrated for the case of two independent variables by considering the example shown in Figure 2-4. There is a local minimum of the constrained system at point A and a local maximum at point B. The maximum of the unconstrained system is at point C.

At point A the curve y(x1,x2) = constant and the curve f(x1,x2) = 0 are tangent and have the same slope. This means that differential changes dx1 and dx2 produce the same change in the dependent variables y(x1,x2) and f(x1,x2). This can be expressed as:

(dx2/dx1)y = (dx2/dx1)f                                                         (2-16)

We will need the total derivatives of y and f to combine with equation (2-16) to obtain the final result. Using the first terms in a Taylor series expansion for y and f gives:

dy = (∂y/∂x1)dx1 + (∂y/∂x2)dx2

df = (∂f/∂x1)dx1 + (∂f/∂x2)dx2                                                  (2-17)

At the minimum, point A, and the maximum, point B, dy is equal to zero; and the constraint is satisfied, i.e., f = 0 and df = 0.

Combining equations (2-16) and (2-17) gives the following result:

(∂y/∂x1)(∂f/∂x2) - (∂y/∂x2)(∂f/∂x1) = 0                                         (2-18)

This is an algebraic equation, and it is to be solved in combination with the constraint equation to locate the stationary points. It should be remembered that in this case ∂y/∂x1 and ∂y/∂x2 are not necessarily zero. They are zero in the unconstrained case at point C, however.

This technique is illustrated with the following example. Then the extension to the general case for n independent variables will be given.

Example 2-3

Find the stationary points of the following function using the method of constrained variation

optimize: y(x) = x1 x2

subject to: f(x) = x1² + x2² - 1 = 0

The first partial derivatives are:

∂y/∂x1 = x2        ∂y/∂x2 = x1

∂f/∂x1 = 2x1       ∂f/∂x2 = 2x2

Substituting into equation (2-18) gives

x2 · 2x2 - x1 · 2x1 = 0    or    x2² - x1² = 0

which is solved simultaneously with the constraint equation

x1² + x2² - 1 = 0

The result is:

x1 = ±(½)½        and        x2 = ±(½)½

for the values of the independent variables at the stationary points.
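These values can be verified by substituting all four sign combinations back into the two equations; the following is a plain numeric check, not part of the original solution:

```python
import math

# The four points x1 = +/-(1/2)**0.5, x2 = +/-(1/2)**0.5 must satisfy both the
# constrained-variation equation x2**2 - x1**2 = 0 and the constraint
# x1**2 + x2**2 - 1 = 0 of Example 2-3.
a = math.sqrt(0.5)
for x1 in (a, -a):
    for x2 in (a, -a):
        assert abs(x2**2 - x1**2) < 1e-12
        assert abs(x1**2 + x2**2 - 1) < 1e-12
print("all four stationary points satisfy both equations")
```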

In general, we are interested in finding the stationary points of a function y(x1, x2, ..., xn) subject to m constraint equations fi(x1, x2, ..., xn) = 0, where i = 1, ..., m and n > m. The same reasoning applies in (n+1)-dimensional space as was applied in the three-dimensional space above, and this results in the following equations:

dy = (∂y/∂x1)dx1 + (∂y/∂x2)dx2 + ··· + (∂y/∂xn)dxn = 0

dfi = (∂fi/∂x1)dx1 + (∂fi/∂x2)dx2 + ··· + (∂fi/∂xn)dxn = 0        for i = 1, 2, ..., m      (2-19)

The set of equations given in equation (2-19) above can be solved for (n-m) equations to go with the m constraint equations to locate the stationary points. The (n-m) equations corresponding to equation (2-18) of the two-independent-variable case can be written in terms of (n-m) Jacobian determinants, which are:

∂(y, f1, f2, ..., fm) / ∂(xj, x1, x2, ..., xm) = 0        for j = m+1, m+2, ..., n      (2-20)

The Jacobian determinant for the first equation above (j = m+1) is:

| ∂y/∂xm+1    ∂y/∂x1   ···   ∂y/∂xm  |
| ∂f1/∂xm+1   ∂f1/∂x1  ···   ∂f1/∂xm |
|     ···        ···   ···      ···  |   =  0      (2-21)
| ∂fm/∂xm+1   ∂fm/∂x1  ···   ∂fm/∂xm |

A total of n equations is solved for the stationary points, i.e., the (n-m) equations generated by (2-20) above and the m constraint equations. A derivation of these results is given by Beveridge and Schechter (6); it involves using Cramer's rule and eliminating the dxi's. Similar results are given for this general case in the text by Wilde and Beightler (4); however, a different nomenclature is used, and the results are extended to include Lagrange multipliers.

To illustrate the use of the Jacobian determinants, consider the following example, which obtains equation (2-18).

Example 2-4

optimize: y(x1,x2)

subject to: f(x1, x2) = 0

For this problem there are two independent variables (n = 2) and one constraint (m = 1), so the evaluation of one Jacobian determinant is required:

∂(y, f)/∂(x1, x2) = | ∂y/∂x1   ∂y/∂x2 |
                    | ∂f/∂x1   ∂f/∂x2 |  =  0

Expanding the determinant gives:

(∂y/∂x1)(∂f/∂x2) - (∂y/∂x2)(∂f/∂x1) = 0

This is the same as equation (2-18) which was solved with the constraint equation for the stationary point values of x1 and x2 in Example 2-3.

Lagrange Multipliers

The most frequently used method for constraints is to employ Lagrange multipliers. The technique will be presented using two independent variables and one constraint equation to illustrate the concepts. Then the procedure will be extended to the general case of n independent variables and m constraint equations. For the case of two independent variables, we have:

Optimize:       y(x1, x2)                                                    (2-22)

Subject to:     f(x1, x2) = 0

We want to show how the Lagrange multiplier arises and that the constrained problem can be converted into an unconstrained problem. The profit function and the constraint equation are expanded in a Taylor series. Then, using the first-order terms gives:

dy = (∂y/∂x1)dx1 + (∂y/∂x2)dx2

df = (∂f/∂x1)dx1 + (∂f/∂x2)dx2 = 0

This form of the constraint equation will be used to eliminate dx2 in the profit function. Solving for dx2 gives:

dx2 = -(∂f/∂x1 / ∂f/∂x2)dx1

This equation is substituted into the equation for dy to obtain:

dy = [∂y/∂x1 - (∂y/∂x2)(∂f/∂x1 / ∂f/∂x2)]dx1

and rearranging gives:

dy = [∂y/∂x1 + (-∂y/∂x2 / ∂f/∂x2)(∂f/∂x1)]dx1

Now we can define λ as the value of (-∂y/∂x2 / ∂f/∂x2) at the stationary point of the constrained function. This ratio of partial derivatives λ is a constant at the stationary point, and the above equation can be written as:

dy = [∂y/∂x1 + λ(∂f/∂x1)]dx1

At the stationary point dy = 0, and this leads to:

∂y/∂x1 + λ(∂f/∂x1) = 0

Now if L is defined as L = y + λf, the above gives:

∂L/∂x1 = 0

This is one of the necessary conditions to locate the stationary points of the unconstrained function L, which is constructed from the profit function y(x1,x2) and the constraint equation f(x1,x2) = 0. The same manipulations can be repeated, eliminating dx1 instead of dx2, to obtain the other necessary condition:

∂L/∂x2 = 0

Therefore, the constrained problem can be converted to an unconstrained problem by forming the Lagrangian, or augmented, function and solving this problem by the previously developed method of setting the first partial derivatives equal to zero. This gives two equations to solve for the three unknowns x1, x2, and λ at the stationary point. The third equation to be used is the constraint equation. In fact, the Lagrange multiplier is sometimes treated as another variable, since ∂L/∂λ = 0 gives the constraint equation. The example used for the method of constrained variation will be used to illustrate these ideas.

Example 2-5:

Find the stationary points for the following constrained problem using the method of Lagrange multipliers.

Optimize:         y(x) = x1x2

Subject to:       f(x) = x1² + x2² - 1 = 0

The Lagrangian, or augmented, function is formed as shown below.

L(x1, x2, λ) = x1x2 + λ(x1² + x2² - 1)

The following equations are obtained from setting the first partial derivatives equal to zero.

∂L/∂x1 = x2 + 2λx1 = 0

∂L/∂x2 = x1 + 2λx2 = 0

∂L/∂λ = x1² + x2² - 1 = 0

Solving the previous equations simultaneously gives the following stationary points:

maxima:  x1 = (½)½ ,   x2 = (½)½ ,    λ = -½

         x1 = -(½)½ ,  x2 = -(½)½ ,   λ = -½

minima:  x1 = (½)½ ,   x2 = -(½)½ ,   λ = ½

         x1 = -(½)½ ,  x2 = (½)½ ,    λ = ½

The types of stationary points, i.e., maxima, minima, or saddle points, were determined by inspection for this problem. Sufficient conditions for constrained problems will be discussed subsequently in this chapter.
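As a quick numerical check, not part of the original solution, the four stationary points and their multipliers can be substituted back into the three necessary-condition equations:

```python
import math

# Verify that each reported stationary point of Example 2-5 satisfies
# dL/dx1 = dL/dx2 = dL/dlambda = 0 for L = x1*x2 + lam*(x1**2 + x2**2 - 1).
a = math.sqrt(0.5)
points = [(a, a, -0.5), (-a, -a, -0.5), (a, -a, 0.5), (-a, a, 0.5)]
for x1, x2, lam in points:
    assert abs(x2 + 2*lam*x1) < 1e-12      # dL/dx1 = 0
    assert abs(x1 + 2*lam*x2) < 1e-12      # dL/dx2 = 0
    assert abs(x1**2 + x2**2 - 1) < 1e-12  # dL/dlambda = 0 (the constraint)
    print(f"y = {x1*x2:+.2f}, lambda = {lam:+.2f}")
```

The printed objective values make the classification by inspection visible: y = +0.50 at the two maxima and y = -0.50 at the two minima.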

The development of the Lagrangian function for the case of n independent variables and m constraint equations is a direct extension of that for two independent variables and one constraint equation, and Avriel (10) gives a concise derivation of this result (see Problem 2-14). The Lagrangian, or augmented, function is formed as previously, and for every constraint equation there is a Lagrange multiplier. This is shown below:

optimize: y(x)                   x = (x1, x2 ,..., xn)T                                            (2-23)

subject to: fi(x) = 0           for i = 1,2,...,m

where n > m

The Lagrangian, or augmented, function is formed from the constrained problem as follows:

L(x, λ) = y(x) + λ1f1(x) + λ2f2(x) + ··· + λmfm(x)                              (2-24)

To locate the stationary points of the constrained problem, the first partial derivatives of the Lagrangian function with respect to the xj's and λi's are set equal to zero (necessary conditions). There are (n + m) equations to be solved for the (n + m) unknowns: n xj's and m λi's.

It is sometimes said that the method of Lagrange multipliers requires more work than the method of constrained variation, since an additional m equations have to be solved for the values of the Lagrange multipliers. However, additional and valuable information is obtained from knowing the values of the Lagrange multipliers, as will be seen. The following simple example gives a comparison among the three techniques.

Example 2-6

For the process in Example 2-1 (2), it is necessary to maintain the product of the pressure and recycle ratio equal to 9000 psi. Determine the optimal values of the pressure and recycle ratio and the minimum cost within this constraint by direct substitution, constrained variation, and Lagrange multipliers.

Again, the problem is to minimize C.

C = 1000P + 4 x 10⁹/PR + 2.5 x 10⁵R

However, C is subject to the following constraint equation.

PR = 9000

Direct Substitution: Solving the constraint above for P and substituting into the objective function gives:

C = 9 x 10⁶/R + (4/9) x 10⁶ + 2.5 x 10⁵R

Setting dC/dR = 0 and solving gives:

R = 6 and P = 1500 psi.

The corresponding cost is:

C = 3.44 x 10⁶

which is greater than the unconstrained system, as would be expected.
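The direct-substitution result can be cross-checked with a brute-force scan over R; the scan range and grid spacing below are arbitrary assumptions, not from the text:

```python
def cost(P, R):
    # Cost function from Example 2-6: C = 1000P + 4e9/(PR) + 2.5e5*R
    return 1000*P + 4e9/(P*R) + 2.5e5*R

# Substitute P = 9000/R from the constraint and scan R from 1.00 to 20.00.
candidates = [r/100 for r in range(100, 2001)]
R_opt = min(candidates, key=lambda R: cost(9000/R, R))
print(R_opt, 9000/R_opt, round(cost(9000/R_opt, R_opt)))
# R = 6.0, P = 1500.0, C ~ 3.44 x 10^6
```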

Constrained Variation: For n = 2 and m = 1, equation (2-18) with f = PR - 9000 and the constraint are the equations to be solved for this case:

(1000 - 4 x 10⁹/P²R)(P) - (2.5 x 10⁵ - 4 x 10⁹/PR²)(R) = 0

PR - 9000 = 0

The first equation simplifies to:

P = 250R

which, when solved simultaneously with the second equation gives the same results as direct substitution.

Lagrange Multipliers: The Lagrangian, or augmented, function is:

L = 1000P + 4 x 10⁹/PR + 2.5 x 10⁵R + λ(PR - 9000)

Setting partial derivatives of L with respect to P, R, and l equal to zero gives:

1000 - 4 x 10⁹/P²R + λR = 0

2.5 x 10⁵ - 4 x 10⁹/PR² + λP = 0

PR - 9000 = 0

Solving the above simultaneously gives the same results as the two previous methods and a value for the Lagrange Multiplier.

P = 1500,      R = 6,     λ = -117.3

Method of Steepest Ascent

A further application of the method of Lagrange multipliers is in developing the method of steepest ascent (descent) for a function to be optimized. This result will be valuable when search methods are discussed.

To illustrate the direction of steepest ascent, a geometric representation is shown in Figure 2-5. To obtain the direction of steepest ascent, we wish to obtain the maximum value of dy, and y(x1,x2,...xn) is a function of n variables. Also, there is a constraint equation relating dx1, dx2, ... dxn and ds as shown in Figure 2-5 for two independent variables.

The problem is:

maximize:  dy = (∂y/∂x1)dx1 + (∂y/∂x2)dx2 + ··· + (∂y/∂xn)dxn                   (2-25)

subject to:  (dx1)² + (dx2)² + ··· + (dxn)² = (ds)²                             (2-26)

To obtain the maximum value of dy, the Lagrangian function is formed as follows:

L = (∂y/∂x1)dx1 + ··· + (∂y/∂xn)dxn + λ[(dx1)² + ··· + (dxn)² - (ds)²]          (2-27)

Differentiating L with respect to the independent variables dxj and equating to zero gives:

∂L/∂(dxj) = ∂y/∂xj + 2λdxj = 0        for j = 1, 2, ..., n                      (2-28)

These n equations are solved simultaneously with the constraint equation for the values of dxj and λ. Solving for λ gives:

λ = ±[(∂y/∂x1)² + (∂y/∂x2)² + ··· + (∂y/∂xn)²]½ / 2ds

and solving for dxj gives:

dxj = ±[ds / ((∂y/∂x1)² + (∂y/∂x2)² + ··· + (∂y/∂xn)²)½](∂y/∂xj)               (2-29)

The term in the brackets is not a function of j, and consequently dxj is proportional to ∂y/∂xj. The positive sign indicates the direction of steepest ascent, and the negative sign indicates the direction of steepest descent.

If a constant k is used to represent the term in the brackets in equation (2-29), this equation can be written as:

dxj = ±k(∂y/∂xj)                                                                (2-30)

If a finite difference approximation is used for dxj = (xj - xjo) and ∂y/∂xj is evaluated at xo, then the following equation gives the gradient line:

xj = xjo ± k(∂y/∂xj)|xo        for j = 1, 2, ..., n                             (2-31)

This equation can be written in vector notation in terms of the gradient of y evaluated at xo, ∇y(xo), as:

x = xo ± k∇y(xo)                                                                (2-32)

If the positive sign is used, then movement is along the line in the direction of steepest ascent, and if the negative sign is used then movement is along the line in the direction of steepest descent. The following example illustrates the use of the method of steepest descent on a simple function.

Example 2-7:

Find the minimum along the direction of steepest descent of the function given below starting at the point xo = (1,1).

y = x12 + x22

Gradient line (steepest descent):

x = xo - k∇y(xo)

or, for two independent variables:

x1 = x1o - k(∂y/∂x1)        x2 = x2o - k(∂y/∂x2)

Evaluating the partial derivatives at the starting point (1,1) gives:

∂y/∂x1 = 2x1 = 2        ∂y/∂x2 = 2x2 = 2

The gradient line is

x1 = 1 - 2k

x2 = 1 - 2k

Substituting the gradient line into the function to be minimized gives:

y  =  (1 - 2k)2   +  (1 - 2k)2  =  2(1 - 2k)2

Computing dy/dk and setting it equal to zero locates the minimum along the gradient line, i.e.:

dy/dk = -8(1 - 2k) = 0        or        k = ½

The corresponding values of x1 and x2 are

x1 = 1 - 2(½)   = 0                    x2 = 1 - 2(½) = 0

It turns out that the minimum along the gradient line is also the minimum for the function in this problem, because it is the sum of squares.
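Example 2-7 can be reproduced numerically; the sketch below replaces the exact line search with a fine scan over k, which is an implementation convenience rather than part of the text's method:

```python
# One steepest-descent step for y = x1**2 + x2**2 from xo = (1, 1):
# move along x = xo - k*grad_y(xo) and pick k that minimizes y along the line.
def y(x1, x2):
    return x1**2 + x2**2

def grad(x1, x2):
    return (2*x1, 2*x2)

x0 = (1.0, 1.0)
g1, g2 = grad(*x0)
k_best = min((i/1000 for i in range(1001)),
             key=lambda k: y(x0[0] - k*g1, x0[1] - k*g2))
x1, x2 = x0[0] - k_best*g1, x0[1] - k_best*g2
print(k_best, (x1, x2))  # 0.5 (0.0, 0.0): k = 1/2 lands on the origin
```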

The method of steepest ascent is the basis for several search techniques which are described in Chapter 6, e.g., Steep Ascent Partan. It should be noted that when dealing with physical systems, the direction of steepest ascent (descent) may be only a direction of steep ascent (descent) depending on the scales used to represent the independent variables. This is discussed and illustrated by Wilde (5), and Wilde and Beightler (4).

Economic Interpretation of the Lagrange Multipliers

The values of the Lagrange multipliers at the optimum provide additional and important information. If the constraint equations are written with parameters bi on the right-hand side, the Lagrange multipliers give the change in the profit function with respect to these parameters, i.e., ∂y/∂bi. Many times the right-hand sides of the constraint equations represent the availability of raw materials, demand for products, or capacities of process units. Consequently, it is frequently important to know how the optimal solution is affected by changes in availability, demand, and capacities. As we shall see, the Lagrange multipliers are given the names "shadow prices" and "dual activity" in linear programming, where these changes are analyzed by sensitivity analysis.

The following brief derivation obtains the result that ∂y/∂b = -λ for the case of one constraint and two independent variables; the extension to m constraint equations with n independent variables is comparable.

optimize:    y(x1, x2)                                                                                 (2-33)

subject to:  f(x1, x2) = b

First, we can obtain the following equation from the profit function by the chain rule:

∂y/∂b = (∂y/∂x1)(∂x1/∂b) + (∂y/∂x2)(∂x2/∂b)                                     (2-34)

Also, we can obtain the next equation from the constraint equation, written as f - b = 0, by the chain rule:

(∂f/∂x1)(∂x1/∂b) + (∂f/∂x2)(∂x2/∂b) - 1 = 0                                     (2-35)

Then the equation from the constraint is multiplied by the Lagrange multiplier λ and added to the equation from the profit function to give:

∂y/∂b = (∂y/∂x1 + λ ∂f/∂x1)(∂x1/∂b) + (∂y/∂x2 + λ ∂f/∂x2)(∂x2/∂b) - λ
      = (∂L/∂x1)(∂x1/∂b) + (∂L/∂x2)(∂x2/∂b) - λ                                 (2-36)

The values of ∂L/∂x1 and ∂L/∂x2 are zero at the stationary point (necessary conditions), and consequently ∂y/∂b = -λ. Thus, the change in the profit function y with respect to the right-hand side of the constraint b is equal to the negative of the Lagrange multiplier. Comparable results can be obtained for the case of n independent variables and m constraint equations using a similar procedure and arguments (7):

∂y/∂bi = -λi        for i = 1, 2, ..., m                                        (2-37)
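This interpretation can be illustrated numerically on Example 2-6, where b = 9000 and λ = -117.3. The sketch below reuses the relation P = 250R from that example's necessary conditions; the finite-difference step is an arbitrary assumption:

```python
import math

def C_min(b):
    # Minimum cost subject to P*R = b, using P = 250*R from Example 2-6.
    R = math.sqrt(b / 250)
    P = 250 * R
    return 1000*P + 4e9/(P*R) + 2.5e5*R

# Central finite difference of the optimal cost with respect to b.
db = 1.0
dC_db = (C_min(9000 + db) - C_min(9000 - db)) / (2*db)
print(round(dC_db, 1))  # 117.3, i.e., -lambda
```

Relaxing the constraint by one unit of b would lower the minimum cost by about $117, exactly the shadow-price reading of the multiplier.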

In the next section, we will see that the Lagrange Multiplier is also a key factor in the analysis of problems with inequality constraints.

Inequality Constraints

An additional complication arises when seeking the optimum value of a profit or cost function if inequality constraints are included. Although the same procedures are used, it will be necessary to consider two cases for each inequality constraint equation. One case is when the Lagrange Multiplier is zero, and the other is when the Lagrange Multiplier is not zero. This is best illustrated by the following example with one inequality constraint equation as shown below.

Optimize:       y(x)                                                                                     (2-38)

Subject to:     f(x) ≤ 0

As described previously, the procedure is to add a slack variable xs, as xs², and form the Lagrangian function:

L(x, λ) = y(x) + λ[f(x) + xs²]                                                 (2-39)

Then the first partial derivatives with respect to the xi's, xs, and λ are set equal to zero to obtain a set of equations to be solved for the stationary points. To illustrate the complication, the equation obtained for the slack variable is:

∂L/∂xs = 2λxs = 0                                                               (2-40)

The result is two cases: either λ = 0 and xs ≠ 0, or λ ≠ 0 and xs = 0. If λ = 0 and xs ≠ 0, the inequality holds, and the constraint is said to be "loose," "passive," or "inactive." If λ ≠ 0 and xs = 0, the equality holds, and the constraint is said to be "tight" or "active." The following example illustrates this situation using a modification of the previous simple process.

Example 2-8:

For the process the cost function is:

C = 1000P + 4 x 10⁹/PR + 2.5 x 10⁵R

However, C is subject to the inequality constraint equation.

PR ≤ 9000

Adding the slack variable S, as S², and forming the Lagrangian function gives:

L = 1000P + 4 x 10⁹/PR + 2.5 x 10⁵R + λ(PR + S² - 9000)

Setting the first partial derivatives of L with respect to P, R, S, and λ equal to zero gives the following four equations:

∂L/∂P = 1000 - 4 x 10⁹/P²R + λR = 0

∂L/∂R = 2.5 x 10⁵ - 4 x 10⁹/PR² + λP = 0

∂L/∂S = 2λS = 0

∂L/∂λ = PR + S² - 9000 = 0

The two cases are λ ≠ 0, S = 0 and λ = 0, S ≠ 0. For the case of λ ≠ 0, S = 0, the equality PR = 9000 holds, i.e., the constraint is active. This was the solution obtained in Example 2-6, and the results were:

C = $3.44 x 10⁶ per year              P = 1500 psi           R = 6           λ = -117.3

For the case of λ = 0, S ≠ 0, the constraint is an inequality, i.e., inactive. This was the solution obtained in Example 2-2, and the results were:

C = $3.0 x 10⁶ per year                 P = 1000 psi            R = 4           S = (5000)½

The example above had only one inequality constraint and two cases to consider. However, with several inequality constraints locating the stationary points can become time-consuming, for the possibilities must be searched exhaustively. A procedure for this evaluation has been given by Cooper (7) and Walsh (8) as follows:

1. Solve the problem of optimizing: y(x), ignoring the inequality constraints, i.e., having all positive slack variables. Designate this solution xo. If xo satisfies the constraints as inequalities, an optimum has been found.

2. If one or more constraints are not satisfied, select one of the constraints to be an equality, i.e., active (the slack variable for this constraint is zero), and solve the problem. Call this solution x1. If x1 satisfies all of the constraints, an optimum has been found.

3. If one or more constraints are not satisfied, repeat step 2 until every inequality constraint has been treated as an equality constraint (slack variable being zero) in turn.

4. If step 3 did not yield an optimum, select combinations of two inequality constraints at a time to be equalities, and solve the problem. If one of these solutions satisfies all of the constraints, an optimum has been found.

5. If step 4 did not yield an optimum, select combinations of three inequality constraints at a time to be equalities, and solve the problem. If one of these solutions satisfies all of the constraints, an optimum has been found. If not try combinations of four inequality constraints at a time to be equalities, etc.
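The steps above can be sketched for Example 2-8, where step 1 already yields a feasible point; a coarse grid search stands in for the calculus, and the grid ranges are assumptions:

```python
def cost(P, R):
    return 1000*P + 4e9/(P*R) + 2.5e5*R

grid = [(P, R/10) for P in range(500, 2001, 10) for R in range(10, 101)]

# Step 1: optimize ignoring the constraint PR <= 9000.
P0, R0 = min(grid, key=lambda pr: cost(*pr))
if P0 * R0 <= 9000:
    print("feasible, constraint inactive:", P0, R0, round(cost(P0, R0)))
else:
    # Step 2: treat the constraint as the equality PR = 9000 and re-solve.
    active = [(P, 9000/P) for P in range(500, 2001, 10)]
    P1, R1 = min(active, key=lambda pr: cost(*pr))
    print("constraint active:", P1, R1, round(cost(P1, R1)))
```

Here the unconstrained optimum (P = 1000, R = 4) gives PR = 4000 ≤ 9000, so the procedure stops at step 1, in agreement with the λ = 0, S ≠ 0 case of Example 2-8.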

The above procedure applies, assuming that the stationary point located is a maximum or a minimum of the constrained problem. However, there is a possibility that several stationary points will be located; some could be maxima, some minima, and others saddle points. In Problem 2-6, four stationary points were found; two are maxima, one a minimum, and one a saddle point. Also, from equation (2-40), for each inequality constraint where the strict inequality holds, the slack variable is positive and the Lagrange multiplier is zero. For each inequality constraint where the equality holds, the slack variable is zero and the Lagrange multiplier is not zero.

In the next section necessary and sufficient conditions for constrained problems are described to determine the character of stationary points. This will be similar to and an extension of the previous discussion for unconstrained problems.
