Chapter 2

CLASSICAL THEORY OF MAXIMA AND MINIMA

 
Necessary and Sufficient Conditions for Constrained Problems

       The necessary conditions have been developed by Kuhn and Tucker (14) for a general nonlinear optimization problem with equality and inequality constraints. This optimization problem written in terms of minimizing y(x) is:

Minimize:       y(x)                                        (2-41)

Subject to:    fi(x) ≥ 0 for i = 1, 2, ..., h         (2-42)

               fi(x) = 0 for i = h+1, ..., m          (2-43)

where y(x) and fi(x) are twice continuously differentiable real-valued functions.

     Any value of x that satisfies the constraint equations (2-42) and (2-43) is called a feasible solution in the Kuhn-Tucker theory. The Kuhn-Tucker necessary conditions are then used to locate points that can potentially be local minima of equation (2-41) while satisfying the constraint equations (2-42) and (2-43). These conditions are written in terms of the Lagrangian function for the problem, which is:

$$L(\mathbf{x}, \boldsymbol{\lambda}) = y(\mathbf{x}) + \sum_{i=1}^{h} \lambda_i \left[ f_i(\mathbf{x}) - x_{n+i}^2 \right] + \sum_{i=h+1}^{m} \lambda_i f_i(\mathbf{x}) \qquad (2\text{-}44)$$

where the xn+i's are the surplus variables used to convert the inequality constraints to equalities, i.e., fi(x) - xn+i² = 0 for i = 1, 2, ..., h.

The necessary conditions for a constrained minimum are given by the following theorem (7,8,10,14).

In order to minimize y(x) subject to fi(x) ≥ 0, i = 1, 2, ..., h and fi(x) = 0, i = h+1, ..., m, the necessary conditions for the existence of a relative minimum at x* are:

$$
\begin{aligned}
&1.\quad \frac{\partial L}{\partial x_j} = \frac{\partial y(\mathbf{x}^*)}{\partial x_j} + \sum_{i=1}^{m} \lambda_i \frac{\partial f_i(\mathbf{x}^*)}{\partial x_j} = 0 \qquad j = 1, 2, ..., n \\
&2.\quad f_i(\mathbf{x}^*) \ge 0 \qquad i = 1, 2, ..., h \\
&3.\quad f_i(\mathbf{x}^*) = 0 \qquad i = h+1, ..., m \\
&4.\quad \lambda_i f_i(\mathbf{x}^*) = 0 \qquad i = 1, 2, ..., h \\
&5.\quad \lambda_i \ge 0 \qquad i = 1, 2, ..., h \\
&6.\quad \lambda_i \ \text{unrestricted in sign} \qquad i = h+1, ..., m
\end{aligned} \qquad (2\text{-}45)
$$

     Examining these conditions: the first sets the first partial derivatives of the Lagrangian function with respect to the independent variables x1, x2, ..., xn equal to zero to locate the Kuhn-Tucker point x*. The second and third conditions repeat the inequality and equality constraint equations, which must be satisfied at the minimum of the constrained problem. The fourth condition is another way of expressing λixn+i = 0, i = 1, 2, ..., h, which comes from setting the partial derivatives of the Lagrangian function with respect to the surplus variables equal to zero. Either λi ≠ 0 and xn+i = 0 (constraint is active) or λi = 0 and xn+i ≠ 0 (constraint is inactive). Thus, setting the product of the Lagrange multiplier and the constraint equation equal to zero is an equivalent statement, and this is called the complementary slackness condition (15). The fifth condition comes from examining equation (2-37), i.e., ∂y(x*)/∂bi = -λi. The argument is that as bi is increased the constraint region is enlarged, and this cannot result in a higher value for y(x*), the minimum in the region. However, it could result in a lower value of y(x*), and correspondingly ∂y(x*)/∂bi would be negative, i.e., as bi increases, y(x*) could decrease. Therefore, if ∂y(x*)/∂bi is negative, then the Lagrange multiplier λi must be positive for equation (2-37) to be satisfied. These conditions also presume that a constraint qualification is satisfied, as will be discussed subsequently. For the sixth condition, it has been shown by Bazaraa and Shetty (15) that the Lagrange multipliers associated with the equality constraints are unrestricted in sign, and there is no argument comparable to the one given above for the Lagrange multipliers associated with the inequality constraints.
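The mechanics of the fourth condition can be verified symbolically. The following sympy sketch uses a hypothetical one-variable problem (not from the text) to form the surplus-variable Lagrangian of equation (2-44) and show that differentiating with respect to the surplus variable produces the complementary slackness condition.

import sympy as sp

x, s, lam = sp.symbols('x s lambda', real=True)

y = (x - 2)**2        # hypothetical objective to be minimized
f = x - 1             # inequality constraint written as f(x) = x - 1 >= 0

# Lagrangian of equation (2-44) with the inequality converted to an
# equality by the squared surplus variable s:
L = y + lam*(f - s**2)

# Stationarity with respect to x, s, and lambda:
equations = [sp.diff(L, v) for v in (x, s, lam)]
print(equations)   # diff(L, s) = -2*lambda*s, i.e., lambda*s = 0:
                   # either lambda = 0 (inactive) or s = 0 (active, f = 0)
print(sp.solve(equations, [x, s, lam], dict=True))

Solving the three stationarity equations returns both an interior solution (x = 2, λ = 0, constraint inactive) and a boundary solution (x = 1, s = 0, λ = 2, constraint active), illustrating the either/or structure described above.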

For the problem of maximizing y(x) subject to inequality and equality constraints, the problem is as follows:

Maximize:      y(x)                                                (2-46)

Subject to :   fi(x) ≤ 0 for i = 1, 2, ..., h               (2-47)

               fi(x) = 0 for i = h+1, ..., m                (2-48)


For this problem the Kuhn-Tucker conditions are:

$$
\begin{aligned}
&1.\quad \frac{\partial L}{\partial x_j} = \frac{\partial y(\mathbf{x}^*)}{\partial x_j} + \sum_{i=1}^{m} \lambda_i \frac{\partial f_i(\mathbf{x}^*)}{\partial x_j} = 0 \qquad j = 1, 2, ..., n \\
&2.\quad f_i(\mathbf{x}^*) \le 0 \qquad i = 1, 2, ..., h \\
&3.\quad f_i(\mathbf{x}^*) = 0 \qquad i = h+1, ..., m \\
&4.\quad \lambda_i f_i(\mathbf{x}^*) = 0 \qquad i = 1, 2, ..., h \\
&5.\quad \lambda_i \le 0 \qquad i = 1, 2, ..., h \\
&6.\quad \lambda_i \ \text{unrestricted in sign} \qquad i = h+1, ..., m
\end{aligned} \qquad (2\text{-}49)
$$

These conditions are the same as the ones for minimizing given by equation (2-45), except the inequality is reversed for the Lagrange Multipliers in the fifth condition. Also, the inequality constraints are written as less than or equal to zero for convenience in the subsequent discussion on sufficient conditions.

The following example illustrates the Kuhn-Tucker necessary conditions for a simple problem.

 

Example 2-9:

Locate the five Kuhn-Tucker points of the following problem, and determine their character, i.e., maximum, minimum, or saddle point.

optimize:         y = x1x2

subject to:       x1 + x2 ≤ 1

                 -x1 + x2 ≤ 1

                 -x1 - x2 ≤ 1

                  x1 - x2 ≤ 1

A diagram of the above equations is given in Figure 2-6. The function being optimized is the classic saddle point function, which is constrained by planes.

The first step in the procedure is to locate the stationary points by ignoring the inequality constraints, i.e., λ1 = λ2 = λ3 = λ4 = 0. If this point satisfies the constraints as inequalities, an optimum may have been found. For this problem:

$$\frac{\partial y}{\partial x_1} = x_2 = 0 \qquad\qquad \frac{\partial y}{\partial x_2} = x_1 = 0$$

The Kuhn-Tucker point is x⁰ = (0,0), and evaluating its character by the unconstrained sufficiency conditions gives the following result.

$$\frac{\partial^2 y}{\partial x_1^2} = 0 \qquad \frac{\partial^2 y}{\partial x_2^2} = 0 \qquad \frac{\partial^2 y}{\partial x_1 \partial x_2} = 1$$

$$\left(\frac{\partial^2 y}{\partial x_1^2}\right)\left(\frac{\partial^2 y}{\partial x_2^2}\right) - \left(\frac{\partial^2 y}{\partial x_1 \partial x_2}\right)^2 = (0)(0) - (1)^2 = -1 < 0$$

The point x⁰ = (0,0) is a saddle point, and the constraints are satisfied.

Proceeding to step two, one constraint equation at a time is selected, and the character of the Kuhn-Tucker point is determined. Beginning with the first constraint equation as an equality, i.e., λ1 ≠ 0, and considering the other three as inequalities, i.e., λ2 = λ3 = λ4 = 0, gives the following equation for the Lagrangian function:

$$L = x_1 x_2 + \lambda_1 (x_1 + x_2 - 1)$$

$$\frac{\partial L}{\partial x_1} = x_2 + \lambda_1 = 0 \qquad \frac{\partial L}{\partial x_2} = x_1 + \lambda_1 = 0 \qquad \frac{\partial L}{\partial \lambda_1} = x_1 + x_2 - 1 = 0$$

Solving gives:

x1 = 1/2,            x2 = 1/2,            λ1 = -1/2,            y(1/2, 1/2) = 1/4

The sign of the Lagrange Multiplier is negative, and by the Kuhn-Tucker necessary conditions, the point can be a maximum, since the other constraint equations are satisfied as inequalities.

The procedure is repeated for the other three constraint equations, each considered individually as an equality. The results for the Kuhn-Tucker points are summarized in the following table:

y:           1/4     -1/4      1/4     -1/4       0

x1:          1/2     -1/2     -1/2      1/2       0

x2:          1/2      1/2     -1/2     -1/2       0

λ:          -1/2      1/2     -1/2      1/2       0

Character:   max      min      max      min     saddle

The character of each of the stationary points is based on the Kuhn-Tucker necessary conditions. By examining Figure 2-6, we can confirm their character. However, constraint qualifications and sufficient conditions are needed to give a general method, and these are discussed next.
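The four constrained cases can be reproduced with a few lines of sympy. The sketch below (illustrative, with the constraints written as f(x) ≤ 0 and the Lagrangian formed as y + λf to match the example) solves each single-constraint stationarity system and classifies the point by the sign of the multiplier, following the fifth condition of equation (2-49).

import sympy as sp

x1, x2, lam = sp.symbols('x1 x2 lambda', real=True)
y = x1*x2

# The four constraints of Example 2-9, each written as f(x) <= 0:
constraints = [x1 + x2 - 1, -x1 + x2 - 1, -x1 - x2 - 1, x1 - x2 - 1]

for f in constraints:
    L = y + lam*f   # one constraint taken as an equality at a time
    sol = sp.solve([sp.diff(L, x1), sp.diff(L, x2), f],
                   [x1, x2, lam], dict=True)[0]
    character = 'max' if sol[lam] < 0 else 'min'   # fifth condition of (2-49)
    print(sol, ' y =', y.subs(sol), ' ', character)

The printed multipliers and objective values match the table above; the unconstrained point at the origin is the saddle found in step one.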

     In the theorems developed by Kuhn and Tucker (14), the constraint equations must satisfy certain conditions at the Kuhn-Tucker points, and these conditions are called constraint qualifications. As given in Bazaraa and Shetty (15), there are several forms of constraint qualifications; and one, according to Gill et al. (16), is important for nonlinear constraints. This is the condition that the gradients of the constraint equations active at the Kuhn-Tucker point are linearly independent. This constraint qualification is required for the necessary conditions given by equations (2-45) and (2-49). As an example, Kuhn and Tucker (14) constructed the constraint equations:

f1 = (1 - x1)³ - x2 ≥ 0,

f2 = x1 ≥ 0,

f3 = x2 ≥ 0

These do not satisfy this condition at the point x1* = 1 and x2* = 0, where f1 and f3 are the active constraints (f2 = x1 = 1 > 0 is inactive). At this point ∇f1 = [-3(1 - x1)², -1] = (0, -1) and ∇f3 = (0, 1) are not linearly independent. At such a point the necessary conditions may fail to hold, and Kuhn and Tucker (14) give arguments that this constraint qualification is required to ensure the existence of the Lagrange multipliers at the optimum point. Verification of the constraint qualifications for a general nonlinear programming problem is almost an impossible task, according to Avriel (10). He states that fortunately, in practice, the constraint qualification usually holds, and it is justifiable to use the existence of the Lagrange multipliers as a basis for having the necessary conditions hold.
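This linear-independence check is easy to carry out numerically. The following numpy sketch evaluates the gradients of the two constraints active at x* = (1, 0) and confirms that they do not have full rank.

import numpy as np

x1, x2 = 1.0, 0.0

# Gradients of the active constraints f1 = (1 - x1)**3 - x2 and f3 = x2:
grad_f1 = np.array([-3.0*(1.0 - x1)**2, -1.0])   # evaluates to (0, -1)
grad_f3 = np.array([0.0, 1.0])

A = np.vstack([grad_f1, grad_f3])
print(np.linalg.matrix_rank(A))   # 1, not 2: linearly dependent, so the
                                  # constraint qualification fails at (1, 0)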

The same concepts used for unconstrained problems are followed to develop the sufficient conditions for constrained problems. This involves expanding the Lagrangian function in a Taylor series about the Kuhn-Tucker point located using the necessary conditions. The Taylor series is simplified by neglecting third and higher order terms to give a function that contains only terms involving second partial derivatives evaluated at the Kuhn-Tucker point. This gives a differential quadratic form, and a test similar to the one for the unconstrained problem is obtained to determine if the Kuhn-Tucker point is a maximum, minimum, or saddle. The sufficient conditions for the case of both inequality and equality constraints are more elaborate than if only equality constraints are involved. We have space to give only the appropriate theorems and describe their development and use. Further details are given by Avriel (10), Bazaraa and Shetty (15), and Reklaitis et al. (17).

Considering the case of only equality constraints first, the Lagrangian function for n independent variables and m equality constraint equations is given by the following equation.

$$L(\mathbf{x}, \boldsymbol{\lambda}) = y(\mathbf{x}) + \sum_{i=1}^{m} \lambda_i f_i(\mathbf{x}) \qquad (2\text{-}50)$$

Expanding the Lagrangian function in a Taylor series about the Kuhn-Tucker point x* gives:

$$L(\mathbf{x}, \boldsymbol{\lambda}^*) = L(\mathbf{x}^*, \boldsymbol{\lambda}^*) + \sum_{j=1}^{n} L_{x_j}(\mathbf{x}^*, \boldsymbol{\lambda}^*)(x_j - x_j^*) + \frac{1}{2} \sum_{j=1}^{n} \sum_{k=1}^{n} L_{x_j x_k}(\mathbf{x}^*, \boldsymbol{\lambda}^*)(x_j - x_j^*)(x_k - x_k^*) + \cdots \qquad (2\text{-}51)$$

This equation is comparable to equation (2-8), and subscripts xi, xj, and xk indicate partial differentiation with respect to those variables. Again, the first partial derivatives are zero at the Kuhn-Tucker point by the necessary conditions, and x is selected sufficiently close to x* that the higher-order terms are negligible compared to the second-order terms. Also, because the equality constraints are satisfied at both x and x*, L(x, λ*) = y(x) and L(x*, λ*) = y(x*). This gives the following equation, which is comparable to equation (2-9) for the unconstrained case.

$$y(\mathbf{x}) - y(\mathbf{x}^*) = \frac{1}{2} \sum_{j=1}^{n} \sum_{k=1}^{n} L_{x_j x_k}(\mathbf{x}^*, \boldsymbol{\lambda}^*)(x_j - x_j^*)(x_k - x_k^*) \qquad (2\text{-}52)$$

As previously, we need to determine if the term on the right-hand side remains positive (minimum), remains negative (maximum), or changes sign (saddle point) for small feasible changes in x about x*. This term is a differential quadratic form.

To determine if the quadratic form is always positive or always negative, results comparable to those given by equation (2-7) are required, with the extension that the constraints also be satisfied, i.e., for feasible values of x. A theorem is given by Avriel (10) which establishes these conditions, and this theorem is then applied to the differential quadratic form of the Lagrangian function. The result, after Avriel (10), is the following theorem for the sufficient conditions of the optimization problem with only equality constraints. In this theorem the second partial derivatives of the Lagrangian function evaluated at the Kuhn-Tucker point x* are L_{xjxk}(x*, λ*) and are written as Ljk. Also, the first partial derivatives of the constraint equations evaluated at the Kuhn-Tucker point x* are ∂fj(x*)/∂xk and are written as fjk.

Let y(x) and fi(x) = 0, i = 1, 2, ..., m, be twice continuously differentiable real-valued functions. If there exist vectors x* and λ* such that:

$$\frac{\partial L(\mathbf{x}^*, \boldsymbol{\lambda}^*)}{\partial x_j} = 0 \qquad j = 1, 2, ..., n$$

and

$$(-1)^m \Delta_p > 0 \quad\text{where}\quad \Delta_p = \begin{vmatrix} L_{11} & \cdots & L_{1p} & f_{11} & \cdots & f_{m1} \\ \vdots & & \vdots & \vdots & & \vdots \\ L_{p1} & \cdots & L_{pp} & f_{1p} & \cdots & f_{mp} \\ f_{11} & \cdots & f_{1p} & 0 & \cdots & 0 \\ \vdots & & \vdots & \vdots & & \vdots \\ f_{m1} & \cdots & f_{mp} & 0 & \cdots & 0 \end{vmatrix}$$


for p = m+1, ..., n, then y(x) has a strict local minimum at x*, such that:

fi(x*) = 0,        i = 1, 2, ..., m

The proof of this theorem is given by Avriel (10) and follows the discussion of the concepts given above. The comparable result for a strict maximum is obtained by changing (-1)^m in the above theorem to (-1)^p, according to Avriel (10). The following example illustrates the application of this theorem.

 

Example 2-10

Consider the following problem.

optimize:         x1² + 2x2² + 3x3²

subject to:       x1 + 2x2 + 4x3 - 12 = 0

                  2x1 + x2 + 3x3 - 10 = 0

Forming the Lagrangian function and differentiating with respect to x1, x2, x3, l1, and l2 gives the following set of equations to be solved for the Kuhn-Tucker point.

Lx1 = 2x1 + λ1 + 2λ2 = 0

Lx2 = 4x2 + 2λ1 + λ2 = 0

Lx3 = 6x3 + 4λ1 + 3λ2 = 0

Lλ1 = x1 + 2x2 + 4x3 - 12 = 0

Lλ2 = 2x1 + x2 + 3x3 - 10 = 0

 

Solving the above equation set simultaneously gives the following values for the Kuhn-Tucker point.

x1 = 112/81,         x2 = 118/81,         x3 = 52/27,         λ1 = -80/27,        λ2 = 8/81
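Because the stationarity equations are linear, the solution is easy to cross-check numerically; the following numpy sketch solves the 5 x 5 system above and recovers the same fractions.

import numpy as np
from fractions import Fraction

# Unknowns ordered as (x1, x2, x3, lambda1, lambda2):
A = np.array([[2, 0, 0, 1, 2],
              [0, 4, 0, 2, 1],
              [0, 0, 6, 4, 3],
              [1, 2, 4, 0, 0],
              [2, 1, 3, 0, 0]], dtype=float)
b = np.array([0, 0, 0, 12, 10], dtype=float)

sol = np.linalg.solve(A, b)
print([Fraction(v).limit_denominator(100) for v in sol])
# [112/81, 118/81, 52/27, -80/27, 8/81]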

From the necessary conditions of equations (2-45) or (2-49), the Lagrange multipliers are unrestricted in sign, and the value of the determinants from the theorem on sufficiency conditions is required to determine the character of this point. The partial derivatives needed for this evaluation are:

L11 = 2        L12 = 0        L13 = 0

L21 = 0        L22 = 4        L23 = 0

L31 = 0        L32 = 0        L33 = 6

f11 = 1         f12 = 2         f13 = 4

f21 = 2         f22 = 1         f23 = 3

The determinant is (m = 2, n = 3, p = 3, only one in this case):

$$D_3 = \begin{vmatrix} 2 & 0 & 0 & 1 & 2 \\ 0 & 4 & 0 & 2 & 1 \\ 0 & 0 & 6 & 4 & 3 \\ 1 & 2 & 4 & 0 & 0 \\ 2 & 1 & 3 & 0 & 0 \end{vmatrix} = 162$$

The value of D3 is positive, so (-1)² D3 > 0, and the Kuhn-Tucker point is a minimum.
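The same determinant can be assembled and evaluated with numpy, a convenient check when the bordered matrix is larger than hand calculation comfortably allows; a sketch for this example:

import numpy as np

L = np.diag([2.0, 4.0, 6.0])        # L_jk evaluated at the Kuhn-Tucker point
F = np.array([[1.0, 2.0, 4.0],      # f_jk, first derivatives of the constraints
              [2.0, 1.0, 3.0]])

# Border the Hessian of the Lagrangian with the constraint gradients:
D = np.block([[L, F.T],
              [F, np.zeros((2, 2))]])

D3 = np.linalg.det(D)
print(D3)                    # 162.0
print((-1)**2 * D3 > 0)      # True: (-1)^m D_p > 0 for p = 3, a strict minimum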

     The sufficient conditions for problems with equality and inequality constraints, equations (2-41), (2-42), and (2-43), are summarized in the following theorem. There are a number of mathematical concepts and theorems required to obtain this result. These are given in some detail by Avriel (10), Bazaraa and Shetty (15), and Reklaitis et al. (17); it is not feasible to describe them in the space available here.

Let y(x), fi(x) ≥ 0, i = 1, 2, ..., h, and fi(x) = 0, i = h+1, ..., m, be twice continuously differentiable real-valued functions. If there exist vectors x* and λ* satisfying

the Kuhn-Tucker conditions of equation (2-45), and if

$$\mathbf{z}^T \, \nabla_x^2 L(\mathbf{x}^*, \boldsymbol{\lambda}^*) \, \mathbf{z} > 0$$

for every nonzero vector z such that:

∇fi(x*)ᵀz = 0 for each inequality constraint active at x* with λi > 0,

∇fi(x*)ᵀz ≥ 0 for each inequality constraint active at x* with λi = 0, and

∇fi(x*)ᵀz = 0 for i = h+1, ..., m,

then y(x) has a strict local minimum at x*.

The following example illustrates the application of this theorem.

 

Example 2-11

Consider the following problem after Reklaitis et al. (17).

Minimize:    (x1 - 1)² + x2²

Subject to : -x1 + x2² ≥ 0

Applying the theorem gives:

2(x1 - 1) + λ = 0

2x2 - 2x2λ = 0

λ(-x1 + x2²) = 0

λ ≥ 0

Solving this set of equations gives x1 = 0, x2 = 0, λ = 2 for the Kuhn-Tucker point. Then applying the sufficient conditions at x* = (0,0), the quadratic form

2z1² - 2z2² > 0

must hold for every nonzero z satisfying the feasible-direction condition ∇f(x*)ᵀz = -z1 = 0. With z1 = 0, the quadratic form reduces to -2z2², which is never positive, so the second-order sufficiency conditions show that the point is not a minimum.
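The second-order test can be verified with sympy. The sketch below (written with the Lagrangian in the same form as the example's equations, L = y - λf) builds the quadratic form and restricts it to directions tangent to the active constraint at x* = (0, 0).

import sympy as sp

x1, x2, lam, z1, z2 = sp.symbols('x1 x2 lambda z1 z2', real=True)

y = (x1 - 1)**2 + x2**2
f = -x1 + x2**2               # constraint f(x) >= 0, active at (0, 0)
L = y - lam*f                 # Lagrangian form matching the example's equations

point = {x1: 0, x2: 0, lam: 2}
H = sp.hessian(L, (x1, x2)).subs(point)                   # [[2, 0], [0, -2]]
grad_f = sp.Matrix([f.diff(x1), f.diff(x2)]).subs(point)  # (-1, 0)

z = sp.Matrix([z1, z2])
quad_form = (z.T * H * z)[0]              # 2*z1**2 - 2*z2**2
z1_tangent = sp.solve(grad_f.dot(z), z1)  # -z1 = 0, so z1 = 0
print(quad_form.subs(z1, z1_tangent[0]))  # -2*z2**2: never positive, not a minimum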

     In summary, the necessary and sufficient conditions for nonlinear programming problems have been described and illustrated with examples. References have been given for further details of this theory.

     An important special case occurs when the economic model is concave and all of the constraint equations are convex inequalities. This is known as convex programming. "A function is concave if linear interpolation between its values at any two points of definition yields a value not greater than its actual value at the point of interpolation; such a function is the negative of a convex function," according to Kuhn and Tucker (14). Thus, the convex programming problem can be stated as follows.

Maximize:   y(x)                                            (2-53)

Subject to:  fi(x) ≤ 0      for i = 1, 2, ..., m      (2-54)

The necessary and sufficient conditions for the maximum of the concave function y(x) subject to the convex constraints fi(x) ≤ 0, i = 1, 2, ..., m, are the Kuhn-Tucker conditions given below:

$$
\begin{aligned}
&\frac{\partial L(\mathbf{x}^*, \boldsymbol{\lambda}^*)}{\partial x_j} = 0 \qquad j = 1, 2, ..., n \\
&f_i(\mathbf{x}^*) \le 0 \qquad i = 1, 2, ..., m \\
&\lambda_i f_i(\mathbf{x}^*) = 0 \qquad i = 1, 2, ..., m \\
&\lambda_i \le 0 \qquad i = 1, 2, ..., m
\end{aligned} \qquad (2\text{-}55)
$$

The theorem from Cooper (7) that establishes the above result is:

If y(x) is a strictly concave function and fi(x), i = 1, 2, ..., m are convex functions which are continuous and differentiable, then the Kuhn-Tucker conditions (2-49) are sufficient as well as necessary for a global maximum.

The proof of this theorem uses the definition of convex and concave functions and the fact that the Lagrangian function can be formulated as the sum of concave functions, which is concave.
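In practice this guarantee means a convex program can be solved reliably with a local method. The following sketch (a hypothetical problem, not from the text, using scipy's SLSQP solver) maximizes a concave quadratic subject to a convex linear inequality by minimizing its negative; by the theorem above, the Kuhn-Tucker point found is the global maximum.

import numpy as np
from scipy.optimize import minimize

# Maximize y(x) = -(x1 - 1)^2 - (x2 - 2)^2 subject to x1 + x2 - 2 <= 0.
def neg_y(x):
    return (x[0] - 1.0)**2 + (x[1] - 2.0)**2   # minimize -y(x)

# SLSQP expects inequality constraints in the form g(x) >= 0:
cons = [{'type': 'ineq', 'fun': lambda x: 2.0 - x[0] - x[1]}]

res = minimize(neg_y, x0=np.zeros(2), method='SLSQP', constraints=cons)
print(res.x)   # approximately [0.5, 1.5], the unique global maximum of y(x)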

     These concepts and results for the Kuhn-Tucker conditions and those given previously will be valuable in our discussion of modern optimization procedures in the following chapters. Those interested in further theoretical results are referred to the references at the end of the chapter and Chapter 6. Also, in industrial practice we will see that the concepts from the Kuhn-Tucker conditions are used in computer programs for advanced multivariable search methods to optimize economic and process models which are too elaborate for the algebraic manipulations required to use these theories directly.
