SINGLE VARIABLE SEARCH TECHNIQUES
With a sequential search, it is possible to make use of previous information to locate subsequent experiments. The result is a significantly larger reduction in the final interval of uncertainty for the same number of experiments. The most efficient one of the sequential search plans is called Fibonacci search, from the famous sequence of numbers that appears in the equations of this procedure. This minimax method requires that the number of experiments be specified in advance, which may be inconvenient. Almost as efficient as Fibonacci search is golden section search, and this method does not require that the number of experiments be specified in advance. A third, related search plan, lattice search, is designed for functions that are defined only at discrete points. These methods, which place experiments one at a time, will be developed in the following paragraphs. However, the theory is available to place any number of experiments simultaneously in an interval and repeat this placement sequentially. These methods are called even block, odd block, and golden block search. Their description and minimax proofs are given by Wilde and Beightler (2) and Avriel (3). It is important to know that these other methods are available for use, but they will not be discussed here, for their application to the solution of industrial problems has been limited.
This search technique is considered to be the best one of the minimax methods. It has the largest interval reduction of all of the procedures, but it requires that the number of experiments be specified in advance. The approach to developing this algorithm will be after that of Wilde (1) and Wilde and Beightler (2), where a dynamic programming approach is used. The derivation begins by placing the last two experiments optimally in the interval preceding the final interval of uncertainty. This is a simultaneous search with two experiments. Then the development determines the location of each preceding experiment to arrive at the location of the first two experiments. In Figure 5-8 the locations of the last two experiments, xn and xn-1, are shown. They are placed symmetrically about the center of the interval preceding the final interval and are separated by a distance equal to the resolution, dIn (= eIo). The final interval is In, the one preceding it is In-1, and d is the fractional resolution based on the final interval of uncertainty. For convenience, at the start of the derivation one end of the interval In-1 is the left boundary of the initial interval, Io, and the other end must be the location of experiment xn-2, i.e., y(xn-1) > y(xn-2) for maximizing. Evaluating the results of experiments xn and xn-1 will determine the location of the final interval, but not its length. The length of In is determined by the minimax simultaneous search of uniform pairs given by equation (5-10) for two experiments. This equation can be written as follows:
We can now locate experiment xn-3 by considering that had y(xn-1) < y(xn-2), then xn would have been located to the right of xn-2 by a distance equal to dIn for maximizing. This is shown in Figure 5-8, and experiment xn-3 has to be located a distance of In-1 to the right of xn-1. In addition, the final interval, In, could be located at any one of the four positions shown for In in Figure 5-8, depending on the outcomes of xn-2, xn-1 and xn. We can locate experiment xn-4 by considering that had y(xn-2) < y(xn-3), then xn-4 would have to be located to the right of xn-3 by a distance equal to In-2. This is shown in Figure 5-8 and insures that the final interval is In. This also shows that the interval In-2 is the sum of the final interval In and In-1, i.e.:
This equation can be combined with equation (5-31) to give
This reasoning can be repeated to locate points xn-5, xn-6,..., and the following results will be obtained, i.e., an interval is equal to the sum of the two following intervals.
Repeated substitution of the equations determined like equation (5-34) give the following results in terms of In for the equations (5-35).
The generalization of equation (5-35) is:
and the generalization of equation (5-36) is:
The Aj's are a famous sequence of numbers dating back to 1202 (see Wilde (1) for details), and are called the Fibonacci numbers. The sequence can be generated by the following equations:
which gives the first twelve values of the Fibonacci numbers. We are now in a position to obtain the equation to calculate the length of the final interval of uncertainty, In, knowing the length of the initial interval, Io, and to determine the placement of the first two experiments. The value of n - j = 1 is used in equation (5-38) to determine I1 as given below.
However, it is more convenient to have the equations in terms of the resolution based on the initial interval, eIo. The substitution of eIo = dIn is made, and I1 is equal to Io, because two experiments are required for interval reduction. Equation 5-40 becomes:
Thus, knowing the initial interval, Io, the fractional resolution based on the initial interval, e, and the number of experiments, n, we determine the length of the final interval of uncertainty, In, by equation (5-41). The location of this final interval is determined by the outcome of the experiments. The location of the first two experiments can now be determined using equation (5-38). Both of the first two experiments are needed to determine the part of the interval that is to be retained for further experiments. The first experiments are placed according to n-j = 2 or j = n-2, and substituting into equation (5-38) gives.
Again, it is convenient to have the above equations in terms of the initial interval Io and the corresponding resolution e. Making the substitution for dIn = eIo and recognizing that x2 can be located a distance I2 from the left hand side of the interval gives:
The location of x2 is shown in Figure 5-8, along with the location of x1. The position for x1 is symmetrically to x2 a distance I2 from the right-hand side of the interval. The position for x1 can be computed by having n-j = 3 in equation (5-38). The result is:
The procedure continues after the first two experiments are evaluated by discarding the interval that does not contain the optimum. The third experiment is placed in the interval symmetrically to the one in the interval. Evaluating this experiment permits discarding another section of the interval that does not contain the optimum. The procedure is continued by placing the remaining experiments symmetrically to the previous one with the best value until the final experiment is placed. This final experiment will be symmetrical in the final interval with the previous one with the best value, and they will be separated by a distance equal to the resolution, eIo = dIn. The following example illustrates the procedure using the function of Example 5-1. Example 5-2
The classical theory solution was given in Example 5-1, and the optimum is located at x = 3/4 with a value of y(3/4) = 5.25. This is compared the best value of 5.246 located at x = 0.78 with four sequential experiments using Fibonacci search. A rapid interval reduction is obtained with Fibonacci search. This can be illustrated with a simplified form of equation (5-41) with e = 0 which is: For 11 experiments the initial interval is reduced by 144, because A12 = 144. This rapid reduction in the interval containing the optimum, along with the simple procedure of placing experiments symmetrically in the interval to continue the search, and its basis in the minimax principle, has made Fibonacci search one of the most widely used procedures. Table 5-1 gives a program for Fibonacci search, along with the input from Example 5-2 and the corresponding output. The search technique to be discussed next is a modification of Fibonacci search. Golden section search sacrifices being a minimax technique to relax the requirement that the number of experiments be specified in advance. This program performs a Fibonacci search to locate the maximum of the function given in the function FUNC in the interval between LBOUND and HBOUND. The input to this program is in free format, and it consists of the resolution desired (RESOL), the number of experiments (EXPNO), and the low and high bounds (LBOUND, HBOUND). The output includes the value of the function and the bounds on the interval of uncertainty for each application of the algorithm. The input and output shown with the program are the results obtained using the problem in example 5-2. To maximize a new function, supply the equation in the function FUNCT.
This method is used when the number of experiments is not known in advance. The search starts at essentially the same place as Fibonacci search when more than about six Fibonacci experiments are specified, i.e., x1 and x2 are located at the same position, x1 = 0.382, x2 = 0.618, based on a unit initial interval. The idea is to keep the procedure of placing experiments symmetrically in the interval to rapidly reduce the section that contains the optimum. To do this equation (5-37) is used, and the ratio of successive intervals is maintained constant. Dividing equation (5-37) by In-(j-2) gives: The ratio of successive intervals t is defined as: Using the following relation This equation can be combined with equation (5-46) to give:
The solution of this quadratic equation is given by Wilde(1) as:
To begin the search, the second (or first) experiment is located using equation (5-47) at:
and the first (or second) experiment is located symmetrically in the interval as:
and after n experiments the final interval of uncertainty is: The following example illustrates the procedure using the same function in the prior examples.
Example 5-3
A rapid interval reduction was obtained with golden section search also. For eleven experiments the initial interval would be reduced by 123, compared with 144 for Fibonacci search. Both techniques have been widely used in industrial applications. In the next section we will briefly describe another extension of Fibonacci search, called lattice search, which applies to functions which are defined only at discrete values of the independent variables.
This method has been developed for the problem where the independent variable takes on only discrete values. Thus, it is necessary to have a search method that searches on these discrete values. Examples could be the determination of the optimum route for an aircraft or the optimum number of temperature or pressure measuring devices for a process. The approach is to modify the Fibonacci search technique such that each experiment falls on a discrete value of the independent variable as the search proceeds. This means that the first two experiments must start on discrete values of the independent variable, called lattice points, and the final two experiments are adjacent to each other at lattice points. Beginning with equation (5-41) it is simplified for e = 0, i.e., the resolution is zero. This gives the following equation.
For this equation Io is the number of lattice points plus one, and In is equal to one to have the last two experiments adjacent to each other. However, these conditions overdetermine equation (5-54), and another approach is as follows. To have In be equal to one, the number of experiments is determined by selecting the Fibonacci number, An+1, which is equal to or greater Io plus one. Usually, the Fibonacci number will not match the comparable value of Io plus one, and this value will have to be increased by adding fictitious points to Io. These points are usually added to either the left or right end of the interval. However, they can be located in any convenient place as long as they are not one of the points that must be evaluated during the search. To start the search, equation (5-43) is used with e = 0, i.e.:
Using equation (5-54) to have this equation in terms of Io gives:
Consequently, the second experiment is located at the lattice point corresponding to An. The first experiment is located symmetrically to x2 in the initial interval. Then the function is evaluated at the two points, and the third and subsequent experiments are placed symmetrically in the interval remaining. The last two experiments will differ by one, and the optimum lattice point will have been located. The following example illustrates the procedure using the function of the previous examples and precision points. Example 5-4
A further discussion of this procedure is given by Wilde(1). Now our attention will turn to extending the use of these methods on an open interval in the next section.
Many multivariable search techniques search on a line in space as they attempt to move from a starting point through better points to arrive at an optimum, as will be seen in the next chapter. This requires a single-variable search, where one end of the interval is known, and the other is to be found in order to have the optimum bounded in the interval. The approach is to attempt to place two experiments along the line of search to bound the optimum and, at the same time, have the experiments be in the correct position to continue a Fibonacci or golden section search. The compromise is one of attempting to select an interval that is large enough to ensure that the optimum is bounded, but no so large that excessive effort is required to reduce the interval to an acceptable final value. Also, if the optimum is not bounded with two experiments, provision can be made to have a third experiment placed such that it is in the proper position to continue the search. The second experiment would be used with the third one for the two required in the search technique. Once the optimum is bounded, then the previously described procedures are used to reduce the size of the interval to an acceptable value. In addition to illustrating the above procedure for bounding an optimum in an open interval, we want to illustrate how this procedure can be used in a multivariable search procedure. The gradient line developed in Chapter 2 will be used for this purpose. A line in space can be represented in vector notation by the following equation.
where a is called the parameter of the line. The components of this vector equation can be written using the components of x, and known points a and b as:
For the value of a = 0, x is equal to a and for a = 1, x is equal to b. Values of a between zero and one generate values of x between a and b, values of a greater than one generate values of x beyond b, and values of a less than zero generate values of x beyond a. The gradient line was given by equation (2-32) and is rewritten here using the plus sign to have the direction of steep ascent as:
In this case values of the parameter a greater than zero generate points x in the direction of steep ascent. The problem of maximizing y(x), a multivariable function, is converted to maximizing y[xo + aÑy(xo)], a function of a only, along the line of steep ascent starting at point xo. Consequently, there is an open interval along the gradient line beginning at xo, and it is necessary to bound the maximum by selecting two values of a. Then a Fibonacci or golden section search can be conducted on a to determine the location of final interval and values of y(x) as previously described. The following simple example illustrates the use of Fibonacci search in placing experiments on a line in space. Example 5-5
Golden section search will be used to illustrate the procedures to bound the optimum on an open interval. In this case it will not be necessary to specify the number of experiments, and a stopping criteria can be used based on the computed values of yn+1 and yn, e.g., y[xo + an+1 Ñy(xo)] - y[xo + anÑy(xo)] < s, where s is a specified small number. The first two experiments are located using equations (5-51) and (5-52), which are:
The problem is one of specifying Io sufficiently large that y(x1) > y(x2) for maximizing to have the maximum bounded between zero and x2. If this does not happen, then x2 will have to become x1, and a third experiment will have to be placed. The location of this experiment is given by the following equation obtained using the two previous equations.
This should result in y(x2) > y(x3), have the maximum bounded and have x2 and x3 in the proper location to continue the search on the interval x1 < x* < x3. Then golden section search can be conducted placing experiments symmetrically in the interval until the stopping criteria is reached. Wilde and Beightler (2) have an expanded discussion of this procedure giving the generalization of equation (5-60) for h measurements using Fibonacci search. Their conclusion is "that underestimation of the reduction ratio needed costs twice as many extra measurements as overestimation by the same amounts." The following example will illustrate the procedure for an unbounded function represented graphically. Example 5-6
There are several nonminimax techniques that have been used successfully for single-variable optimization. They involve fitting a polynomial to the profit function and computing the stationary points of the polynomial iteratively until a satisfactory optimum has been obtained. The polynomial is usually quadratic or at most cubic, and procedures can be incorporated to bound the optimum when the interval is open. These procedures will be reviewed, and further descriptions are given by Avriel(3), Beveridge and Schechter(6), and Himmelblau(7). To have a quadratic fit of the profit (or cost) function, the coefficients of the function are evaluated by making three measurements. The quadratic equation can be written as:
The stationary point of the above equation is obtained by setting the first derivative equal to zero to give:
The values of a and b are determined by measurements at three points to determine y(x1), y(x2), and y(x3). Thus equation (5-61) can be written as:
which are three linear algebraic equations in a, b, and c. In matrix form they can be written as: This set of equations can be solved for a and b, and the results are substituted into equations (5-62) to compute x*, the stationary point of the quadratic approximation to the profit function. The result is: The procedure involves selecting three initial points and computing the optimum of the quadratic approximation, x*, by equation (5-65). Typically, the procedure continues using x* as one of the next three points to repeat the application of equation (5-65). The point with the smallest value of y (for maximizing) is discarded. The procedure is repeated until a tolerance on the dependent variable is met, e.g., yn+1 - yn < s, where s is a small number that is specified. However, there is a possibility that the procedure can oscillate around the optimum. In this case logic has to be incorporated into the computer program that recognizes when yn+1 < yn (when maximizing) and takes corrective action. An algorithm which incorporates these procedures, called Powell's method, is described by Himmelblau(7). The technique is generally called the quadratic method(3). A complimentary procedure to be used with the quadratic method for an open interval to bound the optimum is described by Himmelblau (7) also. It is called the DSC method and is a logical procedure that takes an initial step, Dx, from the starting point to evaluate the function. It continues the search using increments 2Dx, 4Dx, 8Dx,..., until the optimum is bounded. These points provide the data for a second-order interpolation to estimate x*. However, Himmelblau(7) states that instead of using a second order interpolation, it is better to shift to the quadratic method once the optimum has been bounded to have a more efficient algorithm. A simple example illustrating the use of these algorithms is given by Himmelblau(7) and Beightler, et al. (4). Also Wilde and Beightler (2) give the equation for a cubic approximation to that profit function. It should be mentioned that the Armijo line search (11) has been used successfully with the multivariable search method, successive quadratic programming (12). This line search does not have minmax properties, but has been shown to converge to the minimum of a continuous function. Also, a sequence for the parameter of the gradient line has been shown to converge to the minimum of a continuous function. Further details are given in the article by Armijo (11) for these theorems and in the article by Beigler and Cuthrell (12) for applications with the successive quadratic programming algorithm for optimization of process simulators. |