DYNAMIC PROGRAMMING
Branched systems can have either converging or diverging branches. A feed-forward loop (by-pass) is a special case of a diverging branch, and a feed-back loop (recycle) is a special case of a converging branch. The discussion will begin with diverging branches which is the simpler of the two cases to describe but not necessarily to optimize.
The functional diagram of a diverging branch is given in Figure 7-15. The branch consists of stages 1' through m', and these stages have the following transition functions, incident identities and return function.
The maximum return for the diverging branch is: To find the return, fm'(sm') requires the solution of an initial value serial problem which is "easily" done. If the final value of the diverging branch is specified, decision inversion is performed, and stage 1' is combined with stage 2'. Then one-stage, one-decision partial optimizations are continued to stage m'. To connect the branch to the main system at stage k, the following transition functions, incident identities, and dynamic programming algorithm are used.
This can be combined with the transition functions to give an algorithm in dk and sk only. This equation shows that there is a one-state, one-decision partial optimization at stage k. It is referred to as absorption of a diverging branch. The partial optimization can then proceed to stage N to complete the solution. A special case of a diverging branch is a feed-forward loop, which is shown schematically in Figure 7-16. The approach is to convert this structure into a diverging branch and solve it as described previously. The loop enters at stage j, and the transition and return functions for this stage are:
At this point a
value of s2j = This is a one-state, one-decision partial optimization at stage j, because s2j is fixed. The value of fj-1 is available from partial optimizations from stage 1 to stage j-1. The partial optimizations can now proceed to stage k along the main section and along the loop. When it arrives at stage k, the dynamic programming algorithm is: where fk(sk) is determined
for the best value of dk and a cut state value of s2j,
i.e., a value for s2j is picked which will be used as the first
point of a single variable search. Then, it is necessary to return to
stage j to select a new value of s2j and repeat the
partial optimizations to obtain a new set of fk(sk).
These are compared with the previous set for best values. This procedure
is continued using a single variable search to locate the best value of
s2j =
The functional diagram for a converging branch system is given in Figure 7-17. The branch consists of stages 1' through m'. One approach to optimize this system is to perform state inversion on the branch, and this will convert the system to one with a diverging branch. Unfortunately, this may be difficult to accomplish. There is another approach that is slightly more complicated to describe, but is easier to implement. First the dynamic programming algorithm at stage k can be written as: It includes the optimal return from the converging branch fm'(sm'). This algorithm treats s2k as a decision variable. A two variable search on dk and s2k is used to maximize the right hand side of the equation. The maximum return
from the branch is obtained by cutting the state between the branch and
stage k where s2k = A special case of a converging branch is a feedback loop which is shown in Figure 7-18, and the loop consists of stages 1' through m'. It turns out that the feedback loop will be converted to a converging branch during the partial optimization along the main branch. Conducting partial optimizations from stage 1 to stage i, the dynamic programming algorithm at stage i is: and the transition functions for this stage are:
At stage i,
the output Proceeding with the partial optimization to stage j, the dynamic programming algorithm at this stage is the same as for the converging branch, equation (7-35). Using the transition function:
then the dynamic programming algorithm equation (7-35), becomes: The values of fm'(sm'),
the optimal return from the feed-back loop, are determined by treating
s2j = sm' as a decision variable. The feed-back
loop is a two-point boundary value problem, since sm' is fixed,
and The following example illustrates some of the methods that have been described. It was developed by Professor D. J. Wilde (6) for his optimization class at Stanford University. Example 7-5 A manufacturing process is arranged as shown in Figure 7-19 along with the operating conditions and costs for each unit. Raw material which comes in two variations, A and B, is successively heated and purified before it is sent to a reactor where it undergoes transformation into impure products. The impure products are cleaned up in a "finisher", and the final products are sold. A new heating-purifying train has just been built, and it is operated in parallel with the old one. The new unit processes the same amount of material as the old unit, which is then mixed and pumped to the reactor. The impurity content is averaged in the mixer. After the reaction and finishing steps, two grades of product can be marketed, "colossal" and "stupendous". We wish to find the maximum profit and the corresponding optimum policy for the plant operations. It is not necessary for the old and new heating-purifying train to use the same raw material. The problem involves a converging branch with a choice of raw materials for input values on the branch and on the main system. The functional diagram for the branch and main system is shown in Figure 7-20. At stage one, the optimal decisions are shown for various values of the state variable using the dynamic programming algorithm. At stage two the dynamic programming algorithm is: A two variable search is required to locate the best values of s22 on the branch and d*2 at stage 2 that maximize the return from stage, R2, the main system, f1, and the branch, f4', for various values of the state variable s12. Decision inversion is required for the branch and these results are shown at stages 3' and 4' for cut state values of 0.1% and 0.5%. The minimum cost to operate the branch is -9 for an impurities content of 0.1% and -8 for 0.5%. The information is now available to complete the partial optimization at stage 2. To illustrate this procedure consider the case of s12 = 0.3% impurities content and a cut state value of s22 = 0.1%, and values of the decision variable, d2, equal to high, medium and low. This procedure is repeated over a range of values of s12 searching on s22 and d2. The results obtained for stage two are shown in Figure 7-20. The partial optimizations are continued to stages three and four. The results are shown in Figure 7-20. The maximum profit of 13 is obtained using B as feed to the old heater and A to the new heater. The optimal policy can be read from the values on Figure 7-20. Also, another optimal solution is shown for the case of both heaters having the same feed. This is A, and the maximum profit is 11 for this case.
The previous methods to apply dynamic programming to systems with loops and branches were developed by Aris, Nemhauser and Wilde(7). In a previous article, Mitten and Nemhauser(5) outlined the steps to use dynamic programming. Although this procedure should be almost obvious at this point, it is worth repeating for reenforcement.
Based on the results of the article with Aris and Nemhauser, Wilde (8) formulated several rules for simplifying a system to make a dynamic programming optimization plan more efficient. These are as follows: Rule 1. Irrelevant Stages
It is not necessary to consider a stage that does not affect the return function. An example could be a waste treatment step at the end of the process where the cost of operation is equal to the sales of product recovered, i.e., break-even. Rule 2. Stage Combination
Because an exhaustive search is required on the state variables, an overall savings of effort is obtained when state variables are eliminated, even at the expense of obtaining more decision variables. Multivariable search techniques can be applied to decision variables. This leads to the following corollary. Corollary 2 Decisionless Stages
Applying this rule reduces the number of state and decision variables. For a final value problem, decision inversion transforms a decision into an output that is completely specified by the input state variable. Rule 4 Small Loops
If this rule is
applied to cyclic optimizations for a system with three decision variables,
one of these is eliminated by cutting the state. Then decision inversion
is performed and stages one and two are combined. Thus, a one-state, one-decision
partial optimization is performed at stage one-two and a no-state, one-decision
partial optimization is performed at stage three. This procedure is repeated,
searching on s3 = Rule 5. Cut State Location
This rule converts loops into diverging branches, which are easier to optimize than are converging branches. Further, a single variable search can be performed on the state variable.
At this point we have dealt with the theory of dynamic programming and simple examples to illustrate the use of the theory. Now, we will describe an application to an industrial process done by Lowry(9). In applying dynamic programming to the contact process for sulfuric acid manufacture, Lowry(9) demonstrated the capability of this procedure to optimize an established industrial process. The details of this work are given by Lowry(9), and they include the detailed process description, material and energy balance equations and the related chemical equilibrium calculations with transport and thermodynamic properties. This study will be summarized with a brief description of the process, and an analysis of the logic required to convert the process flow diagram into the dynamic programming functional diagram. Also a summary of the results obtained by the optimization will be presented.
The contact process produces 98% sulfuric acid as a primary product and process steam as a secondary product as shown in the process flow diagram given in Figure 7-21. Both products are usually consumed in adjacent plants. For this study the sulfur feed rate was set at 10,000 lb/hr, which corresponds to a standard industrial size plant. The sulfur is burned to form sulfur dioxide with air that has been dried with 98% sulfuric acid. The reaction is exothermic and goes to completion. Excess air is supplied to provide sufficient oxygen to react the sulfur dioxide to sulfur trioxide in the two converters. In the oxidation of sulfur dioxide to sulfur trioxide, the reaction rate increases with temperature, and the equilibrium conversion decreases with temperature. Therefore, two converters are used, and the first converter is operated in a higher temperature range to take advantage of the increased rate of reaction. The second converter is operated in a lower temperature range to obtain an increased conversion. The temperature to each converter is controlled by a waste heat boiler which produces process steam. The hot gas from the burner enters waste-heat boiler 1, and steam is produced by cooling the gas which enters converter A. Partial oxidation of sulfur dioxide to sulfur trioxide takes place in the converter which has a vanadium catalyst. Due to the exothermic reaction, the temperature of the gas increases in the converter. Then the gas enters waste-heat boiler 2, and additional steam is produced. From the boiler the gas flows to converter B to have essentially all of the sulfur dioxide converted to sulfur trioxide. From the converter the gas goes to the economizer where energy is recovered by heating water to its saturation temperature for use in the two waste-heat boilers. Also, the gas is cooled to the temperature required for the absorber. In the absorber the sulfur trioxide is converted to sulfuric acid in a packed tower by contacting the gas with 98% sulfuric acid. The other gases in this stream, mainly nitrogen with some oxygen and a trace of sulfur dioxide, are vented to the atmosphere. In the acid cooler make-up water is added to hold the concentration at 98% since there is not enough moisture in the air to supply all of the water required. Heat of reaction in the absorber and heat of dilution in the dryer raise the temperature of the acid, and the acid cooler includes a heat exchanger used to remove the heat from the acid. The acid cooler provides the acid for the dryer and absorber and the 98% acid product for sale.
The dynamic programming analysis begins with each unit in the process being a stage in the functional diagram as shown in Figure 7-22. The rules from the previous section now can be used to develop the final functional diagram. In Figure 7-22 there are nine stages, and the input state variable s9 is the fixed flow rate of sulfur to the burner of 10,000 lb per hour. There is an output at stage 2, s12, which is flow rate of product acid from the acid cooler. The decision variables are d4, the flow rate of water to the economizer; d2, the flow rate of cooling water to the acid cooler; and d1, the atmospheric air flow rate to the dryer. There are two other decision variables, d6 and d8, which are the flow rates of water to the two waste heat boilers. However, their sum must equal d4, the flow rate of water to the economizer. Also, there are two recycle streams. One is from the economizer to the waste heat boilers, and the other is from the dryer to the burner. The following paragraphs will describe one way of simplifying the functional diagram to one that has one state and decision variable per stage. This was obtained after a number of trials. Converting the process flow diagram to the dynamic programming functional diagram is the most difficult step in the optimization procedure. Selecting a variable to be either state or decision is somewhat arbitrary as is the way stages are combined. Many combinations have to be tried, and a final optimization plan will emerge that is probably not unique. The goal is to obtain a computationally efficient set of functional equations to be used for the optimization. First, it was necessary to deal with the recycle stream in the process involving the water flow rates from the economizer to the boilers. The decision variables, d4, d6 and d8, were allowed to be independent initially. An interim cost was assigned to each of these streams to account for their values in the return functions at stage 4, 6 and 8. In the final results the sum of the flow rates to the boilers must be essentially equal to that for the economizer. At the optimum a check was made to ensure that the sum of the inputs to the boilers was essentially equal to the output from the economizer. Additional simplifications were made as shown in Figure 7-23. Decisionless stages were combined with adjacent stages according to the rules previously discussed. The burner was decisionless, and it was combined with waste-heat boiler 1. Converter A was decisionless, and it was combined with waste-heat boiler 1 also. Converter B was decisionless, and it was combined with waste-heat boiler 2. The absorber was decisionless, and it was combined with the acid cooler. There are five stages in the functional diagram with each one having one input and one decision as shown in Figure 7-23. The recycle loop between the burner and the dryer makes the system cyclic. The loop was eliminated by cutting the state between the dryer and the burner, and the system became a serial one as shown in Figure 7-24. By cutting the state, the output from the dryer was fixed, and decision inversion was required. The dryer was combined with the acid cooler-absorber stage. However, a closer analysis revealed that cutting the state on the loop required specifying two state variables, the dry air temperature and flow rate. Consequently, the two decisions d1 and d2, the atmospheric air flow rate and the cooling water flow rate were converted to computed outputs. Thus, the dryer-acid cooler-absorber stage had to be combined with the economizer stage. The functional
diagram shown in Figure 7-24
is the final form. It is a three-stage, initial value, serial problem.
A two variable search was required on the cut state values of the dry
air temperature, At stage two the
input state variable s2(= At stage one, the
input state variable s1(= The partial optimizations at each stage were performed by Lowry(9), and detailed results were obtained for the optimum operating conditions for the process. The strategy just described is the one that resulted after considering many possible ways to formulate the optimization problem. It was found that there was no substitute for a detailed understanding of the process to be able to obtain a successful solution. For example, the first problem encountered was that of the interior loop. The procedure used to allow each stream to be independent gave an effective dynamic programming analysis. The constraint was satisfied as the solution proceeded on the outer loop. Decisionless stages were combined with adjacent stages; and when a choice of adjacent stages existed, the one that had the smallest number of state variables was selected. However, in this study it was necessary to make arbitrary choices at times. These choices affected the complexity of the final functional diagram, and the only way to determine the effect of a particular choice was to complete the analysis. For example, in this process there were several combinations for each decisionless stage. This required a number of plans to be devised and rejected before the final plan emerged.
A detailed discussion of the dynamic programming optimization results was given by Lowry(9). A summary of these results is given in Table 7-2. The maximum return was found to be $230.57 per hour and was obtained for a dry air flow rate of 135,000 lb/hr and temperature of 430K. The optimal operating conditions necessary to achieve this return are shown in the table also. The highest value of the gas temperature, 1000K, was specified from converter A to maximize steam production. The lowest value of the gas temperature was specified from converter B to maximize the conversion of SO2 to SO3. The gas temperatures entering both converters were at constraints. The inlet gas temperature to the first converter was at the ignition temperature of the catalyst, and the exit temperature of the second converter had to be equal to or less than 1000K. Above this temperature the catalyst degrades rapidly. The tabular information
for the partial optimizations at stages one through three is given in
Table 7-3. As indicated the
table, the overall optimum required an output from stage 3 of Optimal operating
policies for selected cut-state values of the dry air flow rate are given
in Table 7-4. Total conversion
|