5 Safety and Risk in Engineering Design

d) Systems Analysis with GAs and Fault Trees

Commonly with mathematical optimisation problems, such as linear programming, dynamic programming and various other optimisation techniques, an explicit objective function is derived that defines how the characteristic to be minimised is related to the variables. In many design optimisation problems, however, an explicit objective function cannot be formulated, and system performance is assessed by fault-tree analysis (FTA). This is often the case in safety systems design. The nature of the design variables also adds difficulty to the problem. Design variables that represent the levels of duplication for fully or partially redundant systems, as well as the period between preventive maintenance activities, are all integer. Furthermore, selecting component types is governed by Boolean variables, i.e. selection or non-selection. A numerical scheme is therefore required that produces integer values for these variables, since it is not appropriate to use a method in which real numbers are rounded to the nearest whole number. Constraints involved in this problem fall into the category of either explicit or implicit constraints. Expected maintenance downtime, for example, can be represented by an explicit function of the design parameters; however, the number of spurious process trips can be assessed only via a full analysis of the system, which again requires the fault-tree analysis methodology. As no explicit objective function exists for most preliminary designs of safety systems, particularly in redundancy allocation problems for design optimisation, fault trees are used to quantify the system unreliability and/or unavailability of each potential design. It is, however, a time-consuming and impractical task to construct a fault tree for each design variation, especially at the lower systems hierarchy levels.
To resolve this difficulty, select events can be used to enable the construction of a single fault tree capable of representing the causes of the system failure mode for each possible system design. Select events in the fault tree, which are either TRUE or FALSE, are used to switch on or off different branches to model the changes in the causes of failure for each design alternative. As an example, consider the choice of a valve type from the possible alternative valves V1, V2 or V3 in a safety system (Pattison et al. 1999). The fault tree is shown in Fig. 5.50. If valve type V1 is selected, the select event H1 corresponding to the selection of this valve is set to TRUE. Select events H2 and H3, corresponding to the selection of V2 and V3, are conversely set to FALSE. A contribution to the top event then arises from the left-most branch only; the two right-most branches are, in effect, switched off. Levels of redundancy are handled similarly. Furthermore, the spurious trip frequency for each design is an implicit constraint that requires the use of fault-tree analysis to assess its value. Select events are again used to construct a fault tree capable of representing each potential design for this failure mode.

Fig. 5.50 Fault-tree structure for safety valve selection (Pattison et al. 1999)

e) Algorithm Description Using Binary Decision Diagrams

A binary decision diagram (BDD) is a type of oriented graph used notably for the description of algorithms. It basically consists of two types of nodes: the decision or test node, and the output node. The decision node is equivalent to an if-then-else instruction that performs a test on a binary variable and, according to this value, indicates the following node. The output node produces a value.
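The switching behaviour of select events can be sketched in a few lines of code. The following is a hypothetical miniature, not the actual tree of Fig. 5.50: a top-event OR gate over three valve branches, each AND-ed with its select event (the names top_event and valve_branch are illustrative):

```python
# Each branch of the top-event OR gate is AND-ed with a select event (H1..H3),
# so a single fault tree can represent every design alternative.

def valve_branch(select_event: bool, valve_fails: bool) -> bool:
    """A branch contributes to the top event only if its valve type is selected."""
    return select_event and valve_fails

def top_event(h, v):
    """OR gate over the three alternative valve branches.
    h: select events H1..H3 by index; v: valve failure events V1..V3 by index."""
    return any(valve_branch(h[i], v[i]) for i in (1, 2, 3))

# Design with valve type V1 selected: H1 is TRUE, H2 and H3 are FALSE.
h = {1: True, 2: False, 3: False}
print(top_event(h, {1: True, 2: False, 3: False}))   # selected valve fails -> True
print(top_event(h, {1: False, 2: True, 3: True}))    # only de-selected valves fail -> False
```

Setting a select event to FALSE prunes its entire branch, which is how one fault tree serves every design alternative.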
There are two rules of BDD assembly, namely that there is one and only one initial node (the entry point of the algorithm), and that the output point of a node can be connected to only one entry point of another node. More precisely, a BDD is a rooted, directed, acyclic graph with an unconstrained number of in-edges and two out-edges, one for each of the 1 and 0 decision paths of any given variable. As a result, the BDD has only two terminal nodes, representing the final value of the expression, 1 or 0, although occasionally the zero (false) node and the edges leading to it are omitted (Akers 1978; Bryant 1986). To improve the efficiency of analysis, the binary decision diagram (BDD) method is used to solve the resulting fault tree. The BDD is composed of terminal and non-terminal vertices that are connected by branches in the diagram. Terminal vertices have the value of either 0 or 1, whereas the non-terminal vertices correspond to the basic events of the fault tree. Each vertex has a 0 branch that represents non-occurrence of the basic event (i.e. it works), and a 1 branch that represents occurrence of the basic event (i.e. it fails). Thus, all paths through the BDD terminate in one of two states: either a 1 state, which corresponds to system failure, or a 0 state, which corresponds to system success. The BDD represents the same logical function as the fault tree from which it is developed; however, the BDD produces more accurate results. As an example, consider the BDD illustrated in Fig. 5.51. The fault-tree structures for each system failure mode are converted to their equivalent BDDs. Analysis of a BDD has proven to be more efficient than quantification of the fault-tree structure because evaluation of the minimal cut sets for quantification is not required.

Fig. 5.51 Binary decision diagram (BDD) for safety valve selection

For the purpose of BDD construction, select events in the fault tree are treated as basic events.
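These definitions can be made concrete with a minimal sketch. The node type, traversal and Shannon-expansion quantification below are standard BDD techniques rather than the source's implementation, and the two-event diagram is deliberately simpler than Fig. 5.51:

```python
# Minimal BDD: each non-terminal vertex tests one basic event and routes to
# its 0 branch (event does not occur) or 1 branch (event occurs); the
# terminals 0 and 1 encode system success and system failure respectively.

class BDDNode:
    def __init__(self, var, low, high):
        self.var = var      # basic event tested at this vertex
        self.low = low      # 0 branch: component works
        self.high = high    # 1 branch: component fails

def evaluate(node, events):
    """Follow one path through the BDD for a given assignment of basic events."""
    while isinstance(node, BDDNode):
        node = node.high if events[node.var] else node.low
    return node  # terminal: 0 (system works) or 1 (system fails)

def probability(node, q):
    """Top-event probability by Shannon expansion at each vertex:
    P = q * P(1 branch) + (1 - q) * P(0 branch)."""
    if not isinstance(node, BDDNode):
        return float(node)
    return (q[node.var] * probability(node.high, q)
            + (1 - q[node.var]) * probability(node.low, q))

# Tiny illustrative diagram for "A AND B" (not the diagram of Fig. 5.51):
b = BDDNode("B", 0, 1)
top = BDDNode("A", 0, b)
print(evaluate(top, {"A": True, "B": True}))    # both events occur -> 1
print(evaluate(top, {"A": True, "B": False}))   # B does not occur  -> 0
print(probability(top, {"A": 0.1, "B": 0.2}))   # P(A and B), approximately 0.02
```

Because select events are treated as basic events, a design choice is modelled by fixing that event's probability at 1 or 0, as the following paragraphs describe.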
Using this process, the fault tree for the component design variables shown in Fig. 5.50 is represented by the BDD in Fig. 5.51. The quantity q appearing on the 1 and 0 branches developed from each node in Fig. 5.51 represents the probability of each path. The select events are turned on or off by setting their probability to 1 or 0 respectively. Consider, for example, the design where valve 1 has been selected for the fault tree shown in Fig. 5.50. This is represented by S1 = 1, S2 = 0, S3 = 0 for the select events and, hence, the corresponding probabilities qS1 = 1, qS2 = 0 and qS3 = 0 are set on the equivalent BDD. The only path to a terminal 1 node leaves V1 and S1 on their 1 branches, and therefore has probability qV1. The probability values assigned to each select event, which are determined by a particular design, are automatically assigned to the BDD. Thus, the major advantage of the BDD is its practicality.

f) Example of Genetic Algorithm Application

As an example, the BDD methodology is applied to a high-pressure protection system. The example is taken from Sect. 5.2.4.2 dealing with the structuring of the cause-consequence diagram, in which the CCD diagramming technique was applied to the simple high-pressure protection system depicted in Fig. 5.34. The features of this high-integrity protection system (HIPS) are shown in Fig. 5.52.

Fig. 5.52 High-integrity protection system (HIPS): example of BDD application

The function of the protection system is to prevent a high-pressure surge originating from the process circulation pumps, in order to protect equipment located downstream of the process. Returning to the previous example, the basic functions of the components of the system are shown in Table 5.1. The first level of protection is the emergency shutdown (ESD) sub-system with its specific pressure control valves (PCVs). Pressure in the pipeline is monitored using pressure transmitters (P1, P2 and P3). When the pipeline pressure exceeds the permitted value, the ESD system acts to close the main PCV (V1) and sub-PCV (V2), together with the ESD valve (V3). To provide an additional level of protection for high integrity, a second level of redundancy is incorporated by inclusion of a high-integrity protection (HIP) sub-system. This works in a similar manner to the ESD system but is completely independent in operation, with its specific pressure control valves HIPS V1 (V4) and HIPS V2 (V5). Even with a relatively simple system such as this, there are a vast number of options for the engineering designer to consider. In this example, it is required to determine values for the design variables given in Table 5.24.

Table 5.24 Required design criteria and variables

Design criteria                                                    Design variable
How many ESD valves are required? (0, 1, 2)                        E
How many HIPS valves are required? (0, 1, 2)                       H
How many pressure transmitters for each sub-system? (1, 2, 3, 4)   N1, N2
How many transmitters are required to trip?                        K1, K2
Which ESD/HIPS valves should be selected?                          V
Which pressure transmitters should be selected?                    P
What should the maintenance interval be for each sub-system?       θ1, θ2

Several constraints have been placed on the design criteria, as follows:

• The total system cost must be minimised.
• Spurious system shutdowns occurring more than once per year are unacceptable.
• The average downtime per year owing to preventive maintenance must be minimised.

Genetic Algorithm Implementation

As previously indicated, genetic algorithms (GAs) belong to a class of robust optimisation techniques that use principles mimicking those of natural selection in genetics.
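As a sketch, the decision space of Table 5.24 can be held in a small data structure with a feasibility screen. The variable ranges come from the table; the rule that a K-out-of-N trip requirement cannot exceed the number of fitted transmitters is an assumption added here for illustration:

```python
# Hypothetical representation of the Table 5.24 design variables (E, H, N1,
# N2, K1, K2) with a feasibility check over their stated ranges.

DESIGN_RANGES = {
    "E": range(0, 3),    # number of ESD valves (0, 1, 2)
    "H": range(0, 3),    # number of HIPS valves (0, 1, 2)
    "N1": range(1, 5),   # transmitters in the ESD sub-system (1..4)
    "N2": range(1, 5),   # transmitters in the HIPS sub-system (1..4)
}

def feasible(design):
    """Check the stated ranges, plus the assumed K <= N trip-logic rule
    (cannot require more transmitters to trip than are fitted)."""
    in_range = all(design[k] in r for k, r in DESIGN_RANGES.items())
    return (in_range
            and 1 <= design["K1"] <= design["N1"]
            and 1 <= design["K2"] <= design["N2"])

d = {"E": 1, "H": 2, "N1": 3, "N2": 4, "K1": 2, "K2": 2}
print(feasible(d))               # a valid design -> True
print(feasible({**d, "K1": 5}))  # K1 exceeds N1   -> False
```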
Each individual design at assembly level, and at component level where such components have been identified in the preliminary design phase, is coded as a string of parameter values, where each string is analogous to a chromosome in nature. The GA method is then applied with a population of strings, each string being assigned a measure of its fitness. Selection (or reproduction, as it is termed in genetics) then exploits this fitness measure: the greater the fitness value, the higher is the string's chance of being selected for the next generation.

The entire process is influenced by the action of the genetic operators, typically crossover and mutation. Crossover involves crossing information between two solution strings that have already been selected to enter the next generation. Mutation is the alteration of a parameter value on the solution string. Both operators enable exploration of different system designs. To specify a safety system design, a value is assigned to each of the ten design variables given in Table 5.24. These values are then expressed in binary form, as a string of binary digits. Each variable is given a particular length, in order to accommodate the largest possible value in binary form. In total, each string representing the design variables can be interpreted as a set of concatenated integers coded in binary form. However, the restricted range of values assigned to each parameter does not in each case correspond to the representative binary range on the solution string. For this reason, a procedure is used to code and, in subsequent generations, to check the feasibility of each string.

Evaluating String Fitness

Constraints are incorporated into the optimisation by penalising the fitness when they are violated by the design. The fitness of each string consists of four parts (Pattison et al. 1999):

1. Probability of system failure (unreliability).
2. Penalty for exceeding the total cost constraint.
3. Penalty for exceeding the total maintenance downtime constraint.
4. Penalty for exceeding the spurious trip constraint.

The result is a fitness value for each design, which can be referred to as the penalised system unreliability of the design. Calculating this system unreliability involves derivation of the penalty formulae for excess cost, spurious trip occurrences, and maintenance downtime. If a particular design exceeds any of the stated limits, the respective penalty is added to the system unreliability of the design. The formula used for the penalty function is described later. The penalised probability of system unreliability is thus calculated using the following expression:

Q's = Qs + CP + TP + DP   (5.101)

where:
Q's = penalised probability of system unreliability
Qs = un-penalised probability of system unreliability
CP = penalty due to excess cost
TP = penalty due to excess spurious trips
DP = penalty due to excess maintenance downtime (DT).

Derivation of the Penalty Formula

If the performance of a design is significantly improved owing to comparatively small excesses in one or more of the constraints, the specific design deserves further consideration. Conversely, excessive abuse of the limits with only a small degree of performance improvement implies that the design should be discarded. It is essential that an appropriate penalty be applied to the system unreliability when constraints are violated. For example, a spurious trip can affect the reliability of the system and result in a loss of production. For this reason, a spurious trip is expressed in terms of unreliability and cost. This is achieved using a multiplying factor that, rather than being fixed, varies according to the system unreliability of the design, as indicated in Eq. (5.102) below.
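Eq. (5.101) translates directly into code. The limit and penalty values below are illustrative placeholders, not figures from the source:

```python
# Q's = Qs + CP + TP + DP, adding each penalty only when the design exceeds
# the corresponding limit. All limit/penalty magnitudes here are assumptions.

def penalised_unreliability(q_s, cost, trips, downtime,
                            cost_limit, trip_limit, downtime_limit,
                            cost_penalty, trip_penalty, downtime_penalty):
    """Penalised probability of system unreliability per Eq. (5.101)."""
    c_p = cost_penalty if cost > cost_limit else 0.0         # CP: excess cost
    t_p = trip_penalty if trips > trip_limit else 0.0        # TP: excess spurious trips
    d_p = downtime_penalty if downtime > downtime_limit else 0.0  # DP: excess downtime
    return q_s + c_p + t_p + d_p

q = penalised_unreliability(q_s=0.02, cost=1100, trips=2, downtime=10,
                            cost_limit=1000, trip_limit=1, downtime_limit=20,
                            cost_penalty=0.0025, trip_penalty=0.002,
                            downtime_penalty=0.001)
print(round(q, 4))  # cost and trip limits exceeded: 0.02 + 0.0025 + 0.002 = 0.0245
```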
A penalty function based on sub-system unreliability and cost is defined to incrementally increase the penalty. This requires careful consideration of the unreliability and cost minimisation of the design being penalised, where the objective and penalty functions are defined as follows:

f_System = Σ (i = 1..s) [1 − R_i(x_i)] × C_i(x_i)   (5.102)

where:
s = total number of sub-systems
x_i = decision variable relating to system reliability
f_System = fitness function for system unreliability and cost
R_i(x_i) = objective function for total reliability of sub-system i
C_i(x_i) = objective function for total cost of sub-system i.

In this expression of the fitness function, the relationship between unreliability and excess cost is assumed to be linear. However, although small excesses in cost may be tolerated, as the extra cost becomes much larger its feasibility should significantly decrease. For this reason, an exponential relationship is preferred for the objective function for the total cost of sub-system i, as given in Eq. (5.103). To illustrate this, consider a particular design in which a base level of system performance is assumed, and an unreliability value of 0.02 (i.e. 0.98 or 98% reliability) for the system is considered reasonable. Should the cost of a design exceed a certain base level (say, 1,000 units), the excess cost percentage should be reflected in the system unreliability as a corresponding percentage improvement about that base level. If the relationship between unreliability and excess cost is assumed to be linear, a design that costs 1,100 units should show an improvement of at least 0.002 in unreliability (i.e. 10%). However, the multiplying factor of 0.002, or 10% of the base level performance, is the area of concern if the value is a fixed percentage of system unreliability. With such a fixed multiplying factor, the penalty formula does not properly take into account the system unreliability of the comparative designs being penalised.
To further illustrate this, consider the following example: design A costs 1,100 units and has an un-penalised system unreliability of 0.02 (reliability of 0.98 or 98%). The objective function for total system cost is given as the exponential relationship of the ratio of total system cost to a base cost of 1,000 units, raised to the power 5/4 (Pattison et al. 1999). This is expressed as:

Σ (i = 1..s) C_i(x_i) = [C_s / C_b]^(5/4)   (5.103)

Applying the penalty function formula of Eq. (5.102) then gives the following:

f_System = 0.02 × [1,100 / 1,000]^(5/4) = 0.0225

The cost-penalised fitness value is 0.0225, a fitness decrement of approximately 25% compared to the un-penalised unreliability of 0.02. Design B costs 1,150 units but has an un-penalised system unreliability of 0.015 (i.e. reliability of 0.985 or 98.5%). Applying the penalty formula gives a cost-penalised fitness value of 0.018, a fitness decrement of approximately 20% compared to the un-penalised unreliability of 0.015. The comparative cost penalty for the fitter string (design A) is thus greater by 5% (25–20%). The difference in un-penalised system reliability between design A and design B is only 0.5%. Thus, the penalty should take the fitness value of the system to be penalised into consideration. Consider, therefore, a particular design with cost C. The percentage excess of the system's cost is calculated as Xc. The multiplying factor is then derived by calculating Xc percent of the system unreliability of the engineering design under consideration.

Reproduction probabilities

The fitness value, or penalised system unreliability, is evaluated for each string. For the purpose of selection in the GA, each string is assigned a reproduction probability that is directly related to its fitness value.
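The worked example above can be checked numerically; penalised_fitness is an illustrative name for the single-system form of Eqs. (5.102) and (5.103):

```python
# Exponential cost objective of Eq. (5.103) and the resulting penalised
# fitness of Eq. (5.102), applied to a single system.

def cost_factor(system_cost, base_cost=1000.0):
    """(Cs / Cb) ** (5/4): the exponential cost relationship of Eq. (5.103)."""
    return (system_cost / base_cost) ** 1.25

def penalised_fitness(unreliability, system_cost, base_cost=1000.0):
    """Qs multiplied by the cost factor, per Eq. (5.102)."""
    return unreliability * cost_factor(system_cost, base_cost)

# Design A: 1,100 units, un-penalised unreliability 0.02.
print(round(penalised_fitness(0.02, 1100), 4))   # 0.0225, as in the text
# Design B: 1,150 units, un-penalised unreliability 0.015.
print(round(penalised_fitness(0.015, 1150), 4))  # 0.0179, i.e. the text's 0.018
```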
In the safety system optimisation problem, the smaller the fitness value, the fitter is the string and, hence, the greater should be its chance of reproduction. For cases such as these, a possible approach is to let the reproduction probability be one minus the fitness value. However, the penalised system unreliability fitness values may result in all reproduction probabilities of the strings having similar values, thereby detracting from the specific fitness information available to the GA. A more objective method is required that retains the accuracy of each string's fitness value during conversion to its corresponding reproduction probability.

Converting the fitness value

Each design receives a measure of its fitness. This is the design string's penalised system unreliability. However, this value is not in an appropriate form to be used directly in the selection process of the GA, since the smaller the fitness value, the better is the design. A specialised conversion method is required that gives rise to weighted percentages in accordance with the fitness value of each string. A system with a performance twice as good as that of another should have twice the percentage allocation. One conversion method is to allocate each string to one of three categories according to its fitness value. Strings in category 1 are automatically given 0%, as this category consists of poor system designs, and these are eliminated from the succeeding generation. Strings in category 2 contain relatively unfit designs, and are allocated a relative portion up to a total of 5%. The strings that fall into category 3 are of ultimate interest. The remaining 95% is then allocated to each string, depending on how much their fitness value exceeds a base limit of 0.1. The percentage allocated to each category is fixed and, therefore, independent of the number of strings that it contains.
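A sketch of this three-category conversion follows. Only the 0%/5%/95% split and the 0.1 base limit come from the text; the category boundary of 0.5, the equal-share rule for category 2, and the margin-below-the-limit weighting for category 3 are assumptions made for illustration:

```python
# Hypothetical three-category fitness-to-reproduction-probability conversion.
# Smaller penalised unreliability = fitter string.

def reproduction_probabilities(fitness):
    """fitness: list of penalised system unreliabilities.
    Returns per-string selection probabilities (0 for eliminated strings)."""
    n = len(fitness)
    cat2 = [i for i, f in enumerate(fitness) if 0.1 <= f < 0.5]  # relatively unfit: 5 % pool
    cat3 = [i for i, f in enumerate(fitness) if f < 0.1]         # fittest strings: 95 % pool
    probs = [0.0] * n                                            # category 1 (>= 0.5): 0 %
    if cat2:
        for i in cat2:
            probs[i] = 0.05 / len(cat2)          # equal shares of the 5 % pool (assumed)
    if cat3:
        margins = {i: 0.1 - fitness[i] for i in cat3}   # distance below the base limit
        total = sum(margins.values())
        for i in cat3:
            probs[i] = 0.95 * margins[i] / total  # weighted shares of the 95 % pool
    return probs

p = reproduction_probabilities([0.6, 0.2, 0.05, 0.03])
print([round(x, 3) for x in p])  # poor string gets 0; fittest strings share 95 %
```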
Problems occur, however, when a very high or a very low proportion of strings falls into a particular category, and an improved method is required that is able to cope with very diverse populations and simultaneously to show sensitivity to a highly fit set of strings. This is done by proportioning the percentage allocation for a category by a weighted distribution of the fitness value of each string in the category and the number of strings it contains.

GA parameters

The GA requires the following selection parameters to be set:

• population size,
• crossover rate,
• mutation rate and
• number of generations.

The values entered for these parameters have a marked effect on the action of the GA and on the penalised system unreliability of the best overall string for each parameter set. To obtain an indication of the effect of setting each parameter to a particular value, the penalised system unreliability obtained is summed and averaged against results obtained for the mutation rate, crossover rate and population size for the example GA.

g) Results of the GA Methodology

The simple example GA is a very effective design tool in its application to the high-pressure protection system shown in Fig. 5.47. The modified cost penalty and the modified conversion method established the preferred GA methodology. This modified GA demonstrates the ability to find and explore the fittest areas of the search space, and it is able to differentiate between highly fit strings as the algorithm progresses, whereby retention of the best design over later generations is achieved. Using the modified GA, the characteristics of the best design obtained for the design variables given in Table 5.24 are represented in Table 5.25.

Table 5.25 GA design criteria and variables results

Design criteria                                         Design variable   ESD   HIPS
How many ESD valves are required? (0, 1, 2)             E                 0     –
How many HIPS valves are required? (0, 1, 2)            H                 –     2
How many transmitters per sub-system? (0, 1, 2, 3, 4)   N1, N2            4     4
How many transmitters are required to trip?             K1, K2            1     2

5.3.3 Analytic Development of Safety and Risk Evaluation in Detail Design

The engineering design process presents two fundamental problems: first, most engineering systems have complex, non-linear integrative functions; second, the design process is fraught with uncertainty, typically when based on iterative evolutionary computational design. This trial-and-error feedback loop in detail design evaluation needs to be tightened by improving design analysis before the onset of system manufacturing or construction (Suri et al. 1989).

Artificial neural networks (ANNs) offer feasible solutions to many design problems because of their capability to simultaneously relate multiple quantitative and qualitative variables, as well as their ability to form models based solely on minimal data, rather than on assumptions of linearity or other static analytic relations. Basically, an artificial neural network is a behaviour model built through a process of learning from forecast example data of the system's behaviour. The ANN is progressively modified using the example data, to become a usable model that can predict the system's behaviour, expressed as relationships between the model's variables. During the learning process, the network evaluates relationships between the descriptive or explicative data (i.e. network inputs) and the outcome or explained data (i.e. network outputs). The result of the learning process is a statistical model that is able to provide estimates of the likely outcome. The predictive power of the ANN is assessed on test data excluded from the learning process. Because ANNs need training data, experimental results or similar systems data must be available. These, however, are usually limited, as large amounts of data cannot easily be generated in the detail design phase of the engineering design process.
To obtain the best possible ANN models, and to validate results, strategies that maximise learning with sparse data must be adopted. One such method is the 'leave-k-out' procedure for training (Lawrence 1991). A small number, k, of vectors out of the training vectors are held back each time for testing, and networks are trained, changing the k holdback vectors each time. Since the size of each network is usually modest for product design applications, and the number of training vectors small, training progresses rapidly, and creating multiple networks is not a burden. Another method for sparse training data is to insert 'noise' into the training set, creating multiple variant versions of each training vector.

5.3.3.1 Artificial Neural Network Modelling

Predictive artificial neural network (ANN) modelling can relate multiple quantitative and qualitative design parameters to system performance. These models enable engineering designers to iteratively and interactively test parameter changes and evaluate corresponding changes in system performance before a prototype is actually built and tested. This 'what-if' modelling ability both expedites and economises on the design process, and eventually results in improved design integrity. ANN models can also supplement controlled experiments during systems testing to help ascertain optimum design specifications and tolerances. Artificial neural networks have been successfully applied to develop predictive networks for system performance sensitivity studies of the effects of alterations in design parameters. After translating as many parameters as possible into continuously valued numeric measures, so that alternative designs can be better compared, a 'leave-k-out' training procedure is used to develop predictive networks for performance on each of the quality tests, based on the design parameter specifications.
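The 'leave-k-out' rotation can be sketched as follows. The stand-in "training" is a trivial nearest-neighbour predictor, used only to show the mechanics of holding back k vectors per fold; a real application would train an ANN on each fold:

```python
# Rotate k held-back vectors through the training set (Lawrence 1991),
# training one model per fold and testing it on the held-back vectors.

def leave_k_out(vectors, targets, k, train, predict):
    """Return one test error per training vector, collected across all folds."""
    errors = []
    n = len(vectors)
    for start in range(0, n, k):
        hold = list(range(start, min(start + k, n)))   # the k vectors held back
        keep = [i for i in range(n) if i not in hold]
        model = train([vectors[i] for i in keep], [targets[i] for i in keep])
        for i in hold:
            errors.append(abs(predict(model, vectors[i]) - targets[i]))
    return errors

# Stand-in "training": 1-nearest-neighbour on a single feature.
train = lambda xs, ys: list(zip(xs, ys))
predict = lambda model, x: min(model, key=lambda p: abs(p[0] - x))[1]

errs = leave_k_out([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0],
                   k=2, train=train, predict=predict)
print(len(errs))  # one test error per training vector -> 4
```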
A sensitivity model for each neural network is built by changing each design parameter in small increments across its range. Design engineers can thus use the models interactively to test the effects of design changes on system performance. In this way, designs can be optimised for performance, given manufacturing and cost constraints, before prototype models are built (Ben Brahim et al. 1992).

A further use of ANN models in engineering design is for the models to act as an expert system, where rules are learned directly through system instances, rather than defined through knowledge engineering. Artificial neural networks have also been successfully applied in engineering design by training a multi-layered network to act as an expert system in designing mechanical system components. The method uses documented design policies, heuristics and design computation to construct a rule base (or decision table). The network is then trained on representative examples adhering to this rule base. This approach, which uses neural networks in lieu of expert systems, is advantageous in that rules are learned directly through design examples, rather than through tedious and often problematic knowledge acquisition (Zarefar et al. 1992). A disadvantage of using neural networks in lieu of expert systems, though, is that explanation and tracing through the reasoning process are impossible; the neural networks then act essentially as a black box. The application of expert systems is considered later in greater detail. The disadvantage, however, of using expert systems on their own is the time required for analysis and formatting, which is increased rather than decreased. Expert systems are slow to develop and relatively expensive to update, as well as having fundamental epistemological problems, such as the appropriateness of representing knowledge in the form of decision rules or decision trees.
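A one-at-a-time sensitivity sweep of this kind can be sketched as below; the quadratic model and the parameter names wall_thickness and clearance are stand-ins for a trained ANN and real design parameters:

```python
# Step each design parameter in small increments across its range, holding
# the other parameters at their baseline values, and record the model's
# predicted performance at each step.

def sensitivity(model, baseline, ranges, steps=5):
    """Return {parameter: [(value, prediction), ...]} for a one-at-a-time sweep."""
    out = {}
    for name, (lo, hi) in ranges.items():
        sweep = []
        for j in range(steps + 1):
            x = dict(baseline)
            x[name] = lo + (hi - lo) * j / steps   # increment across the range
            sweep.append((x[name], model(x)))
        out[name] = sweep
    return out

# Stand-in performance model (a real application would query a trained ANN).
model = lambda x: 1.0 - 0.1 * x["wall_thickness"] + 0.05 * x["clearance"] ** 2

result = sensitivity(model,
                     baseline={"wall_thickness": 2.0, "clearance": 1.0},
                     ranges={"wall_thickness": (1.0, 3.0)})
print(len(result["wall_thickness"]))  # 6 sampled points across the range
```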
The need to manually update expert systems each time expertise changes is cumbersome, whereas with artificial neural networks, all that is required is to launch a new learning process. The immense advantage of ANN models in lieu of expert systems is that analysis proceeds directly from factual data to the model, without any manipulation of the example data. Artificial neural networks can also be mathematically termed universal approximators (according to Kolmogorov's theorem; Kolmogorov 1957), in that they have the ability to represent any function that is either linear or non-linear, simple or complicated. Furthermore, they have the ability to learn from representative examples, by error back-propagation. However, artificial neural networks supply answers but not explanations. The ANN model embodies intuitive associations or correlations, not causal relations or explanations. ANN models are predictive
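The error-driven learning the text refers to can be illustrated at its smallest scale: a single linear neuron fitted by gradient descent on example data. A real ANN stacks many such units with non-linear activations, but the backward propagation of the prediction error into the weights is the same mechanism:

```python
# One linear neuron, y = w*x + b, fitted by propagating the prediction
# error back into the weights (stochastic gradient descent on squared error).

def train_neuron(samples, lr=0.05, epochs=200):
    """Learn w and b from (x, y) examples by error-driven weight updates."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in samples:
            err = (w * x + b) - y   # forward pass, then prediction error
            w -= lr * err * x       # gradient of squared error w.r.t. w
            b -= lr * err           # gradient of squared error w.r.t. b
    return w, b

samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]   # examples generated by y = 2x + 1
w, b = train_neuron(samples)
print(round(w, 2), round(b, 2))  # converges near w = 2, b = 1
```

As the surrounding text notes, the fitted weights embody a correlation learned from the examples; nothing in them explains why the relationship holds.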